How much overhead does an Ajax request have - javascript

I'm working on a web app that needs to process large amounts of data from the server.
The data can be "streamed" and processed in chunks, so to make it faster I break the data up into packets and download each packet with ajax.
I use JavaScript promises to send the next ajax request as soon as the previous one receives its data.
Each packet is about 300KB and there are normally 20 of them in total.
Now my question is: when I don't break the data into packets (i.e. I download a single 6MB file), it takes my browser/network about 4 seconds to do it.
However when I break it up into packets it takes the browser about 8 seconds to download all the packets even though the file size is ultimately the same.
I expected some overhead from each request sending new HTTP headers etc., but being twice as slow was quite a shock.
I tried moving the ajax requests onto a web worker thinking the main thread was possibly delaying them, but the same thing happens.
Is there any way to speed up this process, or are there any JavaScript protocols that would keep the connection open?
I know the browser can do this with video streaming, but I wouldn't know how to use that protocol with binary packets.

I believe by default TCP connections in the browser are also subject to TCP congestion controls.
"Slow start", for example, begins each connection with a small congestion window and ramps up the sending rate as data is acknowledged, so as not to overwhelm the network or the server and to get a baseline of how much traffic the path can handle.
If you break up your 6MB request into many requests, it's possible you're paying the "slow start" penalty on each request.
You could try turning on keep-alive headers on the server and see if this improves things.
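To get a feel for how large that penalty could be, here is a rough, idealized model of slow start. It is only a sketch: the 1460-byte segment size, the initial window of 10 segments, and the absence of packet loss or any bandwidth cap are my assumptions, not measurements from the question.

```javascript
// Idealized slow-start model: the congestion window starts at `initialWindow`
// segments and doubles every round trip until the whole transfer is sent.
// Assumptions (not from the original post): 1460-byte segments, an initial
// window of 10 segments, no loss, no bandwidth or receive-window limit.
function slowStartRounds(bytes, segmentSize = 1460, initialWindow = 10) {
  const segments = Math.ceil(bytes / segmentSize);
  let sent = 0;
  let cwnd = initialWindow;
  let rounds = 0;
  while (sent < segments) {
    sent += cwnd; // one round trip delivers a full window
    cwnd *= 2;    // the window doubles while in slow start
    rounds += 1;
  }
  return rounds;
}

// One 6MB download versus twenty 300KB packets that each start "cold".
const singleFileRounds = slowStartRounds(6000000);
const perPacketRounds = slowStartRounds(300000);
const coldPacketRounds = 20 * perPacketRounds;

console.log({ singleFileRounds, perPacketRounds, coldPacketRounds });
```

In this model the cold packets need far more round trips than the single file, which is why a reused (kept-alive) connection can matter so much. Real connections hit a bandwidth ceiling long before the window grows this large, so treat the numbers as an upper bound on the effect rather than a prediction.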


Why first network call takes more time than subsequent ones?

I am trying to understand why the first network call takes more than double the time of subsequent ones. I know that DNS resolution should not take more than 5-50ms and that it happens only on the initial call. Given this, there shouldn't be much difference between the time taken for the first call and subsequent calls.
I have tested this behavior with some famous URLs in separate incognito windows for each with cache disabled and attached a few screenshots to support my observation below. Can anyone help me understand this behavior?
Note: The readings were taken on a full-speed internet connection.
Thanks in advance
After a few experiments, I found that the Content Download part of the request (one of the browser's request steps) speeds up 1.5-2 times.
This looks like an effect of the TCP Slow Start algorithm.
As it states:
modern browsers either open multiple connections simultaneously or reuse one connection for all files requested from a particular web server
That might be the reason for the first request to be slower than others
Also, @Vishal Vijay made a good addition:
Making initial connection handshake to the server is taking time (DNS Lookup + Initial connection + SSL). Browsers are creating Persistent Connections for HTTP requests and keep it open for some time. If any request came in for the same domain within that time, the browser will try to reuse the same connection for faster response.
In some cases a server-side cache may cause subsequent requests to be processed faster, but let's just talk about the browser side.
When you hover on the waterfall 'blocks' you will get the time details:
Here is a quick reference for each of the phases (from Google Developers):
Queueing. The browser queues requests when:
There are higher priority requests.
There are already six TCP connections open for this origin, which is the limit. Applies to HTTP/1.0 and HTTP/1.1 only.
The browser is briefly allocating space in the disk cache
Stalled. The request could be stalled for any of the reasons described in Queueing.
DNS Lookup. The browser is resolving the request's IP address.
Proxy negotiation. The browser is negotiating the request with a proxy server.
Request sent. The request is being sent.
ServiceWorker Preparation. The browser is starting up the service worker.
Request to ServiceWorker. The request is being sent to the service worker.
Waiting (TTFB). The browser is waiting for the first byte of a response. TTFB stands for Time To First Byte. This timing includes 1 round trip of latency and the time the server took to prepare the response.
Content Download. The browser is receiving the response.
Receiving Push. The browser is receiving data for this response via HTTP/2 Server Push.
Reading Push. The browser is reading the local data previously received.
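If you would rather read these phases from script than hover over the waterfall, the Resource Timing API exposes similar timestamps. The sketch below takes an object shaped like a PerformanceResourceTiming entry and derives a few phases; the function name and the two sample entries are made up for illustration, and the mapping to the DevTools labels is only approximate.

```javascript
// Derive rough phase durations (in ms) from a PerformanceResourceTiming-like
// entry. A phase is reported as 0 when its timestamps are missing or equal,
// which is what you typically see when a connection is reused.
function breakDownTiming(entry) {
  const span = (start, end) => (start > 0 && end >= start ? end - start : 0);
  return {
    dns: span(entry.domainLookupStart, entry.domainLookupEnd),
    connect: span(entry.connectStart, entry.connectEnd), // includes TLS time
    tls: span(entry.secureConnectionStart, entry.connectEnd),
    waiting: span(entry.requestStart, entry.responseStart), // roughly TTFB
    download: span(entry.responseStart, entry.responseEnd),
  };
}

// Hypothetical entries: a first request that pays for DNS, TCP and TLS,
// and a later request that reuses the same connection.
const firstRequest = {
  domainLookupStart: 10, domainLookupEnd: 40,
  connectStart: 40, secureConnectionStart: 70, connectEnd: 130,
  requestStart: 131, responseStart: 231, responseEnd: 431,
};
const laterRequest = {
  domainLookupStart: 500, domainLookupEnd: 500,
  connectStart: 500, secureConnectionStart: 0, connectEnd: 500,
  requestStart: 501, responseStart: 581, responseEnd: 681,
};

console.log(breakDownTiming(firstRequest));
console.log(breakDownTiming(laterRequest));
```

In a browser you could feed it real data with performance.getEntriesByType("resource"). Note that cross-origin resources report zeros for most of these fields unless the server sends a Timing-Allow-Origin header.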
So what's the difference between the first and subsequent requests in a traditional HTTP/1.1 scenario?
DNS Lookup: It might take more time to resolve DNS for the first request. Subsequent requests will resolve a lot faster using browser DNS cache.
Waiting (TTFB): The first request has to establish a TCP connection to the server. Thanks to the HTTP keep-alive mechanism, subsequent requests to the same server reuse the existing TCP connection and skip the three-way handshake, saving a round trip compared to the first request (more on HTTPS, where the TLS handshake is skipped as well).
Content Download: Due to TCP slow start, the first request needs more time to download content. Since subsequent requests reuse the TCP connection, whose congestion window has already grown, their content downloads much faster than the first request's.
Thus generally subsequent requests should be much faster than the first request. Actually this leads to a common network optimization strategy: Use as few domains as possible for your website.
HTTP/2 even introduces multiplexing to better reuse a single TCP connection. That's why HTTP/2 will give a performance boost in modern front end world, where we deploy tons of small assets on the CDN servers.
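As a quick way to check the "few domains" advice against a real page, you can count how many distinct origins its assets come from, since each origin needs its own DNS lookup and connection setup. A minimal sketch; the function name and the example URLs are hypothetical.

```javascript
// Count requests per origin. Every distinct origin implies at least one
// extra DNS lookup, TCP handshake and (on HTTPS) TLS handshake.
function countByOrigin(urls) {
  const counts = new Map();
  for (const url of urls) {
    const origin = new URL(url).origin;
    counts.set(origin, (counts.get(origin) || 0) + 1);
  }
  return counts;
}

// Hypothetical asset list for illustration only.
const assetUrls = [
  "https://app.example.com/index.html",
  "https://app.example.com/main.js",
  "https://cdn.example.com/styles.css",
  "https://cdn.example.com/logo.png",
  "https://cdn.example.com/font.woff2",
];

const originCounts = countByOrigin(assetUrls);
console.log(originCounts.size, "origins:", [...originCounts.entries()]);
```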

Prevent recursive calls of XmlHttpRequest to server

I've been googling for hours for this issue, but did not find any solution.
I am currently working on this app, built on Meteor.
Now the scenario is, after the website is opened and all the assets have been loaded in the browser, the browser constantly makes recursive xhr calls to the server. These calls are made at a regular interval of 25 seconds.
This can be seen in the Network tab of browser console. See the Pending request of the last row in image.
I can't figure out from where it originates, and why it is invoked automatically even when the user is idle.
Now the question is, How can I disable these automatic requests? I want to invoke the requests manually, i.e. when the menu item is selected, etc.
Any help will be appreciated.
In response to Jan Dvorak's comment:
When I type "e" in the search box, the list of events whose names start with the letter "e" is displayed.
The request goes with all valid parameters and the Payload like this:
And this is the response, which is valid.
The code for this action is posted here
But in the case of the automatic recursive requests, the request goes out without a payload and the response is just the letter "h", which is strange, isn't it? How can I get rid of this?
Meteor has a feature called
Live page updates.
Just write your templates. They automatically update when data in the database changes. No more boilerplate redraw code to write. Supports any templating language.
To support this feature, Meteor needs to do some server-client communication behind the scenes.
Traditionally, HTTP was created to fetch dead data. The client tells the server it needs something, and it gets something. There is no way for the server to tell the client it needs something. Later, it became needed to push some data to the client. Several alternatives came to existence:
Polling:
The client makes periodic requests to the server. The server responds with new data or says "no data" immediately. It's easy to implement and doesn't use many resources. However, it's not exactly live. It can be used for a news ticker but it's not exactly good for a chat application.
If you increase the polling frequency, you improve the update rate, but the resource usage grows with the polling frequency, not with the data transfer rate. HTTP requests are not exactly cheap. One request per second from multiple clients at the same time could really hurt the server.
hanging requests:
The client makes a request to the server. If the server has data, it sends them. If the server doesn't have data, it doesn't respond until it does. The changes are picked up immediately, no data is transferred when it doesn't need to be. It does have a few drawbacks, though:
If a web proxy sees that the server is silent, it eventually cuts off the connection. This means that even if there is no data to send, the server needs to send a keep-alive response anyways to make the proxies (and the web browser) happy.
Hanging requests don't use up (much) bandwidth, but they do take up memory. Today's servers can handle many concurrent TCP connections, so it's less of an issue than it was before. What does need to be considered is the amount of memory associated with the threads holding on to these requests - especially when the connections are tied to specific threads serving them.
Browsers have hard limits on the number of concurrent requests per domain and in total. Again, this is less of a concern now than it was before. Thus, it seems like a good idea to have one hanging request per session only.
Managing hanging requests feels kinda manual as you have to make a new request after each response. A TCP handshake takes some time as well, but we can live with a 300ms (at worst) refractory period.
Chunked response:
The client creates a hidden iFrame with a source corresponding to the data stream. The server responds with an HTTP response header immediately and leaves the connection open. To send a message, the server wraps it in a pair of <script></script> tags that the browser executes when it receives the closing tag. The upside is that there's no connection reopening but there is more overhead with each message. Moreover, this requires a callback in the global scope that the response calls.
Also, this cannot be used with cross-domain requests as cross-domain iFrame communication presents its own set of problems. The need to trust the server is also a challenge here.
Web Sockets:
These start as a normal HTTP connection but they don't actually follow the HTTP protocol later on. From the programming point of view, things are as simple as they can be. The API is a classic open/callback style on the client side and the server just pushes messages into an open socket. No need to reopen anything after each message.
There still needs to be an open connection, but it's not really an issue here with the browser limits out of the way. The browser knows the connection is going to be open for a while, so it doesn't need to apply the same limits as to normal requests.
These seem like the ideal solution, but there is one major issue: IE<10 doesn't know them. As long as IE8 is alive, web sockets cannot be relied upon. Also, the native Android browser and Opera mini are out as well (ref.).
Still, web sockets seem to be the way to go once IE8 (and IE9) finally dies.
What you see are hanging requests with the timeout of 25 seconds that are used to implement the live update feature. As I already said, the keep-alive message ("h") is used so that the browser doesn't think it's not going to get a response. "h" simply means "nothing happens".
Chrome supports web sockets, so Meteor could have used them with a fallback to long requests, but, frankly, hanging requests are not at all bad once you've got them implemented (sure, the browser connection limit still applies).
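If you ever need to tell these keep-alive responses apart from real data in your own logging, a small classifier is enough. This is only a sketch and rests on an assumption the question doesn't confirm: that the transport is SockJS, whose frames, as far as I know, start with a one-character type ("o" open, "h" heartbeat, "a" followed by a JSON array of messages, "c" followed by a JSON close reason). The function name is mine.

```javascript
// Classify a raw response body, ASSUMING SockJS-style framing
// (one-character frame type, optionally followed by a JSON payload).
function classifyFrame(raw) {
  const type = raw.charAt(0);
  const body = raw.slice(1);
  switch (type) {
    case "o":
      return { kind: "open" };
    case "h":
      return { kind: "heartbeat" }; // the lone "h" seen every 25 seconds
    case "a":
      return { kind: "messages", messages: JSON.parse(body) };
    case "c":
      return { kind: "close", detail: JSON.parse(body) };
    default:
      return { kind: "unknown", raw };
  }
}

console.log(classifyFrame("h"));
console.log(classifyFrame('a["{\\"msg\\":\\"ping\\"}"]'));
```

The point is that "h" carries no application data, so there is nothing to handle; the request exists only to keep the channel open.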

Does using jQuery.get effectively double the ping time?

Suppose I have some script myScript.js that uses jQuery.get() to retrieve a small piece of data from the server. Suppose also that my ping time is horrible at 1500ms. Does using jQuery.get effectively double the ping time to 3000ms?
Or is there async magic that allows some sort of parallel processing? The reason I'm asking is that we use jQuery.get() fairly liberally and I'm wondering if it is an area we need to look at optimizing.
Edit: double compared to if I can somehow rearrange things to just load all the data upon the initial load and bypass jQuery get altogether
Ping time is a property of the network path between the client and the server, whereas jQuery runs entirely on the client. So the answer is no, it doesn't affect your ping time.
If you're asking if using jQuery.get (or ajax in general) can make your client side slower then the answer is that yes, the more JS you have then generally the slower the client gets if you're trying to process a lot of things since everything pretty much runs on the same thread. However, by default these ajax requests are asynchronous so until the server sends the response back the thread is usually idling anyways.
I'd suggest you open your page in Chrome and use the developer tools to see the network usage. That will tell you exactly how much time is taken 'waiting' on the server.
If you break down a request, you can get an idea of what latency you can expect.
Every TCP connection begins with a three-way-handshake:
SYN (client to server)
SYN-ACK (server to client)
ACK (client to server)
If the request fits in one TCP packet (~1500 bytes), it can be sent along with the last part of the handshake to optimize the network flow.
The response might be sent in just one packet as well (depending on its size). Once sent, both sides engage in a connection termination which takes two pairs of FIN-ACK sequences unless the connection is kept alive. At this point I'm not entirely sure whether the server can send FIN together with the last response packet.
So, in the best case scenario you can expect at least 2x ping time, but more likely 3-4x.
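The arithmetic behind that estimate can be written down as a small function. It is a back-of-the-envelope sketch: the function name, the option names, and the idea of counting TLS as extra whole round trips are my simplifications, and connection teardown is ignored because it doesn't delay the response.

```javascript
// Estimate the time until a small response arrives, in milliseconds.
// A new connection pays one round trip for the TCP handshake plus any
// TLS round trips; the request/response exchange itself costs one more.
function estimateRequestLatency(rttMs, options = {}) {
  const { newConnection = true, tlsRoundTrips = 0, serverMs = 0 } = options;
  const setup = newConnection ? rttMs * (1 + tlsRoundTrips) : 0;
  return setup + rttMs + serverMs;
}

const ping = 1500; // the ping time from the question

console.log(estimateRequestLatency(ping));                           // new plain connection
console.log(estimateRequestLatency(ping, { newConnection: false })); // reused connection
console.log(estimateRequestLatency(ping, { tlsRoundTrips: 2 }));     // new connection with a 2-round-trip TLS handshake
```

With a kept-alive connection the cost drops back to roughly one ping time per request, so the doubling mostly applies to the first call on a fresh connection.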

Is there any good trick for a server to handle more requests if I don't have to send any data back?

I want to handle a lot of (> 100k/sec) POST requests from JavaScript clients with some kind of service server. Not much of this data will be stored, but I have to process all of it, so I cannot spend all of my server's power on serving requests alone. All the processing needs to be done in the same server instance; otherwise I'll need to use a database for synchronization between servers, which will be slower by orders of magnitude.
However, I don't need to send any data back to the clients, and they don't even expect any.
So far my plan was to create a few proxy server instances which will buffer the requests and send them to the main server in bigger packs.
For example, let's say that I need to handle 200k requests / sec and each server can handle 40k. I can split the load between 5 of them. Each one will then buffer requests and send them on to the main server in packs of 100. This will result in 2k requests / sec on the main server (however, each message will be 100 times bigger, which probably means around 100-200kB). I could even send them to the main server using UDP to decrease the amount of resources needed (then I need only one socket on the main server, right?).
I'm just wondering whether there is another way to speed things up, especially since, as I said, I don't need to send anything back. I also have full control over the JavaScript clients, but unfortunately JavaScript is unable to send data using UDP, which would probably be the solution for me (I don't even care if 0.1% of the data is lost).
Any ideas?
Edit in response to the answers given so far.
The problem isn't with the server being too slow at processing events from the queue or with putting events in the queue itself. In fact I plan to use the disruptor pattern, which was proven to process up to 6 million requests per second.
The only problem I can potentially have is the need to have 100, 200 or 300k sockets open at the same time, which cannot be handled by any of the mainstream servers. I know some custom solutions are possible, but I'm wondering whether there is a way to make even better use of the fact that I don't have to reply to clients.
(For example some way to embed part of the data in initial TCP packet and handle TCP packets as they would be UDP. Or some other kind of magic ;))
Make a single, fast function (probably in C) that receives all requests from a very fast server (like nginx). The only job of this function is to store the requests in a very fast queue (like Redis, if you have enough RAM).
In another process (or server), pop the queue and do the real work, processing requests one by one.
If you have control of the clients, as you say, then your proxy server doesn't even need to be an HTTP server, because you can assume that all of the requests are valid.
You could implement it as a non-HTTP server that simply sends back a 200, reads the client request until it disconnects, and then queues the requests for processing.
I think what you're describing is an implementation of a Message Queue. You also will need something to hand off these requests to whatever queue you use (RabbitMQ is quite good, there are many alternatives).
You'll also need something else running which can do whatever processing you actually want on the requests. You haven't made that very clear, so I'm not too sure exactly what would be right for you. Essentially the idea is that incoming requests are dumped as quickly and simply as possible into the queue by your web server, and then the web server is free to go back to serving more requests. When the system has some resources, it uses them to process the queue, but when it's busy the queue just keeps growing.
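The "dump it in a queue and hand it off in packs" idea can be sketched in a few lines. This is an in-memory illustration only, not a replacement for a real message queue; the class name, the callback, and the batch size of 100 (taken from the question's example) are assumptions for the sake of the sketch.

```javascript
// Minimal in-memory batching queue: accepting an item is cheap, and the
// expensive hand-off happens once per full batch.
class BatchingQueue {
  constructor(batchSize, forward) {
    this.batchSize = batchSize;
    this.forward = forward; // called with an array of items per batch
    this.pending = [];
  }

  push(item) {
    this.pending.push(item);
    if (this.pending.length >= this.batchSize) this.flush();
  }

  // Forward whatever is buffered, e.g. on a timer so partial batches
  // are not held back indefinitely.
  flush() {
    if (this.pending.length === 0) return;
    const batch = this.pending;
    this.pending = [];
    this.forward(batch);
  }
}

// 250 incoming requests become three forwarded batches (100, 100, 50).
const forwardedSizes = [];
const queue = new BatchingQueue(100, (batch) => forwardedSizes.push(batch.length));
for (let i = 0; i < 250; i++) queue.push({ id: i });
queue.flush();
console.log(forwardedSizes);
```

At the rates in the question, 200k requests per second in batches of 100 works out to 2k forwards per second, which matches the asker's own estimate.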
Not sure what platform you're on, but might want to look at something like Lighttpd for serving the POSTs. You might (if same-domain restrictions don't shoot you down) get away with having Lighttpd running on a subdomain of your application (so Failing that you could put a proper load balancer in front of your webservers altogether (so all requests go to and the load balancer decides whether to forward them to the web server or the queue processor).
Hope that helps
Consider using MongoDB for persisting your requests; its fire-and-forget mechanism can help your servers respond faster.

WebSockets: useful for reducing overhead?

I am building a dynamic search (updated with every keystroke): my current scheme is to, at each keystroke, send a new AJAX request to the server and get data back in JSON.
I considered opening a WebSocket for every search "session" in order to save some overhead.
I know that this will save time, but the question is whether it is really worth it, considering these parameters:
80ms average ping time
166ms: time between each keystroke, assuming the user types relatively fast
A worst-case transfer rate of 1MB/s, with each data pack that has to be received on every keystroke being no more than 1KB.
The app also takes something like 30-40ms to weld the search results to the DOM.
I found this: HTTP vs Websockets with respect to overhead, but it was a different use case.
Will websockets reduce anything besides the pure HTTP overhead? How much is the HTTP overhead (assuming no cookies and minimal headers)?
I guess that HTTP requests open a new network socket on each request, while the WebSocket allows us to use only one all the time. If my understanding is correct, what is the actual overhead of opening a new network socket?
It seems like WebSockets provide better performance in situations like yours.
WebSocket:
small handshake header
full-duplex communication after the handshake
after the connection is established, as few as 2 bytes of framing are added per message (plus a 4-byte masking key on messages sent by the client)
HTTP:
headers are sent along with each request
On the other hand, WebSocket is a relatively new technology. It would be wise to investigate web browser support and potential network-related issues.
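To put a number on the per-message framing cost, here is a small calculation based on the WebSocket base framing rules (RFC 6455). The function name and the sample HTTP request text are made up for illustration; real browsers usually send considerably more headers than this.

```javascript
// Bytes of WebSocket framing added to one message:
// 2 bytes minimum, +2 for payloads of 126..65535 bytes, +8 for larger ones,
// and +4 for the masking key on frames sent by the client.
function webSocketFrameOverhead(payloadLength, fromClient) {
  let bytes = 2;
  if (payloadLength > 65535) bytes += 8;
  else if (payloadLength > 125) bytes += 2;
  if (fromClient) bytes += 4;
  return bytes;
}

// A short search query from the client and a 1KB result from the server.
const queryOverhead = webSocketFrameOverhead(20, true);
const resultOverhead = webSocketFrameOverhead(1024, false);

// Hypothetical minimal HTTP request for comparison (no cookies).
const minimalHttpRequest =
  "GET /search?q=e HTTP/1.1\r\nHost: example.com\r\nAccept: application/json\r\n\r\n";

console.log({ queryOverhead, resultOverhead, httpRequestBytes: minimalHttpRequest.length });
```

Either way the overhead is tiny next to an 80ms round trip, so with keep-alive connections the latency per keystroke is likely dominated by the ping time rather than by header bytes.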

