I have a web application that listens for Server-Sent Events. While I was working and testing with multiple windows open, things were not working, and I banged my head against the wall several times looking in the wrong direction. Eventually, I realized that the problem was concurrent connections.
However, I was testing with a very limited number of connections, even though I am running the test on Apache (I know, I should use Node).
I then switched browsers and noticed something really interesting: apparently Chrome limits Server-Sent Events connections to 4-5, while Opera doesn't. Firefox, on the other hand, refuses to load any other page after 4-5 simultaneous connections.
What is the reason behind this? Does the limit only apply to SSE connections from the same source, or would it be the same if I were to open them from a different domain? Is there any chance that I am misusing SSE and this is actually blocking the browsers, or is this a known behaviour? Is there any way around it?
The way this works in all browsers is that each domain gets a limited number of connections, and the limits are global for your whole application. That means if you have one connection open for realtime communication, you have one less for loading images, CSS and other pages. On top of that, you don't get new connections for new tabs or windows; all of them need to share the same pool of connections. This is very frustrating, but there are good reasons for limiting the connections. A few years back, this limit was 2 in all browsers (based on the rules in the HTTP/1.1 spec, http://www.ietf.org/rfc/rfc2616.txt), but now most browsers use 4-10 connections in general. Mobile browsers, on the other hand, still need to limit the number of connections for battery-saving purposes.
These tricks are available:
1. Use more host names. By assigning e.g. www1.example.com and www2.example.com, you get new connections for each host name. This trick works in all browsers. Don't forget to change the cookie domain to include the whole domain (example.com, not www.example.com).
2. Use web sockets. Web sockets are not limited by these restrictions and, more importantly, they are not competing with the rest of your website's content.
3. Reuse the same connection when you open new tabs/windows. If you have gathered all realtime communication logic in an object called Hub, you can reuse that object in all opened windows like this:
window.hub = (window.opener && window.opener.hub) || new Hub();
4. Or use Flash. Not quite the best advice these days, but it might still be an option if websockets aren't an option.
5. Remember to add a few seconds of delay between each SSE request to let queued requests clear before starting a new one. Also add a little more waiting time for each second the user is inactive; that way you can concentrate your server resources on the users that are active. Also add a random delay to avoid the Thundering Herd Problem.
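As a rough illustration of trick 5, here is a minimal sketch of a delay calculation that combines a base wait, an inactivity penalty and random jitter. All names and constants below are made up for the example and would need tuning for a real application.

```javascript
// Hypothetical helper: none of these names or constants come from the answer above.
const BASE_DELAY_MS = 2000;   // "a few seconds" between SSE requests
const IDLE_PENALTY_MS = 100;  // extra wait per second of user inactivity
const MAX_DELAY_MS = 60000;   // cap so idle users still reconnect eventually
const MAX_JITTER_MS = 1000;   // random spread to avoid synchronized reconnects

// Returns how long to wait (in ms) before opening the next SSE request.
// `random` is injectable so the function can be checked deterministically.
function nextReconnectDelay(idleSeconds, random = Math.random) {
  const idlePenalty = Math.max(0, idleSeconds) * IDLE_PENALTY_MS;
  const jitter = Math.floor(random() * MAX_JITTER_MS);
  return Math.min(BASE_DELAY_MS + idlePenalty, MAX_DELAY_MS) + jitter;
}
```

In a page this would typically be used as setTimeout(connect, nextReconnectDelay(idleSeconds)), where connect and idleSeconds are whatever your own code provides.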
Another thing to remember: when using a multithreaded and blocking language such as Java or C#, you risk using resources in your long polling request that are needed for the rest of your application. For example, in C# each request locks the Session object, which means that the whole application is unresponsive during the time an SSE request is active.
Node.js is great for these things for many reasons, as you have already figured out. If you were using Node.js you would have used socket.io or engine.io, which take care of all these problems for you by using websockets, flashsockets and XHR-polling. Node.js is also non-blocking and single-threaded, which means it consumes very few resources on the server while it is waiting for things to send. A C# application consumes one thread per waiting request, which takes at least 2MB of memory just for the thread.
One way to get around this issue is to shut down the connections on all the hidden tabs, and reconnect when the user visits a hidden tab.
I'm working with an application that uniquely identifies users which allowed me to implement this simple work-around:
1. When users connect to SSE, store their identifier, along with a timestamp of when their tab loaded. If you are not currently identifying users in your app, consider using sessions & cookies.
2. When a new tab opens and connects to SSE, in your server-side code, send a message to all other connections associated with that identifier (that do not have the current timestamp) telling the front-end to close down the EventSource. The front-end handler would look something like this:
myEventSourceObject.addEventListener('close', () => {
  myEventSourceObject.close();
  myEventSourceObject = null;
});
3. Use the JavaScript Page Visibility API to check whether an old tab is visible again, and reconnect that tab to the SSE if it is.
document.addEventListener('visibilitychange', () => {
  if (!document.hidden && myEventSourceObject === null) {
    // reconnect your eventsource here
  }
});
4. If you set up your server code as step 2 describes, on reconnect the server-side code will remove all the other connections to the SSE. Hence, you can click between your tabs and the EventSource for each tab will only be connected when you are viewing the page.
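The server-side bookkeeping described above could be sketched roughly as follows. This is only an illustration: the SseRegistry class and the send(eventName, data) callback are hypothetical names, standing in for however your own server writes an event to one open SSE response.

```javascript
// Sketch only: `send(eventName, data)` is an assumed callback that writes one
// event to a single open SSE response; it is not a standard API.
class SseRegistry {
  constructor() {
    this.connectionsByUser = new Map(); // userId -> array of { loadedAt, send }
  }

  // Register the newest tab and ask every older tab of the same user to close.
  connect(userId, loadedAt, send) {
    const existing = this.connectionsByUser.get(userId) || [];
    for (const conn of existing) {
      if (conn.loadedAt !== loadedAt) {
        conn.send('close', 'superseded'); // triggers the front-end 'close' handler
      }
    }
    // Only the newest connection is kept; the older ones close themselves.
    this.connectionsByUser.set(userId, [{ loadedAt, send }]);
  }
}
```

On the wire, such a message would be an SSE event named close, which is what the addEventListener('close', ...) handler on the front-end listens for.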
Note that the Page Visibility API isn't available in some legacy browsers:
https://caniuse.com/#feat=pagevisibility
2022 Update
This problem has been fixed in HTTP/2.
According to the Mozilla docs:
When not used over HTTP/2, SSE suffers from a limitation to the maximum number of open connections, which can be especially painful when opening multiple tabs, as the limit is per browser and is set to a very low number (6).
The issue has been marked as "Won't fix" in Chrome and Firefox.
This limit is per browser + domain, which means that you can open 6 SSE connections across all of the tabs to www.1.example and another 6 SSE connections to www.2.example (per Stackoverflow).
When using HTTP/2, the maximum number of simultaneous HTTP streams is negotiated between the server and the client (defaults to 100).
Spring Boot 2.1+ ships by default with Tomcat 9.0.x which supports HTTP/2 out of the box when using JDK 9 or later.
If you are using any other backend, enable HTTP/2 to fix this issue.
You are right about the number of simultaneous connections.
You can check this list for max values: http://www.browserscope.org/?category=network
And unfortunately, I never found any workaround, except multiplexing and/or using different hostnames.
Related
I have a WebSocket-based chat application (HTML5).
The browser opens a socket connection to a Java-based WebSocket server over wss.
When the browser connects to the server directly (without any proxy), everything works well.
But when the browser is behind an enterprise proxy, the browser's socket connection closes automatically after approximately 2 minutes of inactivity.
The browser console shows "Socket closed".
In my test environment I have a Squid-Dansguardian proxy server.
Important: this behaviour is not observed if the browser is connected without any proxy.
To keep some activity going, I embedded a simple jQuery script which makes an HTTP GET request to another server every 60 seconds. But it did not help. I still get "Socket closed" in my browser console after about 2 minutes of inactivity.
Any help or pointers are welcome.
Thanks
This seems to me to be a feature, not a bug.
In production applications there is an issue related with what is known as "half-open" sockets - see this great blog post about it.
It happens that connections are lost abruptly, causing the TCP/IP connection to drop without informing the other party to the connection. This can happen for many different reasons - wifi signals or cellular signals are lost, routers crash, modems disconnect, batteries die, power outages...
The only way to detect if the socket is actually open is to try and send data... BUT, your proxy might not be able to safely send data without interfering with your application's logic*.
After two minutes, your proxy assumes that the connection was lost and closes the socket on its end to save resources and allow new connections to be established.
If your proxy didn't take this precaution, on a long enough timeline all your available resources would be taken by dropped connections that would never close, preventing access to your application.
Two minutes is a lot. On Heroku they set the proxy for 50 seconds (more reasonable). For HTTP connections, these timeouts are often much shorter.
The best option for you is to keep sending websocket data within the 2 minute timeframe.
The Websocket protocol resolves this issue by implementing an internal ping mechanism - use it. These pings should be sent by the server and the browser responds to them with a pong directly (without involving the javascript application).
The Javascript API (at least on the browser) doesn't let you send ping frames (it's a security thing I guess, that prevents people from using browsers for DoS attacks).
A common practice among some developers (which I think is misguided) is to implement a JSON ping message that is either ignored by the server or results in a JSON pong.
Since you are using Java on the server, you have access to the Ping mechanism and I suggest you implement it.
I would also recommend (if you have control of the Proxy) that you lower the timeout to a more reasonable 50 seconds limit.
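To make the ping/pong idea concrete, here is a minimal sketch of a server-side heartbeat, written in JavaScript purely for illustration (the server in the question is Java). It assumes a socket object exposing ping(), terminate() and on('pong', fn), which is the shape of the WebSocket objects in the Node "ws" library; createHeartbeat itself is a hypothetical helper, and your Java server will have its own equivalent calls.

```javascript
// Sketch: `socket` is assumed to expose ping(), terminate() and on('pong', fn),
// as the Node "ws" library's WebSocket does; other servers will differ.
function createHeartbeat(socket) {
  let alive = true; // true when a pong arrived since the last ping
  socket.on('pong', () => { alive = true; });

  // Call this on a timer that is shorter than the proxy's idle timeout.
  return function tick() {
    if (!alive) {
      socket.terminate(); // no pong since the last ping: treat as half-open
      return false;
    }
    alive = false;
    socket.ping(); // the browser answers with a pong without involving page scripts
    return true;
  };
}
```

With a 2 minute proxy timeout, something like setInterval(tick, 50000) would keep traffic flowing well inside the window, and it also lets the server release connections that never answer.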
* The situation during production is actually even worse...
Because there is a long chain of intermediaries (home router/modem, NAT, ISP, Gateways, Routers, Load Balancers, Proxies...) it's very likely that your application can send data successfully because it's still "connected" to one of the intermediaries.
This should start a chain reaction that will only reach the application after a while, and again ONLY if it attempts to send data.
This is why Ping frames expect Pong frames to be returned (meaning the chain of connection is intact).
P.S.
You should probably also complain about the Java application not closing the connection after a certain timeout. During production, this oversight might force you to restart your server every so often or experience a DoS situation (all available file handles will be used for the inactive old connections and you won't have room for new connections).
Check squid.conf for the request_timeout value; you can raise the timeout there. This will affect more than just web sockets. For instance, in an environment I frequently work in, a Perl script is called to generate various configurations. Execution can take upwards of 5-10 minutes to complete. The timeout value on both our httpd and the Squid server had to be raised to compensate for this.
Also look at the connect_timeout value. That defaults to one minute.
I've been googling for hours for this issue, but did not find any solution.
I am currently working on this app, built on Meteor.
Now the scenario is: after the website is opened and all the assets have been loaded in the browser, the browser constantly makes repeated XHR calls to the server. These calls are made at a regular interval of 25 seconds.
This can be seen in the Network tab of browser console. See the Pending request of the last row in image.
I can't figure out from where it originates, and why it is invoked automatically even when the user is idle.
Now the question is: how can I disable these automatic requests? I want to invoke the requests manually, i.e. when a menu item is selected, etc.
Any help will be appreciated.
[UPDATE]
In response to Jan Dvorak's comment:
When I type "e" in the search box, the list of events whose names start with the letter "e" is displayed.
The request goes with all valid parameters and the Payload like this:
["{\"msg\":\"sub\",\"id\":\"8ef5e419-c422-429a-907e-38b6e669a493\",\"name\":\"event_Coll_Search_by_PromoterName\",\"params\":[\"e\"]}"]
And this is the response, which is valid.
a["{\"msg\":\"data\",\"subs\":[\"8ef5e419-c422-429a-907e-38b6e669a493\"]}"]
The code for this action is posted here
But in the case of the automatic repeated requests, the request goes without a payload and the response is just the letter "h", which is strange, isn't it? How can I get rid of this?
Meteor has a feature called "Live page updates":
Just write your templates. They automatically update when data in the database changes. No more boilerplate redraw code to write. Supports any templating language.
To support this feature, Meteor needs to do some server-client communication behind the scenes.
Traditionally, HTTP was created to fetch dead data. The client tells the server it needs something, and it gets something. There is no way for the server to tell the client it needs something. Later, it became necessary to push some data to the client. Several alternatives came into existence:
polling:
The client makes periodic requests to the server. The server responds with new data or says "no data" immediately. It's easy to implement and doesn't use much resources. However, it's not exactly live. It can be used for a news ticker but it's not exactly good for a chat application.
If you increase the polling frequency, you improve the update rate, but the resource usage grows with the polling frequency, not with the data transfer rate. HTTP requests are not exactly cheap. One request per second from multiple clients at the same time could really hurt the server.
hanging requests:
The client makes a request to the server. If the server has data, it sends them. If the server doesn't have data, it doesn't respond until it does. The changes are picked up immediately, no data is transferred when it doesn't need to be. It does have a few drawbacks, though:
If a web proxy sees that the server is silent, it eventually cuts off the connection. This means that even if there is no data to send, the server needs to send a keep-alive response anyways to make the proxies (and the web browser) happy.
Hanging requests don't use up (much) bandwidth, but they do take up memory. Modern servers can handle many concurrent TCP connections, so it's less of an issue than it was before. What does need to be considered is the amount of memory associated with the threads holding on to these requests, especially when the connections are tied to specific threads serving them.
Browsers have hard limits on the number of concurrent requests per domain and in total. Again, this is less of a concern now than it was before. Thus, it seems like a good idea to have one hanging request per session only.
Managing hanging requests feels kinda manual as you have to make a new request after each response. A TCP handshake takes some time as well, but we can live with a 300ms (at worst) refractory period.
Chunked response:
The client creates a hidden iFrame with a source corresponding to the data stream. The server responds with an HTTP response header immediately and leaves the connection open. To send a message, the server wraps it in a pair of <script></script> tags that the browser executes when it receives the closing tag. The upside is that there's no connection reopening but there is more overhead with each message. Moreover, this requires a callback in the global scope that the response calls.
Also, this cannot be used with cross-domain requests as cross-domain iFrame communication presents its own set of problems. The need to trust the server is also a challenge here.
Web Sockets:
These start as a normal HTTP connection but they don't actually follow the HTTP protocol later on. From the programming point of view, things are as simple as they can be. The API is a classic open/callback style on the client side and the server just pushes messages into an open socket. No need to reopen anything after each message.
There still needs to be an open connection, but it's not really an issue here with the browser limits out of the way. The browser knows the connection is going to be open for a while, so it doesn't need to apply the same limits as to normal requests.
These seem like the ideal solution, but there is one major issue: IE<10 doesn't know them. As long as IE8 is alive, web sockets cannot be relied upon. Also, the native Android browser and Opera mini are out as well (ref.).
Still, web sockets seem to be the way to go once IE8 (and IE9) finally dies.
What you see are hanging requests with the timeout of 25 seconds that are used to implement the live update feature. As I already said, the keep-alive message ("h") is used so that the browser doesn't think it's not going to get a response. "h" simply means "nothing happens".
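For illustration, a client could tell the two kinds of response shown in the question apart with something like the sketch below. The frame format is inferred only from the payloads quoted above ("a" followed by a JSON array of JSON strings, and a bare "h"); parseFrame is a hypothetical helper, not part of Meteor's API.

```javascript
// Sketch: frame prefixes are inferred from the payloads shown in the question
// ("a[...]" carries messages, "h" is the keep-alive); anything else is "unknown".
function parseFrame(frame) {
  if (frame === 'h') {
    return { type: 'heartbeat', messages: [] };
  }
  if (frame.startsWith('a')) {
    // The body is a JSON array of strings, each string itself a JSON document.
    const messages = JSON.parse(frame.slice(1)).map((m) => JSON.parse(m));
    return { type: 'data', messages };
  }
  return { type: 'unknown', messages: [] };
}
```

Seen this way, the "h" responses carry no application data at all, so there is nothing for your own code to handle when one arrives.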
Chrome supports web sockets, so Meteor could have used them with a fallback to long requests, but, frankly, hanging requests are not at all bad once you've got them implemented (sure, the browser connection limit still applies).
I have recently discovered that intermittent failures for users running my application in Internet Explorer are due to a bug in Internet Explorer. The bug is in the HTTP stack, and should be affecting all applications using POST requests from IE. The result is a failure characterized by a request that seems to hang for about 5 minutes (depending on server type and configuration), then fails from the server end. The browser application will error out of the POST request after the server gives up. I will explain the IE bug in detail below.
As far as I can tell this will happen with any application using XMLHttpRequest to send POST requests to the server, if the request is sent at the wrong moment. I have written a sample program that attempts to send the POSTs at just those times. It attempts to send continuous POSTs to the server at the precise moment the server closes the connections. The interval is derived from the Keep-Alive header sent by the server.
I am finding that when running from IE to a server with a bit of latency (i.e. not on the same LAN), the problem occurs after only a few POSTs. When it happens, IE locks up so hard that it has to be force closed. The ticking clock is an indication that the browser is still responding.
You can try it by browsing to: http://pubdev.hitech.com/test.post.php. Please take care that you don't have any important unsaved information in any IE session when you run it, because I am finding that it will crash IE.
The full source can be retrieved at: http://pubdev.hitech.com/test.post.php.txt. You can run it on any server that has php and is configured for persistent connections.
My questions are:
What are other people's experiences with this issue?
Is there a known strategy for working around this problem (other than "use another browser")?
Does Microsoft have better information about this issue than the article I found (see below)?
The problem is that web browsers and servers by default use persistent connections as described in RFC 2616 section 8.1 (see http://www.ietf.org/rfc/rfc2616.txt). This is very important for performance--especially for AJAX applications--and should not be disabled. There is however a small timing hole where the browser may start to send a POST on a previously used connection at the same time the server decides the connection is idle and decides to close it. The result is that the browser's HTTP stack will get a socket error because it is using a closed socket. RFC 2616 section 8.1.4 anticipates this situation, and states, "...clients, servers, and proxies MUST be able to recover from asynchronous close events. Client software SHOULD reopen the transport connection and retransmit the aborted sequence of requests without user interaction..."
Internet Explorer does resend the POST when this happens, but when it does it mangles the request. It sends the POST headers, including the Content-Length of the data as posted, but it does not send the data. This is an improper request, and the server will wait an unspecified amount of time for the promised data before failing the request with an error. I have been able to demonstrate this failure 100% of the time using a C program that simulates an HTTP server, which closes the socket of an incoming POST request without sending a response.
Microsoft seems to acknowledge this failure in http://support.microsoft.com/kb/895954. They say that it affects IE versions 6 through 9. They provide a hotfix for this problem, which has shipped with all versions of IE since IE 7. The hotfix does not seem satisfactory for the following reasons:
It is not enabled unless you use regedit to add a key called FEATURE_SKIP_POST_RETRY_ON_INTERNETWRITEFILE_KB895954 to the registry. This is not something I would expect my users to have to do.
The hotfix does not actually fix the broken POST. Instead, if the socket gets closed as anticipated by the RFC, it simply errors out immediately without trying to resend the POST. The application still fails--it just fails sooner.
The following example is a self-contained PHP program that demonstrates the bug. It attempts to send continuous POSTs to the server at the precise moment the server closes the connections. The interval is derived from the Keep-Alive header sent by the server.
We've encountered this problem with IE on a regular basis. There is no good solution. The only solution that is guaranteed to solve the problem is to ensure that the web server keepalive timeout is higher than the browser keepalive timeout (by default with IE this is 60s). Any situation where the web server is set to a lower value can result in IE attempting to reuse the connection and sending a request that gets rejected with a TCP RST because the socket has been closed. If the web server keepalive timeout is higher than IE's, IE gives up on an idle connection before the server closes it, so it never reuses a socket that the server has already closed. With high-latency connections you'll also have to account for the latency, as the time spent in transit could be an issue.
Keep in mind however, that increasing the keepalive on the server means that an idle connection is using server sockets for that much longer. So you may need to size the server to handle a large number of inactive idle connections. This can be a problem as it may result in a burst of load to the server that the server isn't able to handle.
Another thing to keep in mind. You note that the RFC section 8.1.4 states :"...clients, servers, and proxies MUST be able to recover from asynchronous close events. Client software SHOULD reopen the transport connection and retransmit the aborted sequence of requests without user interaction..."
You forgot a very important part. Here's the full text:
Client software SHOULD reopen the transport connection and retransmit the aborted sequence of requests without user interaction so long as the request sequence is idempotent (see section 9.1.2). Non-idempotent methods or sequences MUST NOT be automatically retried, although user agents MAY offer a human operator the choice of retrying the request(s). Confirmation by user-agent software with semantic understanding of the application MAY substitute for user confirmation. The automatic retry SHOULD NOT be repeated if the second sequence of requests fails.
An HTTP POST is non-idempotent as defined by 9.1.2. Thus the behavior of the registry hack is actually technically correct per the RFC.
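The rule quoted above can be expressed as a small check. This is a sketch of the RFC's wording, not of what any browser actually implements; mayAutoRetry is a hypothetical name, and the method list is the one given in RFC 2616 section 9.1.2.

```javascript
// Idempotent methods as listed in RFC 2616 section 9.1.2.
const IDEMPOTENT_METHODS = new Set(['GET', 'HEAD', 'PUT', 'DELETE', 'OPTIONS', 'TRACE']);

// `attemptsSoFar` counts requests already sent (1 after the first failure).
// Only one automatic retry is allowed: "SHOULD NOT be repeated if the second
// sequence of requests fails".
function mayAutoRetry(method, attemptsSoFar) {
  return IDEMPOTENT_METHODS.has(String(method).toUpperCase()) && attemptsSoFar < 2;
}
```

Under this reading, a failed GET may be resent once without asking the user, while a failed POST may not be resent automatically at all.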
No, generally POST works in IE. What you are describing may be an issue, but it isn't such a major issue as to deserve this huge a post.
And when you issue a POST Ajax request, to make sure every browser inconsistency is covered, just use jQuery.
One more thing:
No one sane will tell you to "use another browser", because IE is widely used and needs to be taken care of (well, except IE6 and, for some, maybe even some newer versions).
So, POST has to work in IE, but to cover yourself against unexpected buggy behavior, use jQuery and you can sleep well.
I have never encountered this issue. And our clients mostly run IE6.
I suspect you've configured your keep-alive timer too long. Most people configure it to be under 1 second because persistent connections are only meant to speed up page loading not service Ajax calls.
If you have keep-alive configured too long you'll face much more severe problems than IE crashing - your server will run out of file descriptors to open sockets!*
* note: Incidentally, opening and not closing connections to HTTP servers is a well known DOS attack that tries to force the server to reach its max open socket limit. Which is why most server admins also configure connection timeouts to avoid having sockets open for too long.
I'm creating a simple online multiplayer game, with which two players (clients) can play the game with each other. The data is sent to and fetched from a server, which manages all data concerning this.
The problem I'm facing is how to fetch updates from the server efficiently. The data is fetched using AJAX: every 5 seconds, data is fetched from the server to check for updates. This is however done using HTTP, which means all headers are sent each time as well. The data itself is kept to an absolute minimum.
I was wondering if anyone would have tips on how to save bandwidth in this server/client scenario. Would it be possible to fetch using a custom protocol or something like that, to prevent all headers (like 'Server: Apache') being sent each single time? I basically only need the very data (only 9 bytes) and not all headers (which are like 100 bytes if it's not more).
Thanks.
Comet or Websockets
HTML5's websockets (as mentioned in other answers here) may have limited browser support at the moment, but using long-lived HTTP connections to push data (aka Comet) gives you similar "streaming" functionality in a way that even IE6 can cope with. Comet is rather tricky to implement though, as it is kind of a hack taking advantage of the way browsers just happened to be implemented at the time.
Also note that both techniques will require your server to handle a lot more simultaneous connections than it's used to, which can be a problem even if they're idle most of the time. This is sometimes referred to as the C10K problem.
This article has some discussion of websockets vs comet.
Reducing header size
You may have some success reducing the HTTP headers to the minimum required to save bytes. But you will need to keep Date, as this is not optional according to the spec (RFC 2616). You will probably also need Content-Length to tell the browser the size of the body. You might be able to drop it and close the connection after sending the body bytes, but this would prevent the browser from taking advantage of HTTP/1.1 persistent connections.
Note that the Server header is not required, but Apache doesn't let you remove it completely - the ServerTokens directive controls this, and the shortest setting results in Server: Apache as you already have. I don't think other webservers usually let you drop the Server header either, but if you're on a shared host you're probably stuck with Apache as configured by your provider.
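To put the numbers from the question in perspective, here is a small back-of-the-envelope sketch. It only counts response bytes (9 bytes of data, roughly 100 bytes of headers, one poll every 5 seconds, all taken from the question); request headers and TCP overhead are ignored, and pollingCost is a hypothetical helper.

```javascript
// Rough estimate of response traffic for fixed-interval polling.
// Request headers and TCP/IP overhead are deliberately left out.
function pollingCost(payloadBytes, headerBytes, intervalSeconds) {
  const requestsPerHour = 3600 / intervalSeconds;
  const bytesPerResponse = payloadBytes + headerBytes;
  return {
    requestsPerHour,
    bytesPerHour: requestsPerHour * bytesPerResponse,
    headerShare: headerBytes / bytesPerResponse, // fraction of each response that is headers
  };
}

const estimate = pollingCost(9, 100, 5);
```

With these inputs each client makes 720 requests per hour and receives 78,480 response bytes, of which roughly 92% is headers, so trimming headers only helps so much compared with avoiding the repeated requests altogether.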
HTML5 websockets will be the way to do this in the near future.
http://net.tutsplus.com/tutorials/javascript-ajax/start-using-html5-websockets-today/
This isn't possible for all browsers, but it is supported in newer ones (Chrome, Safari). You should use a framework that uses websockets and then gracefully degrades to long polling (you don't want to poll at fixed intervals unless there are always events waiting). This way you will get the benefit of the newer browsers, and that pool will continue to expand as people upgrade.
For Java the common solution is Atmosphere: http://atmosphere.java.net. It has a jQuery plugin as well as an abstraction at the servlet container level.
Two related questions:
What is the maximum number of concurrent files that a web page is allowed to open (e.g., images, CSS files, etc.)? I assume this value is different in different browsers (and maybe per file type). For example, I am pretty sure that JavaScript files can only be loaded one at a time (right?).
Is there a way I can use javascript to query this information?
For Internet Explorer see this MSDN article. Basically, unless the user has edited the registry or run an 'internet speedup' program, they are going to have a maximum of two connections if using IE7 or earlier. IE8 tries to be smart about it and can create up to 6 concurrent connections, depending on the server and the type of internet connection. In JavaScript, on IE8, you can query the property window.maxConnectionsPerServer.
For Firefox, the default is 2 for FF2 and earlier, and 6 for FF3. See Mozilla documentation. I'm not aware of any way to retrieve this value from JavaScript in FF.
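A defensive way to read that property might look like the sketch below. getMaxConnectionsPerServer is a hypothetical helper; it simply returns null wherever the browser does not expose window.maxConnectionsPerServer, which per the above is every browser except IE8 as far as I know.

```javascript
// `win` is the browser's window object; it is passed in explicitly so the
// helper can also be exercised outside a browser.
function getMaxConnectionsPerServer(win) {
  const value = win && win.maxConnectionsPerServer;
  return typeof value === 'number' ? value : null; // null: not exposed here
}
```

In a page you would call getMaxConnectionsPerServer(window) and fall back to a conservative guess when the result is null.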
Most HTTP servers have little ability to restrict the number of connections from a single host, other than to ban the IP. In general, this isn't a good idea, as many users are behind a proxy or a NAT router, which would allow for multiple connections to come from the same IP address.
On the client side, you can artificially increase this number by requesting resources from multiple domains. You can set up www1, www2, etc. aliases which all point to the same web server, then mix up where the static content is being pulled from. This will incur a small overhead the first time due to the extra DNS resolution.
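One way to "mix up" the hosts without hurting browser caching is to pick the host deterministically from the resource path, so the same file is always requested from the same alias. The sketch below assumes two made-up aliases and https; shardUrl and SHARD_HOSTS are hypothetical names.

```javascript
// Hypothetical host names; they are assumed to be aliases of the same web server.
const SHARD_HOSTS = ['www1.example.com', 'www2.example.com'];

function shardUrl(path, hosts = SHARD_HOSTS) {
  // Simple deterministic string hash, so a given path always maps to the
  // same host and stays cacheable under one URL.
  let hash = 0;
  for (let i = 0; i < path.length; i++) {
    hash = (hash * 31 + path.charCodeAt(i)) >>> 0; // keep it an unsigned 32-bit int
  }
  return 'https://' + hosts[hash % hosts.length] + path;
}
```

A template could then emit shardUrl('/img/logo.png') instead of a relative URL; choosing the host at random on every page view would spread the load too, but it would defeat the browser cache.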
One interesting way to get around the X connections per server limit is to map static resources like scripts and images to their own domains... img.foo.com or js.foo.com.
I have only read about this - not actually tried or tested. So please let me know if this doesn't work.
I know at least in Firefox, this value is configurable (network.http.max-connections, network.http.max-connections-per-server, and network.http.pipelining.maxrequests), so I doubt you'll get a definitive answer on this one. The default is 4, however.
What are you attempting to accomplish?
The limitation is usually the web server. It's common that a web server only allows two concurrent downloads per user.
Active scripting engines like ASP.NET only execute one request at a time per user. Requests for static files are not handled by the scripting engine, so you can still get, for example, an image while getting an aspx file.
Pages often have content from different servers, like traffic measuring scripts and such. As the download limit is per server you can typically download two files at a time from each server.
As this is a server limitation, you can't find out anything about it using javascript.
There is nothing in HTTP that limits the number of sessions.
However, there are configuration items in FF for one, that specifically set how many total sessions and how many sessions to a single server are allowed. Other browsers may also have this feature.
In addition, a server can limit how many sessions come in totally and from each client IP address.
So the correct answer is:
1/ The number of sessions is limited to the minimum of that imposed by the client (browser) and server.
2/ Because of this, there's no reliable way to query it in JavaScript.
This is both a server and a browser limitation. General netiquette holds that no more than 4 simultaneous connections are allowable. Most servers allow a maximum of 2 connections by default, and most browsers follow suit. Most are configurable.
No.