I have not been able to get an answer to this anywhere online. I want to remove possible jitter from my nodejs server. I am using socket.io to create connections to node.
If a user goes to a specific part of my website, a connection is started. However, if the user refreshes the site too quickly and often, the connection is created very frequently, and issues arise with my server.
While I realized it's possible this could be solved a couple different ways, I am hoping a server solution is out there. Meaning, whenever a user connects, make sure the user is connected for at least 5 seconds. Then move on. Otherwise, disconnect the user. Thanks for any insight!
First off a little background. With a default configuration, when a socket.io connection starts, it first does 2-5 http connections and then once it has established the "logical" connection, it tries to establish a connection using the webSocket transport. If that is successful, then it keeps that webSocket connection as a long lasting connection and sends socket.io packets over it.
If the client refreshes in the middle of the transition to a webSocket connection, it creates a period of unknown state on the server where the server isn't sure if the user is just still in the middle of the transition to a lasting webSocket connection, if the user is gone entirely already, if the user is having some sort of connection issues or if the user is doing some refresh thing. You can easily end up with a situation where the server thinks there are multiple connections all from the same user in the process of being confirmed. It can be a bit messy if your server is sensitive to that kind of thing.
The quickest thing you can do is to force the connection process to go immediately to the webSocket transport. You can do that in the client by adding an options to your connection code:
let socket = io(yourURL, {transports: ["websocket"]});
You can also configure the server to only accept webSocket connections if you're try to protect against any other types of connections besides just from your own web pages.
This will then go through the usual webSocket connection which starts with a single http request that is then "upgraded" to the webSocket protocol. Once connection, one socket. The server will know right away, either the user is or isn't connected. And, once they've switched over to the webSocket protocol, the server will known immediately if the user hits refresh because the browser will close the webSocket immediately.
The "start with http first" feature in socket.io is largely present because in the early days of webSockets, there were some browsers that didn't yet support them and some network infrastructure (like corporate proxies) that didn't always support webSocket connections. The browser issue is completely gone now. All browsers in use support webSocket connections. I don't personally have any data on the corporate proxies issues, but I don't ever hear about any issues with people using webSockets these days so I don't think that's much of an issue any more either.
So, the above change will get you a quick, confirmed connection and get rid of the confusion around whether a user is or isn't connected early in the connection process.
Now, if you still have users who are messing things up by rapid refresh, you probably need to just implement some protection on your server for that. If you cookie each user that arrives on your server, you could create some middleware that would keep track of how many page requests in some recent time interval have come from the browser with this cookie and just return them an error page that explains they can't make requests that quickly. I would probably implement this at the web page level, not the webSocket level as that will give users better feedback to stop hitting refresh. If it's really a refresh you're trying to protect against and not general navigation on your site, then you can keep a record of a combination cookie and URL and if you see even two of those within a few seconds, then return the error page instead of the expected content. If you redirect to an error page, it forces a more conscious action to go back to the right page again before they can get to the content.
Related
So I have server receiving wss connections in Java hosted on google kubernetes engine.
My website connects to the websocket connection in javascript using the default WebSocket library that comes with javascript.
The client connects to the server and works fine sometimes, but sometimes it fails to connect at all, doesn't get to the on open event or anything. To see if its server-side or client-side issue I made a small java app that connects to the websocket and had it run overnight doing 30 connections at once and it didn't have any problems so I think its something client side, or related to sessions keeping something cached.
Sometimes its very "sticky" once you don't connect once, even if you try to reconnect a million times it won't work, but if you use a different browser, or reset your wifi it will start working again? Also soon as the server restarts it usually works okay, but after a little while (hours or next day) it won't work.
One of the strangest behaviors is that if I inspect the request in firefox's network tab and do "edit and send" on the headers it will ALWAYS work even if the page "stuck" not working no matter how much I refresh. Here is a pic of what I mean:
this always works
The java websocket library I'm using is this one: https://github.com/TooTallNate/Java-WebSocket
I'm just using built in javascript connection library:
var wsocket = new WebSocket(address);
All I want is a consistent connection. I even tried to right code to retry the connection if it doesn't happen after a certain time, but because of the "sticky" behavior that doesn't help much.
Here is a screenshot of wireshark info, I don't really know what any of it means but maybe it can help you guys: wireshark
I feel like it has something to do with the browser/server keeping a session history or something?
On the server I get the following error at least some of the time when this occurs:
The connection was closed because the other endpoint did not respond with a pong in time.
I've a web sockets based chat application (HTML5).
Browser opens a socket connection to a java based web sockets server over wss.
When browser connects to server directly (without any proxy) everything works well.
But when the browser is behind an enterprise proxy, browser socket connection closes automatically after approx 2 minutes of no-activity.
Browser console shows "Socket closed".
In my test environment I have a Squid-Dansguardian proxy server.
IMP: this behaviour is not observed if the browser is connected without any proxy.
To keep some activity going, I embedded a simple jquery script which will make an http GET request to another server every 60 sec. But it did not help. I still get "socket closed" in my browser console after about 2 minutes of no action.
Any help or pointers are welcome.
Thanks
This seems to me to be a feature, not a bug.
In production applications there is an issue related with what is known as "half-open" sockets - see this great blog post about it.
It happens that connections are lost abruptly, causing the TCP/IP connection to drop without informing the other party to the connection. This can happen for many different reasons - wifi signals or cellular signals are lost, routers crash, modems disconnect, batteries die, power outages...
The only way to detect if the socket is actually open is to try and send data... BUT, your proxy might not be able to safely send data without interfering with your application's logic*.
After two minutes, your Proxy assume that the connection was lost and closes the socket on it's end to save resources and allow new connections to be established.
If your proxy didn't take this precaution, on a long enough timeline all your available resources would be taken by dropped connections that would never close, preventing access to your application.
Two minutes is a lot. On Heroku they set the proxy for 50 seconds (more reasonable). For Http connections, these timeouts are often much shorter.
The best option for you is to keep sending websocket data within the 2 minute timeframe.
The Websocket protocol resolves this issue by implementing an internal ping mechanism - use it. These pings should be sent by the server and the browser responds to them with a pong directly (without involving the javascript application).
The Javascript API (at least on the browser) doesn't let you send ping frames (it's a security thing I guess, that prevents people from using browsers for DoS attacks).
A common practice by some developers (which I think to be misconstructed) is to implement a JSON ping message that is either ignored by the server or results in a JSON pong.
Since you are using Java on the server, you have access to the Ping mechanism and I suggest you implement it.
I would also recommend (if you have control of the Proxy) that you lower the timeout to a more reasonable 50 seconds limit.
* The situation during production is actually even worse...
Because there is a long chain of intermediaries (home router/modem, NAT, ISP, Gateways, Routers, Load Balancers, Proxies...) it's very likely that your application can send data successfully because it's still "connected" to one of the intermediaries.
This should start a chain reaction that will only reach the application after a while, and again ONLY if it attempts to send data.
This is why Ping frames expect Pong frames to be returned (meaning the chain of connection is intact.
P.S.
You should probably also complain about the Java application not closing the connection after a certain timeout. During production, this oversight might force you to restart your server every so often or experience a DoS situation (all available file handles will be used for the inactive old connections and you won't have room for new connections).
check the squid.conf for a request_timeout value. You can change this via the request_timeout. This will affect more than just web sockets. For instance, in an environment I frequently work in, a perl script is hit to generate various configurations. Execution can take upwards of 5-10 minutes to complete. The timeout value on both our httpd and the squid server had to be raised to compensate for this.
Also, look at the connect_timeout value as well. That's defaulted to one minute..
I want a server to be able to send a client a message at any time. There may be no messages for several days, yet if one is sent, I want it to be received almost immediately (ideally within 1 second or less). How would I go about this without setting the client up as a server and using port forwarding?
An example of this would be push notifications on a mobile device. Apple can send a push notification to an iPhone almost instantly. However, the iPhone isn't acting as a server. Furthermore, the iPhone may be moving from network to network, and the networks aren't forwarding any ports to the iPhone. How does this work? Assuming there's some sort of persistent connection, how does the solution scale to hundreds of millions of devices connected at the same time?
This question doesn't depend on a particular language. I'm currently working in JS. I'm looking mostly for a conceptual answer, but feel free to answer it in the context of any language if that helps.
Remember that, once a network connection has been established, data can flow over it in either direction. There is no requirement that a peer be on the "receiving" end of a connection to receive data!
In the presence of a non-cellular network*, iPhone push notifications work by having the device connect to an Apple notification server, then wait to receive data from it. No port forwarding is necessary, as this is an outbound connection from the device. If the connection is lost, the device will reconnect to the server and "check in" to see if it missed anything.
*: If the device only has a cellular connection available, making it infeasible to keep a network connection open at all times, it may rely on notifications through the cellular network itself, kind of like SMS. But that's kind of a separate thing altogether.
We have a Rails website where users takeup online quizzes. 20-30% times users will report that due to internet disconnection they were unable to complete the quiz. Is there any way to track how many times internet disconnection occured when a user was on a particular page.
Depends on how and why you want to track it. You are able to do this with the window.navigator.onLine property on the client side which you could attempt to log to analytics when the user is online again to get an idea of how frequently this happens - MDN.
If you wanted to know on the server side, depending on the resources available to you, you may want to create a websocket (MDN) to the client. The websocket will be an ongoing connection between your rails app and the clients browser and any disconnections should be noticeable on the server side which you can keep track of. There are existing libraries for doing websockets with rails but remember that this option will likely take up more server resources as it requires persistent connections with all the online users of your site but you could transmit all your application data over this channel saving the other connections that may otherwise be required.
Another option which would require less resources but possibly more work on your part would be to have a script on the client that polls the server in some way to let it know it's still online and you could link that 'keep-alive' request to the user's information and determine a reasonable time-out if one doesn't arrive which may require a scheduled task on the server.
I've been googling for hours for this issue, but did not find any solution.
I am currently working on this app, built on Meteor.
Now the scenario is, after the website is opened and all the assets have been loaded in browser, the browser constantly makes recursive xhr calls to server. These calls are made at the regular interval of 25 seconds.
This can be seen in the Network tab of browser console. See the Pending request of the last row in image.
I can't figure out from where it originates, and why it is invoked automatically even when the user is idle.
Now the question is, How can I disable these automatic requests? I want to invoke the requests manually, i.e. when the menu item is selected, etc.
Any help will be appriciated.
[UPDATE]
In response to the Jan Dvorak's comment:
When I type "e" in the search box, the the list of events which has name starting with letter "e" will be displayed.
The request goes with all valid parameters and the Payload like this:
["{\"msg\":\"sub\",\"id\":\"8ef5e419-c422-429a-907e-38b6e669a493\",\"name\":\"event_Coll_Search_by_PromoterName\",\"params\":[\"e\"]}"]
And this is the response, which is valid.
a["{\"msg\":\"data\",\"subs\":[\"8ef5e419-c422-429a-907e-38b6e669a493\"]}"]
The code for this action is posted here
But in the case of automatic recursive requests, the request goes without the payload and the response is just a letter "h", which is strange. Isn't it? How can I get rid of this.?
Meteor has a feature called
Live page updates.
Just write your templates. They automatically update when data in the database changes. No more boilerplate redraw code to write. Supports any templating language.
To support this feature, Meteor needs to do some server-client communication behind the scenes.
Traditionally, HTTP was created to fetch dead data. The client tells the server it needs something, and it gets something. There is no way for the server to tell the client it needs something. Later, it became needed to push some data to the client. Several alternatives came to existence:
polling:
The client makes periodic requests to the server. The server responds with new data or says "no data" immediately. It's easy to implement and doesn't use much resources. However, it's not exactly live. It can be used for a news ticker but it's not exactly good for a chat application.
If you increase the polling frequency, you improve the update rate, but the resource usage grows with the polling frequency, not with the data transfer rate. HTTP requests are not exactly cheap. One request per second from multiple clients at the same time could really hurt the server.
hanging requests:
The client makes a request to the server. If the server has data, it sends them. If the server doesn't have data, it doesn't respond until it does. The changes are picked up immediately, no data is transferred when it doesn't need to be. It does have a few drawbacks, though:
If a web proxy sees that the server is silent, it eventually cuts off the connection. This means that even if there is no data to send, the server needs to send a keep-alive response anyways to make the proxies (and the web browser) happy.
Hanging requests don't use up (much) bandwidth, but they do take up memory. Nowadays' servers can handle multiple concurrent TCP connections, so it's less of an issue than it was before. What does need to be considered is the amount of memory associated with the threads holding on to these requests - especially when the connections are tied to specific threads serving them.
Browsers have hard limits on the number of concurrent requests per domain and in total. Again, this is less of a concern now than it was before. Thus, it seems like a good idea to have one hanging request per session only.
Managing hanging requests feels kinda manual as you have to make a new request after each response. A TCP handshake takes some time as well, but we can live with a 300ms (at worst) refractory period.
Chunked response:
The client creates a hidden iFrame with a source corresponding to the data stream. The server responds with an HTTP response header immediately and leaves the connection open. To send a message, the server wraps it in a pair of <script></script> tags that the browser executes when it receives the closing tag. The upside is that there's no connection reopening but there is more overhead with each message. Moreover, this requires a callback in the global scope that the response calls.
Also, this cannot be used with cross-domain requests as cross-domain iFrame communication presents its own set of problems. The need to trust the server is also a challenge here.
Web Sockets:
These start as a normal HTTP connection but they don't actually follow the HTTP protocol later on. From the programming point of view, things are as simple as they can be. The API is a classic open/callback style on the client side and the server just pushes messages into an open socket. No need to reopen anything after each message.
There still needs to be an open connection, but it's not really an issue here with the browser limits out of the way. The browser knows the connection is going to be open for a while, so it doesn't need to apply the same limits as to normal requests.
These seem like the ideal solution, but there is one major issue: IE<10 doesn't know them. As long as IE8 is alive, web sockets cannot be relied upon. Also, the native Android browser and Opera mini are out as well (ref.).
Still, web sockets seem to be the way to go once IE8 (and IE9) finally dies.
What you see are hanging requests with the timeout of 25 seconds that are used to implement the live update feature. As I already said, the keep-alive message ("h") is used so that the browser doesn't think it's not going to get a response. "h" simply means "nothing happens".
Chrome supports web sockets, so Meteor could have used them with a fallback to long requests, but, frankly, hanging requests are not at all bad once you've got them implemented (sure, the browser connection limit still applies).