We have a small set of multiplayer servers using node.js that are currently serving roughly 1 million messages a minute during peak usage. Is there a way to 'gracefully' restart the server without causing sockets to drop? Basically, I'm wondering what the best way is to handle restarts where they would normally be very disruptive to players.
When a process exits, the OS cleans up any sockets that belong to it by closing them. So, there's no way to just do a simple server restart and preserve your socket connections.
On some operating systems you can pass ownership of a socket from one process to another, so it might be technically feasible to create a temporary process (or use a previously existing parent process), pass ownership of the sockets to that other process, restart your server, then transfer ownership back to the newly started process. I've never tried this (or heard about it being done), but it sounds like something that might be feasible.
Here's some information on transferring a socket to a child process using child.send() in node.js. It appears this can only be done for a node.js socket created by the net module and there are some caveats about doing it, but it is possible.
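A rough sketch of what that hand-off could look like, assuming the sockets were created by the net module and the other process was started with child_process.fork(). The helper names (handOffSockets, adoptSockets) and the { type: 'socket' } message shape are made up for illustration; the only real API used is child.send(message, handle) and the 'message' event on the receiving process.

```javascript
// Parent side: pass each net.Socket to another process over the IPC channel.
// `child` is assumed to be a ChildProcess created with child_process.fork().
function handOffSockets(child, sockets) {
  let sent = 0;
  for (const socket of sockets) {
    // The second argument to send() is the handle being transferred.
    child.send({ type: 'socket' }, socket);
    sent += 1;
  }
  return sent;
}

// Receiving side: `proc` would normally be the global `process` of the forked child.
function adoptSockets(proc, onSocket) {
  proc.on('message', (message, socket) => {
    // The handle can be undefined if the connection closed during the transfer.
    if (message && message.type === 'socket' && socket) {
      onSocket(socket);
    }
  });
}
```

In the receiving process you would call something like adoptSockets(process, (socket) => { /* resume serving this client */ }). Any per-connection state held in memory by the old process would still have to be transferred separately.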
If not, the usual work-around is to have the clients automatically reconnect when their connection is closed. Done properly, this can be fairly transparent to the client (except for the brief window when the server is not running).
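For illustration, a client-side reconnect loop might look roughly like this. It is only a sketch: createSocket is a hypothetical factory that returns a WebSocket-like object (with onopen, onmessage and onclose), and the 500 ms base delay and 30 s cap are arbitrary assumptions.

```javascript
// Exponential backoff with a cap: 500 ms, 1 s, 2 s, ... up to maxMs.
function reconnectDelay(attempt, baseMs = 500, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// createSocket is a hypothetical factory, e.g. () => new WebSocket(url).
function connectWithRetry(createSocket, onMessage, attempt = 0) {
  const socket = createSocket();
  socket.onopen = () => {
    attempt = 0; // a successful connection resets the backoff
  };
  socket.onmessage = onMessage;
  socket.onclose = () => {
    // Wait a little longer after each failed attempt before reconnecting.
    setTimeout(
      () => connectWithRetry(createSocket, onMessage, attempt + 1),
      reconnectDelay(attempt)
    );
  };
  return socket;
}
```

The backoff matters during a restart: without it, every client would hammer the server the moment it goes down and again the moment it comes back up.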
Use Redis or another in-memory database to store connection and session state so that clients can easily reconnect after a server restart without losing their sessions. Try this if it suits your needs. Note that connections may still drop during the restart, but because the state is persisted, clients can reconnect very easily.
socket.io-redis
Hi I'm running a standard (example) socket.io chatroom, but I'm running into a problem I'm not sure how to debug.
The chatroom seems to be functioning normally and clients can broadcast their messages, but occasionally on connection it is as if they are alone in the chatroom when they are not -- other clients don't see their presence or messages. It frequently happens when clients are not joining the socket around the same time.
It is as if they've connected to an entirely different socket.
I think it might be something to do with cookies and sessions. If the clients clear their sessions they are reunited in the chat.
Perhaps on (or before) connection I could clear session data? How?
There is no requirement for a chat server that clients connect on the same IP and port. Typically, there is a requirement that they connect to the same server, which must maintain a list of client connections to enable chat between them.
Chat works like this:
Server sets up a ServerSocket to accept connections. Clients connect, and these connections are stored on the Server in an array, object or some other form. When the server gets a message event from one of the clients, this message is then broadcast to all the other clients.
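As a minimal sketch of that pattern (the ChatRoom class is made up for illustration; a client is assumed to be anything with a write() method, such as a net.Socket):

```javascript
// Keeps the collection of connected clients and relays messages between them.
class ChatRoom {
  constructor() {
    this.clients = new Set();
  }

  join(client) {
    this.clients.add(client);
  }

  leave(client) {
    this.clients.delete(client);
  }

  // Send text to every connected client except the sender.
  broadcast(sender, text) {
    let delivered = 0;
    for (const client of this.clients) {
      if (client !== sender) {
        client.write(text);
        delivered += 1;
      }
    }
    return delivered;
  }
}
```

With net.createServer you would call join() when a socket connects, broadcast() on its 'data' event and leave() on 'close'. Note that two separate ChatRoom instances (or two separate server processes) never see each other's clients.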
Thus, if you have one client who is not receiving any messages and appears to be in an empty room, the issue is likely that they are somehow not part of the same collection of connected clients, part of the same chat app, or not connected at all.
Okay I think I figured it out, I was right and wrong.
I think the clients were connecting to 'entirely different sockets' but it had nothing much to do with cookies and sessions:
I discovered (due to some other really weird bugs) by a study of running processes that somehow an old version of the socket.io server script was clinging to life in the background for some time. I expect clients were connecting to one of the two io server scripts randomly: not good. Working in a sense, but in separate worlds.
Killing those rogue processes seems to have fixed a lot of stuff.
I have a few questions about the combination of Nginx and Nodejs.
I've used Nodejs to create my server and now I'm facing an issue with how the server handles certain actions (writing, removing, etc.).
We are using Redis to lock the server when there are requests to the server, for example if a new user is doing a sign up action all the rest of the requests are waiting until the process is done, or if there is another process (longer one) all the other requests will wait longer.
We thought about creating a Load balancer (using Nginx) that will check if the server is locked, and if the server is locked it will open a new task and won't wait until the first process is done.
I used this tutorial and created a dummy server, but then I struggled with how to implement the part that opens new ports.
I'm new to load balancer implementation and would be happy to hear your thoughts and get some help.
Thank you.
The gist of it is that your server needs to not crash if more than one connection attempt is made to it. Even if you use NGINX as a load balancer and have five different instances of your server running...what happens when six clients try to access your app at once?
I think you are thinking about load balancers slightly wrong. There are different load balancing methods, but the simplest one to think about is "round robin" in which each connection gets forwarded to the next server in the list (the rest are just more robust and complicated versions of this one). When there are no more servers to forward to, the next connection gets forwarded to the first server again (whether or not it is done with its last connection) and the circle starts over. Thus, load balancers aren't supposed to manage "unique connections" from clients...they are supposed to distribute connections among servers.
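To make the round-robin idea concrete, here is a tiny sketch (createRoundRobin and the server names are made up for illustration; a real load balancer such as NGINX does this for you):

```javascript
// Returns a function that hands out servers in order, wrapping around at the end.
function createRoundRobin(servers) {
  let index = 0;
  return function next() {
    const server = servers[index];
    index = (index + 1) % servers.length;
    return server;
  };
}

// Hypothetical usage: the fourth connection goes back to the first server,
// whether or not that server has finished with its previous connection.
const nextServer = createRoundRobin(['app1:3000', 'app2:3000', 'app3:3000']);
const assignments = [nextServer(), nextServer(), nextServer(), nextServer()];
```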
Your server doesn't necessarily need to accept connections and handle them all at once. But it needs to at least allow connections to queue up without crashing, and then accept and deal with each one by one.
You can go the route you are discussing. That is, you can fire up a unique instance of your server...via Heroku or other...for every single connection that is made to your app. But this is not efficient and will ultimately create more work for you in trying to architect a system that can do that well. Why not just fix your server?
I'm new to Web Sockets in general, but get the main concept.
I am trying to build a simple multiplayer game and would like to have a server selection where I can run sockets on multiple IPs and connect each client through one of them, spreading connections out in order to improve performance. This is hypothetical, for the case of there being thousands of players at once, but I would like some insight into how this would work and whether there are any resources I can use to integrate this beforehand, to avoid extra work at a later date. Is this at all possible? As I understand it, Node.js runs on a server and uses the Socket.io dependencies to create sockets within that, so I can't think of a way to route it through another server unless I had multiple sites running it separately.
The first question I have is this:
Are you hosting on AWS or in a local datacenter?
The reason I ask is that socket.io requires sticky sessions to work properly across multiple servers. socket.io will attempt to upgrade each connection, and that upgrade request must reach the original server that authorized the session, so you'll need to route WebSocket (TCP) connections back to that original server via sticky sessions. Unfortunately, AWS makes this extremely tricky and will require you to learn how to:
A) Modify elastic load balancer policies to forward protocol information
B) Split apart TCP connections from standard web requests using something like HAProxy or NGINX. This is necessary in order to handle WebSocket UPGRADE requests properly, as you will be setting TCP to sticky and web requests to round-robin.
C) Attach your socket.io configuration to a common storage source, like Redis (ElastiCache).
Once you've figured out what's needed for AWS (or if you've got full control over request routing at your local datacenter), you'll want to architect your socket application to use multicast rooms rather than direct socket messaging.
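The core of "sticky" routing is just a deterministic mapping from a client to a backend, so the same client always lands on the same server. A hedged sketch of the idea, hashing the client IP (pickBackend and the backend names are hypothetical; in practice you would configure this in HAProxy or NGINX rather than write it yourself):

```javascript
const crypto = require('node:crypto');

// Map a client IP to one backend, always the same one for the same IP.
function pickBackend(clientIp, backends) {
  const digest = crypto.createHash('sha1').update(clientIp).digest();
  // Use the first 4 bytes of the hash as an unsigned integer.
  return backends[digest.readUInt32BE(0) % backends.length];
}
```

One caveat with this simple scheme: changing the number of backends remaps most clients, so existing sessions can land on a different server after scaling up or down.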
Example:
To send a message to users in game #4444, emit a message to room 'games:4444', rather than direct to the user's socket.
If your socket instance is configured to use Redis, Redis will automatically take care of maintaining lists of people who are connected to your 'games:4444' channel. Otherwise you'll need to maintain the list yourself using a database or other shared mechanism.
Other than that, there are plenty of resources online that can help you figure out each step along the way. I'd start with understanding something like HAProxy and how it can help split apart your sockets from your web requests.
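If you do end up maintaining the list yourself, the bookkeeping looks roughly like the sketch below. RoomRegistry is a made-up in-memory stand-in, not socket.io's API, and it only works within a single process; across servers the same data would have to live in a shared store. In socket.io itself the equivalent calls are socket.join('games:4444') and io.to('games:4444').emit(event, payload).

```javascript
// Tracks which sockets belong to which room and emits to a whole room at once.
class RoomRegistry {
  constructor() {
    this.rooms = new Map(); // room name -> Set of sockets
  }

  join(room, socket) {
    if (!this.rooms.has(room)) {
      this.rooms.set(room, new Set());
    }
    this.rooms.get(room).add(socket);
  }

  leave(room, socket) {
    const members = this.rooms.get(room);
    if (!members) return;
    members.delete(socket);
    if (members.size === 0) {
      this.rooms.delete(room); // drop empty rooms
    }
  }

  // Emit to everyone in the room instead of addressing sockets one by one.
  emit(room, event, payload) {
    const members = this.rooms.get(room) || new Set();
    for (const socket of members) {
      socket.emit(event, payload);
    }
    return members.size;
  }
}
```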
I've a web sockets based chat application (HTML5).
Browser opens a socket connection to a java based web sockets server over wss.
When browser connects to server directly (without any proxy) everything works well.
But when the browser is behind an enterprise proxy, the browser's socket connection closes automatically after approximately 2 minutes of inactivity.
Browser console shows "Socket closed".
In my test environment I have a Squid-Dansguardian proxy server.
Important: this behaviour is not observed if the browser is connected without any proxy.
To keep some activity going, I embedded a simple jQuery script which makes an HTTP GET request to another server every 60 seconds. But it did not help. I still get "Socket closed" in my browser console after about 2 minutes of no activity.
Any help or pointers are welcome.
Thanks
This seems to me to be a feature, not a bug.
In production applications there is an issue related to what are known as "half-open" sockets - see this great blog post about it.
It happens that connections are lost abruptly, causing the TCP/IP connection to drop without informing the other party to the connection. This can happen for many different reasons - wifi signals or cellular signals are lost, routers crash, modems disconnect, batteries die, power outages...
The only way to detect if the socket is actually open is to try and send data... BUT, your proxy might not be able to safely send data without interfering with your application's logic*.
After two minutes, your proxy assumes that the connection was lost and closes the socket on its end to save resources and allow new connections to be established.
If your proxy didn't take this precaution, on a long enough timeline all your available resources would be taken by dropped connections that would never close, preventing access to your application.
Two minutes is a lot. Heroku's router, for example, closes idle connections after 55 seconds (more reasonable). For HTTP connections, these timeouts are often much shorter.
The best option for you is to keep sending websocket data within the 2 minute timeframe.
The Websocket protocol resolves this issue by implementing an internal ping mechanism - use it. These pings should be sent by the server and the browser responds to them with a pong directly (without involving the javascript application).
The Javascript API (at least on the browser) doesn't let you send ping frames (it's a security thing I guess, that prevents people from using browsers for DoS attacks).
A common practice among some developers (which I think is misguided) is to implement a JSON ping message that is either ignored by the server or results in a JSON pong.
Since you are using Java on the server, you have access to the Ping mechanism and I suggest you implement it.
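To show what a protocol-level ping actually is on the wire, here is a small sketch in JavaScript (your Java WebSocket library should expose its own ping call, which is what you'd really use). The function names and the 30-second interval are assumptions; `socket` is assumed to be the raw TCP socket of a connection that has already completed the WebSocket handshake.

```javascript
// Build an unmasked WebSocket ping frame as sent from server to client (RFC 6455).
function buildPingFrame(payload = '') {
  const body = Buffer.from(payload);
  if (body.length > 125) {
    throw new RangeError('control frame payload must be 125 bytes or less');
  }
  // 0x89 = FIN bit (0x80) + opcode 0x9 (ping). Server frames are not masked,
  // so the second byte is simply the payload length.
  return Buffer.concat([Buffer.from([0x89, body.length]), body]);
}

// Send a ping on a fixed interval, well inside the proxy's idle timeout.
function startHeartbeat(socket, intervalMs = 30000) {
  const timer = setInterval(() => socket.write(buildPingFrame()), intervalMs);
  socket.once('close', () => clearInterval(timer));
  return timer;
}
```

The browser answers each ping with a pong on its own, so every interval produces traffic in both directions through the proxy without any JavaScript on the page.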
I would also recommend (if you have control of the Proxy) that you lower the timeout to a more reasonable 50 seconds limit.
* The situation during production is actually even worse...
Because there is a long chain of intermediaries (home router/modem, NAT, ISP, Gateways, Routers, Load Balancers, Proxies...) it's very likely that your application can send data successfully because it's still "connected" to one of the intermediaries.
This should start a chain reaction that will only reach the application after a while, and again ONLY if it attempts to send data.
This is why Ping frames expect Pong frames to be returned (meaning the chain of connection is intact).
P.S.
You should probably also complain about the Java application not closing the connection after a certain timeout. During production, this oversight might force you to restart your server every so often or experience a DoS situation (all available file handles will be used for the inactive old connections and you won't have room for new connections).
Check squid.conf for the request_timeout value and raise it if needed. Note that this will affect more than just web sockets. For instance, in an environment I frequently work in, a Perl script is hit to generate various configurations, and execution can take upwards of 5-10 minutes to complete. The timeout value on both our httpd and the Squid server had to be raised to compensate for this.
Also look at the connect_timeout value, which defaults to one minute.
I just started learning Node.js and as I was learning about the fs.watchFile() method, I was wondering if a chat website could be efficiently built with it (and fs.writeFile()), against for example Socket.IO which is stable, but I believe not 100% stable (several fallbacks, including flash).
Using fs.watchFile could perhaps also be used to keep histories of the chats quite simply (as JSON would be used on the spot).
The chat files could be formatted in JSON in such a way that only the last chatter's message is brought up to the DOM (or whatever to make it efficient to 'fetch' messages when the file gets updated).
I haven't tried it yet as I still need to learn more about Node, and even more to be able to compare it with Socket.IO, but what's your opinion about it? Could it be an efficient/stable way of doing chats?
fs.watchFile() can be used to watch changes to a file in the local filesystem (on the server). This will not solve your need to update all clients' chat messages in their browsers. You'll still need web sockets, AJAX or Flash for that (or socket.io, which handles all of those).
What you could typically do in the client is try to use Web Sockets. If the browser does not support them, try to use XMLHttpRequest. If that fails, fall back to Flash. It's a lot of programming to do, and it has to be handled by the node.js server as well. Socket.io does that for you.
Also, socket.io is pretty stable. The fallback to Flash is not due to its instability but due to the lack of browser support for better solutions (like Web Sockets).
Storing chat files in flat-file JSON is not a good idea, because if you are going to manipulate the files, you would have to parse and serialize entire JSON objects, which would become very slow as the size of the JSON object increased. The watch methods of the filesystem module also don't work on all operating systems.
You also can't compare Node.js to Socket.IO because they are entirely different things. Socket.IO is a Node module for realtime transport between the browser and the server. What you need depends on what you're doing. If you need chat history, then you should be using a database such as MongoDB or MySQL. Watching files for changes is not an efficient approach; you should just send messages as they are received.
In conclusion, no: using fs.watchFile() and fs.writeFile() is a very bad idea, because race conditions would occur due to concurrent file writes, and besides that, fs.watchFile() uses polling to check whether a file has changed. You should instead use Socket.IO and push messages to other clients / store them in a database as they are received.
You can use the long polling method, using JavaScript's setTimeout and setInterval.
long polling
Basically, long polling works on Ajax requests and server response time.
The server responds after a certain time (say, 50 seconds) if there is no notification or message; otherwise it responds with the data right away. On the client side, when the client gets a response, the client JavaScript makes another request for new updates and waits for the response. This process continues for as long as the server is running.
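A minimal sketch of the server half of that loop, assuming one process and in-memory state. LongPollHub is a made-up name, `respond` is whatever callback finishes the pending HTTP response, and the 50-second default simply mirrors the figure above.

```javascript
// Holds pending long-poll requests until a message arrives or they time out.
class LongPollHub {
  constructor() {
    this.waiting = new Set();
  }

  // Park a request; respond(null) means "nothing new, ask again".
  wait(respond, timeoutMs = 50000) {
    const entry = { respond, timer: null };
    entry.timer = setTimeout(() => {
      this.waiting.delete(entry);
      respond(null);
    }, timeoutMs);
    this.waiting.add(entry);
  }

  // Answer every parked request with the new message.
  publish(message) {
    const entries = [...this.waiting];
    this.waiting.clear();
    for (const entry of entries) {
      clearTimeout(entry.timer);
      entry.respond(message);
    }
    return entries.length;
  }
}
```

Inside an http request handler you might call hub.wait((msg) => res.end(JSON.stringify({ message: msg }))), and the client simply issues the next request as soon as each response comes back.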