I'm working on a node application that monitors users' online status. It uses socket.io to update the online status of the users we "observe" (as in, the users we are aware of on the page we're at). What I would like to introduce now is an idle status, which would basically mean that after X time of inactivity (as in, no request) the status changes from online to idle.
I do monitor all the sockets, so I know when a connection was made, and I thought of using that.
My idea is to use setTimeout on every connection for this particular purpose (clearing the previous one if it exists), and in the setTimeout callback I would simply change the user's status to idle and emit that status change to the observers.
What I'm concerned about is the performance and scalability of setting and clearing a timeout on every connection. So the question is: are there any issues, in terms of the two above, with such an approach? Is there a better way of doing it, perhaps a library that is better at handling such things?
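For illustration, a minimal sketch of what I have in mind (helper names like setUserStatus and the observer room naming are just placeholders, not part of any library):

// assumes an existing socket.io server instance named io
const IDLE_AFTER_MS = 60 * 1000;   // X time of inactivity
const idleTimers = new Map();      // userId -> timeout handle

io.on('connection', (socket) => {
  const userId = socket.handshake.query.userId; // however you identify users

  function markActive() {
    // clear the previous timer for this user, if any, then re-arm it
    clearTimeout(idleTimers.get(userId));
    idleTimers.set(userId, setTimeout(() => {
      setUserStatus(userId, 'idle'); // hypothetical helper that persists status
      io.to('observers-of-' + userId).emit('status', { userId, status: 'idle' });
    }, IDLE_AFTER_MS));
    setUserStatus(userId, 'online');
  }

  markActive();              // the connection itself counts as activity
  socket.onAny(markActive);  // any incoming event counts as activity (socket.io v3+)

  socket.on('disconnect', () => {
    clearTimeout(idleTimers.get(userId));
    idleTimers.delete(userId);
  });
});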
Related
I have implemented pubnub to create a socket connection for receiving real-time messages.
One thing I noticed in my developer tools is that the pubnub heartbeat request shows as pending for a particular interval, mostly between 4.3 and 5 minutes.
After going through their documentation, I realised the timeout can be modified and that the default value is 320 seconds. After implementing this feature on my website I noticed some lag, and I am not sure whether it is pubnub that is causing the issue.
Please help me understand the idea behind the pending state. Also, does it have an impact on memory? If yes, how is the impact related to an increase or decrease in the heartbeat interval?
FYI, my pubnub settings only consist of publisher key, subscriber key, uuid and ssl (true)
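For reference, that is roughly equivalent to the following init with the PubNub JavaScript SDK (key values are placeholders):

const PubNub = require('pubnub');

const pubnub = new PubNub({
  publishKey: 'pub-c-...',    // publisher key
  subscribeKey: 'sub-c-...',  // subscriber key
  uuid: 'client-uuid',        // unique id for this client
  ssl: true
});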
PubNub Subscribe Connection and Long Poll Cycle
You are seeing the heartbeat query param but that is not the "presence heartbeat" API. That is the subscribe long poll connection which will remain open until:
a message is published on one of the channels you are subscribed to
or, if no messages were published on any of the subscribed channels within 280s, the connection is closed (200 response with no messages) and the SDK will open a new subscribe connection.
PENDING Connection
PENDING just means the subscribe connection is Open and waiting for messages to be published. This is expected.
I highly recommend that you do not change this value unless there is a good reason. Did you make it longer or shorter?
Shorter long poll has little value and practically no harm, technically speaking, but will result in more subscribe/edge transactions.
Longer long poll has an actual technical downside in that your client will disconnect after the 280s expiration but will not reconnect until the end of the new custom expiration time that you set for the client.
The only time you should set the value shorter is if you have an ISP that proactively closes "idle" (pending) connections quicker than 280s. This is very rare but it does happen.
And you will likely see that the subscribe connection gets CANCELED. This happens when the client app changes its channel subscription list: subscribe to a new channel or unsubscribe from an existing channel.
No Impact on Memory
But you are asking if there is some sort of impact on memory. The answer to that is: it should NOT have a negative impact. If you follow Nicolas Fodor's answer/advice, you might be able to confirm that, but thousands of customers in, we have not had any memory issues with our JavaScript SDK related to this. Just be sure you are using the latest version of our SDKs and report any bugs/issues you find to PubNub Support with full details.
Presence Heartbeat
One more thing about the heartbeat query param value - it typically defaults to 300 (seconds) which is only important when you are using PubNub Presence. If the PubNub server doesn't hear from a client within that 300 second (or whatever it is set to) period, a presence timeout event, on behalf of that client, is sent to anyone listening for presence events. A timeout is like a delayed leave event.
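If you are using Presence, those timeout events can be observed with a listener like this (a minimal sketch with the v4 JavaScript SDK; the channel name is a placeholder):

// subscribe with presence and listen for join/leave/timeout events
pubnub.subscribe({ channels: ['my-channel'], withPresence: true });

pubnub.addListener({
  presence: (event) => {
    // event.action is 'join', 'leave', 'timeout' or 'state-change'
    console.log(event.action, event.uuid, 'occupancy:', event.occupancy);
  }
});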
See also:
Connection Management Docs
Detect and Manage Presence Events
A simple way to find out would be to check performance under load testing before and after the parameter change, without changing any other parameter. If a cause is established, you can then vary the parameter value to assess the elasticity of the side effect.
I discovered SSE (Server-Sent Events) pretty late, and I can't seem to figure out the use cases where it would be more efficient than using setInterval() and ajax.
I guess that if we had to update the data multiple times per second, then having one single connection would produce less overhead. But apart from this case, when would one really choose SSE?
I was thinking of this scenario:
A new user comment from the website is added in the database
Server periodically queries DB for changes. If it finds new comment, send notification to client with SSE
Also, this SSE question came to mind after having to make a simple "live" website change (when someone posts a comment, notify everybody who is on the site). Is there really another way of doing this without periodically querying the database?
Nowadays web technologies are used to implement all sorts of applications, including those which need to fetch constant updates from the server.
As an example, imagine having a graph in your web page which displays real-time data. Your page must refresh the graph any time there is new data to display.
Before Server-Sent Events, the only way to obtain new data from the server was to perform a new request every time.
Polling
As you pointed out in the question, one way to look for updates is to use setInterval() and an ajax request. With this technique, our client will perform a request once every X seconds, no matter if there is new data or not. This technique is known as polling.
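For example, a minimal client-side polling sketch might look like this (the /comments endpoint and renderComment helper are placeholders):

// poll the server every 10 seconds, whether or not anything changed
let lastSeenId = 0;
setInterval(async () => {
  const response = await fetch('/comments?since=' + lastSeenId); // placeholder endpoint
  const comments = await response.json();
  if (comments.length > 0) {
    lastSeenId = comments[comments.length - 1].id;
    comments.forEach(renderComment); // hypothetical render helper
  }
}, 10000);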
Events
Server-Sent Events, on the contrary, are asynchronous. The server itself notifies the client when there is new data available.
In the scenario of your example, you would implement SSE in such a way that the server sends an event immediately after adding the new comment, rather than by polling the DB.
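A minimal sketch of that, using Node's built-in http module on the server and EventSource in the browser (URLs, event names and helpers are placeholders):

// server.js -- keep an SSE stream open and push an event when a comment is added
const http = require('http');
const clients = new Set();

http.createServer((req, res) => {
  if (req.url === '/events') {
    res.writeHead(200, {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      'Connection': 'keep-alive'
    });
    clients.add(res);
    req.on('close', () => clients.delete(res));
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(3000);

// call this right after inserting the comment into the database
function notifyNewComment(comment) {
  for (const res of clients) {
    res.write('event: comment\n');
    res.write('data: ' + JSON.stringify(comment) + '\n\n');
  }
}

// client side:
// const source = new EventSource('/events');
// source.addEventListener('comment', (e) => renderComment(JSON.parse(e.data)));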
Comparison
Now the question may be when is it advisable to use polling vs SSE. Aside from compatibility issues (not all browsers support SSE, although there are some polyfills which essentially emulate SSE via polling), you should focus on the frequency and regularity of the updates.
If you are uncertain about the frequency of the updates (how often new data should be available), SSE may be the solution because they avoid all the extra requests that polling would perform.
However, it is wrong to say in general that SSE produces less overhead than polling. That is because SSE requires an open TCP connection to work. This essentially means that some resources on the server (e.g. a worker and a network socket) are allocated to one client until the connection is over. With polling, instead, the connection may be reset after the request is answered.
Therefore, I would not recommend using SSE if the average number of connected clients is high, because this could create some overhead on the server.
In general, I advise using SSE only if your application requires real-time updates. As a real-life example, I developed data acquisition software in the past and had to provide a web interface for it. In this case, a lot of graphs were updated every time a new data point was collected. That was a good fit for SSE because the number of connected clients was low (essentially, only one), the user interface had to update in real time, and the server was not flooded with requests as it would have been with polling.
Many applications do not require real-time updates, and thus it is perfectly acceptable to display the updates with some delay. In this case, polling with a long interval may be viable.
I am implementing multiplayer turn based game using Node.js for server side scripting.
The game is just like Monopoly: there are multiple rooms, and a single room has multiple players.
I want to implement a timer for all rooms. I have gone through many articles, and I have two options, as follows:
1} I emit the current time to each player once; they run the timer themselves and emit back to the server when the turn is taken or the time is up.
2} I manage the timer on the server and emit every second, but that will create load on the server. I am also confused: since Node.js is single threaded, how will I manage multiple setInterval() calls for multiple rooms? They will add to the queue and create latency.
So please advise me on the best option.
A hybrid approach may suit this problem best. You could start by having the clients receive the current value of the timer from the server. Subsequently, the clients can run the timer down independently, without asking the server for the current time. If accuracy is important to this project, you can query the server every 15 seconds or so to adjust for time drift that may cause discrepancies between the server and its clients.
Also, note that even though Node.js is single threaded it is nonetheless inherently asynchronous.
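A rough sketch of that hybrid approach with socket.io (event names are placeholders, and it assumes a rooms object keyed by room id with a turnEndsAt timestamp):

// server: one authoritative end time per room; clients count down on their own
io.on('connection', (socket) => {
  socket.on('joinRoom', (roomId) => {
    socket.join(roomId);
    // send the authoritative remaining time once on join
    socket.emit('timerSync', { remainingMs: rooms[roomId].turnEndsAt - Date.now() });
  });
});

// optional drift correction broadcast every 15 seconds per room
setInterval(() => {
  for (const [roomId, room] of Object.entries(rooms)) {
    io.to(roomId).emit('timerSync', { remainingMs: room.turnEndsAt - Date.now() });
  }
}, 15000);

// client: run the countdown locally and re-anchor whenever a sync arrives, e.g.
// socket.on('timerSync', ({ remainingMs }) => startLocalCountdown(remainingMs));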
I was thinking about extending the functionality of the node.js server running Socket.io that I am currently using to let a client (an iOS app) communicate with the server, so that it could have persistent session data between connections.
On initial connection the server passes the client a session id, which the client stores and passes back to the server if it reconnects after disconnecting. This would allow the client to resume its session without having to re-provide the server with certain information about its current state (obviously the actual implementation will be more secure than this).
I want to make it so that the session eventually expires: it has a max lifetime, or it times out if it hasn't been continued after a certain time. To do this I was thinking of using a timer for each session. I'm not actually sure how node.js or javascript timers (setTimeout) work in the background, and I am concerned that having thousands of session timers could lead to a lot of memory/CPU usage. Could this be a potential issue? Should I instead have a garbage collector that cycles every minute or so and deletes expired session data? What is the most optimal way, in terms of least impact on performance, to accomplish this, or are timers already exactly that?
Timers are used frequently for timeouts, and are very efficient in terms of CPU.
// ten_thousand_timeouts.js
for (var i = 0; i <= 10000; i++) {
  (function (i) {
    setTimeout(function () {
      console.log(i);
    }, 1000);
  })(i);
}
With 10,000 timeouts, the logging took only .336 seconds beyond the 1-second delay, and the act of logging to the console took most of that time.
//bash script
$> time node ten_thousand_timeouts.js
1
...
9999
10000
real 0m1.336s
user 0m0.275s
sys 0m0.146s
I cannot imagine this being an issue for your use case.
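Applied to your session case, a minimal sketch (assuming an in-memory sessions Map; names are placeholders) could be:

const SESSION_TTL_MS = 30 * 60 * 1000;  // idle lifetime before the session expires
const sessions = new Map();             // sessionId -> { data, expiryTimer }

function touchSession(sessionId) {
  const session = sessions.get(sessionId);
  if (!session) return;
  clearTimeout(session.expiryTimer);    // reset the timer on every activity
  session.expiryTimer = setTimeout(() => {
    sessions.delete(sessionId);         // drop the expired session data
  }, SESSION_TTL_MS);
}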
I've been looking for a simpler way than Comet or Long-Polling to push some very basic ajax updates to the browser.
In my research, I've seen that people do in fact use Javascript timers to send Ajax calls at set intervals. Is this a bad approach? It almost seems too easy. Also consider that the updates I'll be sending are not critical data, but they will be monitoring a process that may run for several hours.
As an example - Is it reliable to use this design to send an ajax call every 10 seconds for 3 hours?
Thanks, Brian
Generally, using timers to update content on a page via Ajax is at least as robust as relying on a long-lived stream connection like Comet. Firewalls, short DHCP leases, etc., can all interrupt a persistent connection, but polling will re-establish a client connection on each request.
The trade-off is that polling often requires more resources on the server. Even a handful of clients polling for updates every 10 seconds can put a lot more load on your server than normal interactive users, who are more likely to load new pages only every few minutes, and will spend less time doing so before moving to another site. As one data point, a simple Sinatra/Ajax toy application I wrote last year had 3-5 unique visitors per day to the normal "text" pages, but its Ajax callback URL quickly became the most-requested portion of any site on the server, including several sites with an order of magnitude (or more) higher traffic.
One way to minimize load due to polling is to separate the Ajax callback server code from the general site code, if at all possible, and run it in its own application server process. That "service middleware" process can handle polling callbacks, rather than tying up a server thread/Apache listener/etc. for what effectively amounts to a question of "are we there yet?"
Of course, if you only expect to have a small number (say, under 10) users using the poll service at a time, go ahead and start out running it in the same server process.
One thing that might be useful here: polling at an unchanging interval is simple, but it is often unnecessary or undesirable.
One method that I've been experimenting with lately is having positive and negative feedback on the poll. Essentially, an update is either active (changes happened) or passive (no newer changes were available, so none were needed). Updates that are passive increase the polling interval. Updates that are active set the polling interval back to the baseline value.
So for example, on this chat that I'm working on, different users post messages. The polling interval starts off at a short 5 seconds. If other site users are chatting, you get updated about it every 5 seconds. If activity slows down and no one has chatted since the latest message was displayed, the polling interval gets slower by about a second each time, eventually capping at once every 3 minutes. If, an hour later, someone sends a chat message again, the polling interval suddenly drops back to 5-second updates and starts slowing again.
High activity -> frequent polling. Low activity -> eventually very infrequent polling.
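A small sketch of that backoff scheme (the endpoint, interval values and renderMessage helper are placeholders):

// adaptive polling: speed up when updates arrive, slow down when they don't
const BASE_INTERVAL = 5000;     // 5 seconds
const MAX_INTERVAL = 180000;    // 3 minutes
let interval = BASE_INTERVAL;

async function poll() {
  const response = await fetch('/messages/latest'); // placeholder endpoint
  const messages = await response.json();
  if (messages.length > 0) {
    messages.forEach(renderMessage); // hypothetical render helper
    interval = BASE_INTERVAL;        // active update: reset to baseline
  } else {
    interval = Math.min(interval + 1000, MAX_INTERVAL); // passive: back off by a second
  }
  setTimeout(poll, interval);
}

poll();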