I've been looking for a simpler way than Comet or Long-Polling to push some very basic ajax updates to the browser.
In my research, I've seen that people do in fact use Javascript timers to send Ajax calls at set intervals. Is this a bad approach? It almost seems too easy. Also consider that the updates I'll be sending are not critical data, but they will be monitoring a process that may run for several hours.
As an example - Is it reliable to use this design to send an ajax call every 10 seconds for 3 hours?
Thanks, Brian
Generally, using timers to update content on a page via Ajax is at least as robust as relying on a long-lived stream connection like Comet. Firewalls, short DHCP leases, etc., can all interrupt a persistent connection, but polling will re-establish a client connection on each request.
The trade-off is that polling often requires more resources on the server. Even a handful of clients polling for updates every 10 seconds can put a lot more load on your server than normal interactive users, who are more likely to load new pages only every few minutes, and will spend less time doing so before moving to another site. As one data point, a simple Sinatra/Ajax toy application I wrote last year had 3-5 unique visitors per day to the normal "text" pages, but its Ajax callback URL quickly became the most-requested portion of any site on the server, including several sites with an order of magnitude (or more) higher traffic.
One way to minimize load due to polling is to separate the Ajax callback server code from the general site code, if at all possible, and run it in its own application server process. That separate service can handle polling callbacks, rather than tying up a server thread, Apache listener, etc. for what effectively amounts to a question of "are we there yet?"
Of course, if you only expect to have a small number of users (say, under 10) using the poll service at a time, go ahead and start out running it in the same server process.
One thing that might be useful here: polling at an unchanging interval is simple, but it is often unnecessary or undesirable.
One method that I've been experimenting with lately is having positive and negative feedback on the poll. Essentially, an update is either active (changes happened) or passive (no newer changes were available, so none were needed). Updates that are passive increase the polling interval. Updates that are active set the polling interval back to the baseline value.
So for example, on this chat that I'm working on, different users post messages. The polling interval starts off at its baseline (most frequent) value of 5 seconds. If other site users are chatting, you get updated about it every 5 seconds. If activity slows down and no one has chatted since the latest message was displayed, the polling interval grows by about a second each time, eventually capping at once every 3 minutes. If, an hour later, someone sends a chat message again, the polling interval drops back to 5 seconds and starts slowing down again from there.
High activity -> frequent polling. Low activity -> eventually very infrequent polling.
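Here is a minimal sketch of that feedback rule, in case it helps. The constant names and the `nextInterval` helper are my own, and the numbers are simply the ones from the chat example above (5 second baseline, roughly 1 second slower per passive update, capped at 3 minutes).

```javascript
// Hypothetical constants taken from the chat example above.
const BASE_INTERVAL_MS = 5000;   // baseline: poll every 5 seconds
const MAX_INTERVAL_MS = 180000;  // cap: once every 3 minutes
const STEP_MS = 1000;            // slow down by about a second per passive update

// Returns how long to wait before the next poll.
// hadChanges = true  -> "active" update, reset to the baseline.
// hadChanges = false -> "passive" update, back off a little, up to the cap.
function nextInterval(currentMs, hadChanges) {
  if (hadChanges) return BASE_INTERVAL_MS;
  return Math.min(currentMs + STEP_MS, MAX_INTERVAL_MS);
}

// Example: three quiet polls in a row, then a new message arrives.
let interval = BASE_INTERVAL_MS;
interval = nextInterval(interval, false); // 6000
interval = nextInterval(interval, false); // 7000
interval = nextInterval(interval, false); // 8000
interval = nextInterval(interval, true);  // back to 5000
```

Since the delay changes after every response, this pairs more naturally with a fresh `setTimeout` after each poll than with a fixed `setInterval`.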
I have a website up and running. The website worked fine on localhost with no such errors but after I put it online it started showing 507 insufficient storage page whenever two or three users used the same page at the same time.
For example there is a webpage chat.php which runs an ajax request to update the chats every 700 milliseconds. Side by side two ajax requests keep checking for messages and notifications. These ajax requests are completed using javascript's setInterval method. When this page is accessed concurrently by two or more users the page does not load and shows the error and sometimes the page shows 429 too many requests error. So at the same time maximum 4 requests can occur at the user's end and that too if the scripts run at the same time. Could this occur because of the limited entry processes? The hosting provides me with 10 limited entry processes by default. Please reply and leave a comment if you want me to post the setInterval method code even though I think the problem is something else here.
For example there is a webpage chat.php which runs an ajax request to update the chats every 700 milliseconds.
These ajax requests are completed using javascript's setInterval method.
When this page is accessed concurrently by two or more users the page does not load and shows the error and sometimes the page shows 429 too many requests error.
So at the same time maximum 4 requests can occur at the user's end and that too if the scripts run at the same time.
The hosting provides me with 10 limited entry processes by default.
Please take some time to read through (your own) quotes.
You state that you send an AJAX request to the server every 700ms, and that you do so using setInterval. There is a maximum of 4 requests per user and 10 in total. If there are 2 or more visitors, things go haywire.
I think multiple things may be causing issues here:
You hit the 10 requests limit because of multiple users.
When 2 users each have 4 requests open you're at 8; if anything else makes a request to the server, you very quickly hit the maximum of 10. With 3 users at 4 requests each you're at 12, which according to your question exceeds your limit.
You might be DOSsing your own servers.
Using setInterval to do AJAX requests is bad. Really bad. The problem is that if you send a request every 700ms and the server needs more than 700ms to respond, you'll be stacking up requests. You will eventually hit whatever the limit is with just one user (although in certain cases the browser might protect you).
How to fix
I think 10 connections (if it's actually 10 connections, which is unclear to me) is very low. However, you should refactor your code to avoid using setInterval. You should use something like a Promise to keep track of when a request ends before scheduling the next one, so that requests cannot pile up. As a rule of thumb, never use setInterval unless you have a very good reason to do so. It's almost always better to use some more intelligent scheduling.
You also might want to look into being more efficient with those requests, can you merge the call to check for messages and notifications?
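To illustrate the "wait for the request to end before scheduling the next one" idea, here is a rough sketch. `startPolling` and `fetchUpdates` are names I made up; `fetchUpdates` stands for whatever function performs your AJAX call and returns a Promise (for example a wrapper around `fetch()`), and the 700ms delay is just the figure from the question.

```javascript
// Polls by waiting for each request to settle before scheduling the next one,
// so this loop never has more than one request in flight.
// fetchUpdates: hypothetical function returning a Promise with the new data.
// onUpdate:     called with the data from each successful request.
// delayMs:      pause between the end of one request and the start of the next.
function startPolling(fetchUpdates, onUpdate, delayMs) {
  let stopped = false;
  let timer = null;

  async function tick() {
    try {
      onUpdate(await fetchUpdates());
    } catch (err) {
      // A failed request should not kill the loop; log it and try again later.
      console.error("poll failed:", err);
    } finally {
      if (!stopped) timer = setTimeout(tick, delayMs);
    }
  }

  tick();

  // The returned function stops the loop.
  return function stop() {
    stopped = true;
    clearTimeout(timer);
  };
}

// Usage (in the page):
//   const stop = startPolling(() => fetch("/chat.php").then((r) => r.json()),
//                             renderMessages, 700);
// where renderMessages is whatever function updates the page.
```

With this shape a slow server simply makes the polls less frequent, instead of letting requests stack up the way a fixed setInterval does.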
I discovered SSE (Server Sent Events) pretty late, but I can't seem to figure out some use cases for it, so that it would be more efficient than using setInterval() and ajax.
I guess, if we'd have to update the data multiple times per second then having one single connection created would produce less overhead. But, except this case, when would one really choose SSE?
I was thinking of this scenario:
A new user comment from the website is added in the database
Server periodically queries DB for changes. If it finds new comment, send notification to client with SSE
Also, this SSE question came into my mind after having to do a simple "live" website change (when someone posts a comment, notify everybody who is on the site). Is there really another way of doing this without periodically querying the database?
Nowadays web technologies are used to implement all sorts of applications, including those that need to fetch constant updates from the server.
As an example, imagine you have a graph in your web page that displays real-time data. Your page must refresh the graph any time there is new data to display.
Before Server Sent Events the only way to obtain new data from the server was to perform a new request every time.
Polling
As you pointed out in the question, one way to look for updates is to use setInterval() and an ajax request. With this technique, our client will perform a request once every X seconds, no matter if there is new data or not. This technique is known as polling.
Events
Server Sent Events, on the contrary, are asynchronous. The server itself notifies the client when there is new data available.
In the scenario of your example, you would implement SSE in such a way that the server sends an event immediately after adding the new comment, rather than by polling the DB.
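As a rough sketch of that flow, assuming a Node.js style server (the question does not say which backend is used): the handler keeps each SSE response open, and a hypothetical `broadcastComment` hook is called by whatever code saves the comment. All function names here are made up for illustration; only the `text/event-stream` wire format and the browser's `EventSource` API are standard.

```javascript
// Formats one message in the text/event-stream wire format:
// an "event:" line, a "data:" line, and a blank line to end the message.
function formatSseEvent(eventName, payload) {
  return `event: ${eventName}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Responses of all clients that currently have the event stream open.
const clients = new Set();

// Hypothetical hook: call this right after a comment is saved to the database.
function broadcastComment(comment) {
  const message = formatSseEvent("comment", comment);
  for (const res of clients) res.write(message);
}

// Request handler for the event stream URL.
function handleEvents(req, res) {
  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    Connection: "keep-alive",
  });
  res.write(": connected\n\n"); // SSE comment line; also flushes the headers
  clients.add(res);
  req.on("close", () => clients.delete(res));
}

// Wiring it up in Node (assumed port):
//   require("http").createServer(handleEvents).listen(8080);
// Browser side:
//   const source = new EventSource("/events");
//   source.addEventListener("comment", (e) => console.log(JSON.parse(e.data)));
```

Note that nothing here queries the database on a timer: the event is sent by the same code path that stores the comment.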
Comparison
Now the question may be when is it advisable to use polling vs SSE. Aside from compatibility issues (not all browsers support SSE, although there are some polyfills which essentially emulate SSE via polling), you should focus on the frequency and regularity of the updates.
If you are uncertain about the frequency of the updates (how often new data should be available), SSE may be the solution because it avoids all the extra requests that polling would perform.
However, it is wrong to say in general that SSE produces less overhead than polling. That is because SSE requires an open TCP connection to work. This essentially means that some resources on the server (e.g. a worker and a network socket) are allocated to one client until the connection is over. With polling, by contrast, the connection can be closed once the request has been answered.
Therefore, I would not recommend using SSE if the average number of connected clients is high, because this could create some overhead on the server.
In general, I advise using SSE only if your application requires real-time updates. As a real-life example, I developed data acquisition software in the past and had to provide a web interface for it. In this case, a lot of graphs were updated every time a new data point was collected. That was a good fit for SSE because the number of connected clients was low (essentially, only one), the user interface had to update in real time, and the server was not flooded with requests as it would have been with polling.
Many applications do not require real time updates, and thus it is perfectly acceptable to display the updates with some delay. In this case, polling with a long interval may be viable.
I am experimenting with JavaScript AJAX and long polling.
I am trying to find the best value for the server response timeout.
I have read many docs but couldn't find a detailed explanation of how to choose the timeout.
Some people choose 20 seconds, others 30 seconds...
I use the logic shown in the diagram.
How can I choose a better value for the timeout?
Can I use 5 minutes?
Is it normal practice?
PS: Possible Ajax client internet connections: Ethernet RJ-45, WiFi, 3G, 4G, also behind NAT or a proxy.
I worry that with a long timeout the connection can be dropped by a third party in some cases.
Maybe it's your grasp of English which is the problem, but it's the lifetime of the connection (time between connection opening and closing) you need to worry about more than the timeout (length of time with no activity after which the connection will be terminated).
Despite the existence of websockets, there is still a lot of deployed hardware which will drop connections regardless of activity (and some which will look for inactivity) where it thinks the traffic is HTTP or HTTPS, sometimes as a design fault, sometimes as a home-grown mitigation against Slowloris attacks. That you have 3G and 4G clients means you can probably expect problems with a 5 minute lifespan.
Unfortunately there's no magic solution to knowing what will work universally. The key thing is to know how widely distributed your users are. If they're all on your LAN and connecting directly to the server, then you should be able to use a relatively large value; however, setting the duration to unlimited will reveal any memory leaks in your app, so sometimes it's better to refresh every now and again anyway.
Taking the case where there is infrastructure other than hubs and switches between your server and the clients, you need to provide a mechanism for detecting and re-establishing a dropped connection regardless of the length of time. When you have worked out how to do this, then:
1. dropped connections are only a minor performance glitch and do not have a significant effect on the functionality
2. it's trivial to then add the capability to log dropped connections and thereby determine the optimal connection time to eliminate the small problem described in (1)
Your English is fine.
TL;DR - 5-30s depending on user experience.
I suggest long poll timeouts be 100x the server's "request" time. This makes a strong argument for 5-20s timeouts, depending on your urgency to detect dropped connections and disappeared clients.
Here's why:
Most examples use 20-30 seconds.
Most routers will silently drop connections that stay open too long.
Clients may "disappear" for reasons like network errors or going into low power state.
Servers often cannot detect a silently dropped connection until they next try to write to it. This makes 5 min timeouts a bad practice, as they will tie up sockets and resources. This would be an easy DoS attack on your server.
So, < 30 seconds would be "normal". How should you choose?
What is the cost-benefit of making the long-poll connections?
Let's say a regular request takes 100ms of server "request" time to open/close the connection, run a database query, and compute/send a response.
A 10 second timeout would be 10,000 ms, and your request time is 1% of the long-polling time. 100 / 10,000 = .01 = 1%
A 20 second timeout would be 100/20000 = 0.5%
A 30 second timeout = 0.33%, etc.
Beyond 30 seconds, the practical benefit of a longer timeout will always be less than a 0.33% performance improvement, so there is little reason to go above 30s.
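The arithmetic above can be reproduced with a few lines of code, if you want to plug in your own numbers. The function names are mine, and the 100ms request time is just the assumed figure from the example.

```javascript
// Fraction of each long-poll cycle that is spent on the request itself
// (opening/closing the connection, querying, sending the response).
function requestOverhead(requestMs, timeoutMs) {
  return requestMs / timeoutMs;
}

// The rule of thumb from this answer: timeout = 100x the request time.
function suggestedTimeoutMs(requestMs) {
  return 100 * requestMs;
}

const REQUEST_MS = 100; // assumed server "request" time from the example

for (const seconds of [10, 20, 30]) {
  const percent = (requestOverhead(REQUEST_MS, seconds * 1000) * 100).toFixed(2);
  console.log(`${seconds}s timeout -> ${percent}% overhead`);
}
// prints:
//   10s timeout -> 1.00% overhead
//   20s timeout -> 0.50% overhead
//   30s timeout -> 0.33% overhead

console.log(suggestedTimeoutMs(REQUEST_MS)); // 10000, i.e. a 10 second timeout
```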
Conclusion
I suggest long poll timeouts be 100x the server's "request" time. This makes a strong argument for 5-20s timeouts, depending on your urgency to detect dropped connections and disappeared clients.
Best practice: Configure your client and server to abandon requests at the same timeout. Give the client extra network ping time for safety. E.g. server = 100x request time, client = 102x request time.
Best practice: Long polling is superior to websockets for many/most use cases because of the lack of complexity, more scalable architecture, and HTTP's well-known security attack surface area.
I have a long-term goal of eventually creating a chat of some sort by any means, but for now I'd like to just have a simple one with some MySQL and ajax calls.
To make the chat seem instant, I'd like to have the ajax request interval as fast as possible. I get the feeling if it's as low or lower than a second, it's going to bog down the browser, the user's internet, or my server.
Assuming the server doesn't return anything, how much bandwidth and cpu/memory would the client use with constant, one second apart ajax calls?
edit: I'm still open to suggestions on how I can do a chat server. Anything that's possible with free hosting from x10 or 000webhost. I've been told of Heroku but I have no clue how to use it.
edit: Thanks for the long polling suggestion, but that uses too much cpu on the servers.
One technique that can be used is to use a long-running ajax request. The client asks if there's any chat data. The server receives the request. If there's chat data available, it returns that data immediately. If there is no chat data, it hangs onto the request for some period of time (perhaps two minutes) and if some chat data appears during that two minutes, the web request returns immediately with that data. If the full two minutes elapses and no chat data is received, then the ajax call returns with no data.
The client can then immediately issue another request to wait another two minutes for some data.
To make these "long" http requests work, you just need to make sure that your underlying ajax call has a timeout set for longer than the time you've set it for on the server.
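On the client side, the loop could look roughly like the sketch below. Everything named here is an assumption for illustration: the two minute hold time comes from the description above, the extra 10 seconds of client slack is arbitrary, and I am assuming the server answers with status 204 when the hold time elapses without data. It uses `fetch` and `AbortController`, which are available in current browsers.

```javascript
// Assumed values: the server holds a request for up to 2 minutes, so the
// client timeout is set a little longer than that.
const SERVER_HOLD_MS = 120000;
const CLIENT_TIMEOUT_MS = SERVER_HOLD_MS + 10000;
const RETRY_DELAY_MS = 1000; // pause before reconnecting after an error

// Performs one long request. Resolves to the parsed data, or to null when the
// server reports "nothing new" (assumed here to be an HTTP 204 response).
// fetchImpl is only a parameter so the function can be tried without a network.
async function longPollOnce(url, timeoutMs = CLIENT_TIMEOUT_MS, fetchImpl = fetch) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(url, { signal: controller.signal });
    return res.status === 204 ? null : await res.json();
  } finally {
    clearTimeout(timer);
  }
}

// Keeps issuing long requests back to back until shouldContinue() is false.
async function longPollLoop(url, onData, shouldContinue = () => true, pollOnce = longPollOnce) {
  while (shouldContinue()) {
    try {
      const data = await pollOnce(url);
      if (data !== null) onData(data);
    } catch (err) {
      // Dropped or timed-out connection: wait briefly, then reconnect.
      await new Promise((resolve) => setTimeout(resolve, RETRY_DELAY_MS));
    }
  }
}

// Usage (placeholder URL and handler):
//   longPollLoop("/chat/wait", (messages) => showMessages(messages));
```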
On the server, you need an efficient mechanism for waiting for data, probably involving semaphores or something similar, because you don't want to be polling internally in the server either.
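As one possible shape for that waiting mechanism, here is a sketch for a single-process, event-driven server such as Node.js. That is an assumption on my part (the question is about PHP/MySQL style hosting, where you would need a different primitive), and it uses Promises in place of semaphores. `createHub`, `waitForMessage` and `publish` are made-up names.

```javascript
// Minimal in-memory hub: each held request waits on a Promise that publish()
// resolves, so nothing on the server polls in a loop.
function createHub() {
  const waiters = new Set();

  // Resolves with the next published message, or with null after timeoutMs.
  function waitForMessage(timeoutMs) {
    return new Promise((resolve) => {
      const waiter = (message) => {
        clearTimeout(timer);
        waiters.delete(waiter);
        resolve(message);
      };
      const timer = setTimeout(() => waiter(null), timeoutMs);
      waiters.add(waiter);
    });
  }

  // Wakes every request that is currently waiting.
  function publish(message) {
    // Copy first: each waiter removes itself from the set when called.
    for (const waiter of [...waiters]) waiter(message);
  }

  return { waitForMessage, publish };
}

// Usage inside a request handler (two minute hold, as described above):
//   const message = await hub.waitForMessage(120000);
//   respond with the message, or with "no data" if it is null
// and wherever a chat message is stored:
//   hub.publish(newMessage);
```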
Doing it this way, you can get near instantaneous response on the client, but only be making 30 requests an hour.
To be friendly to the battery of a laptop or mobile device, you need to be sensitive to when your app isn't actually being used (browser not displayed, not the current tab, etc...) and stop the requests during that time.
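One way to do that in the browser is the Page Visibility API (`document.visibilityState` and the `visibilitychange` event). The helper below is only a sketch: `pollWhileVisible` is a made-up name, and `start` stands for a hypothetical function that begins your request loop and returns a function that stops it.

```javascript
// Runs the request loop only while the page is visible.
// start: hypothetical function that begins polling and returns a stop function.
// doc:   defaults to the browser's document; it is a parameter only so the
//        helper can be exercised outside a browser.
function pollWhileVisible(start, doc = document) {
  let stop = null;

  function sync() {
    const visible = doc.visibilityState === "visible";
    if (visible && !stop) {
      stop = start();          // tab became visible: begin polling
    } else if (!visible && stop) {
      stop();                  // tab was hidden: stop the requests
      stop = null;
    }
  }

  doc.addEventListener("visibilitychange", sync);
  sync(); // apply the current state immediately
}
```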
As to your other questions, repeated ajax calls (as long as they are spaced at least some small amount of time apart) don't really use much in the way of CPU or memory. They may use battery if they keep the computer from going into an idle mode.
I have a need to send alerts to a web-based monitoring system written in RoR. The brute force solution is to frequently poll a lightweight controller with javascript. Naturally, the downside is that in order to get a tight response time on the alerts, I'd have to poll very frequently (every 5 seconds).
One idea I had was to have the AJAX-originated polling thread sleep on the server side until an alert arrived on the server. The server would then wake up the sleeping thread and get a response back to the web client that would be shown immediately. This would have allowed me to cut the polling interval down to once every 30 seconds or every minute while improving the time it took to alert the user.
One thing I didn't count on was that mongrel/rails doesn't launch a thread per web request as I had expected it to. That means that other incoming web requests block until the first thread's sleep times out.
I've tried tinkering around with calling "config.threadsafe!" in my configuration, but that doesn't seem to change the behavior to a thread per request model. Plus, it appears that running with config.threadsafe! is a risky proposition that could require a great deal more testing and rework on my existing application.
Any thoughts on the approach I took or better ways to go about getting the response times I'm looking for without the need to deluge the server with requests?
You could use Rails Metal to improve the controller performance or maybe even separate it out entirely into a Sinatra application (Sinatra can handle some serious request throughput).
Another idea is to look into a push solution using Juggernaut or similar.
One approach you could consider is to have (some or all of) your requests create deferred monitoring jobs in an external queue which would in turn periodically notify the monitoring application.
What you need is Juggernaut which is a Rails plugin that allows your app to initiate a connection and push data to the client. In other words your app can have a real time connection to the server with the advantage of instant updates.