I am developing a website that needs some sort of real-time update.
The page is generated with a JavaScript variable holding the current ID of the dataset.
Then, at an interval of a few seconds, an AJAX call is made passing on the current ID; if there's something new, the server returns it along with the latest ID, which is then updated in the JavaScript.
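For concreteness, the basic loop looks something like this (a minimal sketch; the endpoint and the response shape are simplified):

    var currentId = 123; // injected by the server when the page is generated

    setInterval(function () {
        var xhr = new XMLHttpRequest();
        xhr.open('GET', '/refresh.php?my-revision=' + currentId, true);
        xhr.onload = function () {
            var data = JSON.parse(xhr.responseText);
            if (data.revision > currentId) {
                currentId = data.revision; // remember the latest ID for the next poll
                // ...update the page with data.content...
            }
        };
        xhr.send();
    }, 3000); // poll every few seconds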
Very simple, but here comes the problem.
If the user opens the same page multiple times, every window makes these AJAX requests, which produces heavy server load.
Now I thought about the following approach:
The page is loaded with a JavaScript variable holding the current timestamp and the ID of the current dataset.
My desired refresh interval is for example 3 seconds.
In the page, an interval counter ticks every second, and every time the timestamp reaches a state where (timestamp % 3 === 0) returns true, the content is updated.
The URL looks like http://www.example.com/refresh.php?my-revision=123&timestamp=123456
Now this should ensure that every browser window calls the same URL.
Then I can turn on browser level caching.
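The alignment could be as simple as rounding the current time down to the nearest multiple of the interval, so that every window computes the same URL for the same slot (a sketch of the idea):

    var INTERVAL = 3; // seconds

    function refreshUrl(revision) {
        // Round down so all windows in the same 3-second slot request the same URL,
        // letting the browser cache answer all but the first request.
        var slot = Math.floor(Date.now() / 1000 / INTERVAL) * INTERVAL;
        return '/refresh.php?my-revision=' + revision + '&timestamp=' + slot;
    }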
But I don't really like this solution.
I would prefer adding another layer of data sharing in a cookie.
This shouldn't be much of a problem; I can just store every request in a cookie named by timestamp and data revision, with a TTL of 10 seconds or so, and check for its existence first.
BUT
The pages will make the request at the same time. So the whole logic of browser caching and cookies might not work, because the requests occur simultaneously and not one after another.
So I thought about limiting concurrent connections to 1 on the server side. But then I would need at least an extra vhost, because I really don't want to do that for the whole site.
And that runs me into problems concerning cross-site policies!
Of course there are some super-complicated load-balancing / server-side solutions bound to request URI and IP address or something, but that's all extreme overkill!
It must be a common problem! Just think of Facebook chat. I really don't think they do all the requests in every window you have open...
Any ideas? I'm really stuck with this one!
Maybe I can do some inter-window JavaScript communication? That shouldn't be a problem if it's all on the same domain?
One thing I can do, of course, is server-side caching, which at least avoids DB connections and intensive calculations... but it's still a request, which I would like to avoid.
You might want to check out Comet and Orbited.
This is best solved with server push technology.
The first thing is: Do server-side caching anyway, using Memcache or Redis or whatever. So you're defended against multiple clients doing the requests. But you knew that.
I think you're onto the right thing with cookies, frankly (but see below for a more modern option) — they are shared by all window instances, easily queried, etc. Your polling logic could look something like this:
On the polling interval:
1. Look at the content cookie: is it fresher than what you have? If so, use it and you're done.
2. Look at the status cookie: is someone else actively polling (e.g., the cookie is set and not stale)? If yes, come back in a second.
3. Set the status cookie: "I'm actively polling at (now)."
4. Do the request.
On the response:
1. If the new data is newer than the (possibly updated) contents of the content cookie, set the content cookie to the new data.
2. Clear the status cookie if you're the one who set it.
Basically, the status cookie acts as a semaphore indicating to all window instances that someone, somewhere is on the job of updating the content.
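A rough sketch of that logic (the readCookie/writeCookie helpers, fetchUpdate, applyUpdate, and myRevision are hypothetical; you'd supply your own):

    var STALE_MS = 5000; // how long before we consider another poller dead

    function poll() {
        var content = readCookie('content'); // e.g. { revision: ..., data: ... }
        if (content && content.revision > myRevision) {
            applyUpdate(content); // another window already fetched fresher data
            return;
        }
        var status = readCookie('status'); // timestamp written by the active poller
        if (status && Date.now() - status < STALE_MS) {
            return; // someone else is on the job; check again next tick
        }
        writeCookie('status', Date.now()); // claim the semaphore
        fetchUpdate(function (fresh) {
            var latest = readCookie('content'); // may have changed meanwhile
            if (!latest || fresh.revision > latest.revision) {
                writeCookie('content', fresh);
            }
            writeCookie('status', ''); // release (a real version would check it's still ours)
        });
    }

    setInterval(poll, 1000);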
Your content cookie might contain the content directly, or if your content is large-ish and you're worried about running into limits, you could have each page have a hidden iframe, each with a unique name, and have your Ajax update write the output to the iframe. The content cookie would publish the name of the most up-to-date iframe, and other windows seeing that there's fresh content could use window.open to get at that iframe (since window.open doesn't open a window if you use the name of an existing one).
Be alert to race conditions. Although JavaScript within any given page is single-threaded (barring the explicit use of web workers), you can't expect that JavaScript in the other windows is necessarily running on the same thread (it is on some browsers, not on others — heck, on Chrome it's not even the same process). I also don't know that there's any guarantee of atomicity in writing cookies, so you'll want to be vigilant.
Now, HTML5 defines some useful inter-document communication mechanisms, and so you might consider looking to see if those exist and using them before falling back on this cookie approach, since they'll work in modern browsers today but not in older browsers you're probably having to deal with right now. Still, on the browsers that support it, great!
Web storage might also be an option worth investigating as an aspect of the above; it's also a fairly new thing, so older browsers won't have it.
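On browsers that support it, the storage event already gives you this kind of cross-window signalling without any cookie juggling (a sketch; the key name is made up):

    // Writer: whichever window completed the request publishes the result.
    localStorage.setItem('latest', JSON.stringify({ revision: 124, data: '...' }));

    // Readers: the storage event fires in every *other* window on the same origin.
    window.addEventListener('storage', function (e) {
        if (e.key === 'latest') {
            var update = JSON.parse(e.newValue);
            // ...apply the update if update.revision is newer than ours...
        }
    }, false);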
Related
The way to delete cookies in JavaScript is to set the expiry date to be in the past. But this doesn't actually delete the cookie, at least in Firefox; it just means the cookie will be deleted on browser close.
This is a problem for us: we have a product that involves archiving web pages from potentially many sites, with all this content stored on our server. To make sure that pages render properly, we include all the JS as well. However, cookies are often set by JS, and since the page is served from our server, these cookies are set under our domain.
So over time, cookies from dozens of archived sites build up under our domain, and eventually the Cookie header exceeds the maximum length, resulting in an HTTP 400 error.
And because our clients are mostly in corporate environments they never reboot their machines or close their browsers: they can be left on for months. So this "soft" delete doesn't work, at least not reliably.
Is there any way to physically remove cookies intra-session in JavaScript? Or alternatively, is there any way to stop them being set?
It's not possible. Period. I've been struggling with this for several weeks without finding a solution.
Whoever invented the cookie getter/setter should be %insert_painful_punishment_here%.
Internet Exploder in particular is a beast when it comes to deleting cookies. I can't remember the exact issue, but I think it involved HTTPS and cookie names containing ';'.
All I can offer is a workaround: Send a response body with your 400 response, something like 'please restart your browser'.
In addition to setting the expiration in the past, set the value to an empty string. This will at least reduce the size of the cookie immediately.
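Something like this standard idiom (a sketch):

    function clearCookie(name) {
        // The empty value shrinks the Cookie header immediately;
        // the past expiry date marks the cookie for deletion.
        document.cookie = name + '=; expires=Thu, 01 Jan 1970 00:00:00 GMT; path=/';
    }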
I would think that cookies should be deleted immediately in all browsers. For example, when I log out of a website, Firefox does not require me to close my browser to delete the cookie that shows that I am logged into the site. If this isn't happening, I suggest you look into Firefox bugs and possibly open a new one with them.
In the meantime, I'd look at my web server and see if it is possible to set the max header length to something higher than it already is.
You could overwrite the cookie with a new one.
"It is because we are NOT using iframes that we have this issue. The cached page is being rendered by our server, so any cookies get set under our domain." --OP
If you have no control over the javascript that is setting the cookies (which seems extremely odd, why do you not have control?), you can constantly read and empty the cookie, dumping the data to another larger database (preferably server-side, or perhaps HTML5 client storage).
I am looking for a reliable way to log out a user or abandon their session when the browser is closed. Is there a way to do this with tabbed browsers? Any suggestions are appreciated. Thanks!
There is no reliable way to do this immediately when the client closes the browser. There's the beforeunload event, but even then, when you fire an AJAX request during this event, it's not guaranteed to ever reach the server. And even then, you'd have a problem with multiple browser tabs.
The most reliable way is to have a relatively short session timeout on the server side (e.g. 1 minute) and introduce an AJAX-based heartbeat on the client side (e.g. every 30 seconds) to keep the session alive.
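The client side of such a heartbeat could look roughly like this (a sketch; the /keepalive endpoint is hypothetical):

    setInterval(function () {
        var xhr = new XMLHttpRequest();
        xhr.open('POST', '/keepalive', true); // server-side handler resets the session timeout
        xhr.send();
    }, 30000); // every 30 seconds, comfortably inside the 1-minute server timeout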
There may be better ways, depending on the underlying functional requirement for which you thought this was the solution. For example, if your actual intent is to restrict logins to 1 per registered user, then you'd better collect all logins and the associated sessions, compare them on each login, and invalidate the earlier session if one is present. That way it'll also work on clients with JS disabled.
If you aren't using cookies to preserve your users' login information, it should log them out when they close the browser, because any session cookies should be killed when the browser closes.
Obviously this isn't always the case (see here for an example of Firefox preserving login information after logging out) because "session restore" features now blur the line between what counts as a single browser session. (Personally, I think this should be classified as a bug, but that is only my opinion.)
There are two possible techniques. The first would be (as yojimbo87 mentions before me) to use web sockets to keep a connection between client and server, and when the socket closes, kill the session. The issue here is that web sockets support is limited, and certainly not possible on anything other than bleeding edge browsers (FF4, Chrome, IE9, etc).
An alternative could be to use AJAX to constantly poll the server to tell it that the page is still being viewed. If, for example, you send a keep-alive request via AJAX every 30 seconds, you'd store the timestamp of the request in the session. If the user then comes back to the page and the time difference between the current request and the last request is more than, say, 45 seconds (accounting for latency), you'd know that the user closed their browser and needs to log in again.
In both of these situations, there is however a fatal flaw, and that is that they rely on JavaScript. If the user doesn't have JavaScript enabled, you'd end up ruining the user experience with constant login prompts, which is obviously a bad idea.
In my opinion, it's reasonable to simply rely on session cookies being deleted by the browser when the user closes the browser window, because that is what they are supposed to do. You as a developer can't be blamed when the client browser performs undesirable behaviour, since it's entirely out of your hands and there's no functional workaround.
A feasible technique would be to use AJAX to send keep-alive requests to your servers quite often — e.g. every one minute. Then you could abandon a session as soon as a keep-alive (or a few in sequence) is not received as expected.
Otherwise, there's no reliable way to achieve that. Since there's not a persistent connection between the browser and the server you can't detect situations that are out-of-control of any JavaScript code you might have running in the browser. For example, when there's a network failure you might want to close the session as well even though the browser's window is still opened. Hence, to make the system robust enough, you should detect network outages as a “side-effect” of the keep-alive mechanism from the browser (e.g. like Gmail does it).
Unless you are using WebSockets or some kind of long polling for each tab which tracks the connection with client in "real time", you will probably have to wait until the session is timed out on the server side.
You can do this via a combination of jQuery, AJAX, and PHP.
The jQuery:
function updatestatusOFF() {
    // Marks the user offline. Note the inline PHP only works because this
    // script is emitted from a PHP-rendered page.
    $('#connection').load('connection.php?user=<?php echo $_SESSION['username']; ?>&offline=true');
}
The beforeunload script:
<script>window.onbeforeunload = function() { updatestatusOFF(); };</script>
(Don't return the call's result from onbeforeunload; a returned value can trigger the browser's confirmation dialog.)
And the PHP you would have to write yourself, which I'm more than certain you can do.
It isn't the most reliable, but it's the easiest way to implement this. If you want real-time reporting, look into Comet.
Is it possible to detect HTTP cache hits in order to calculate a cache hit rate?
I'd like to add a snippet of code (JavaScript) to an HTML page that reports (via AJAX) whether a resource was available from the client's local cache or fetched from the server. I'd then compile some stats to give some insight into the effects of my cache tuning. I'm particularly interested in hit rates for the very first page of a user's visit.
I thought about using access logs, but that seems imprecise (bots) and cumbersome. Additionally, it wouldn't work with resources from different servers (especially Google's AJAX Libraries API, e.g. jquery.min.js).
Any non-JavaScript solution would be much appreciated too, though.
There might be some easier way, but you could build a test where JavaScript loads the element and you record the time, then compare the times when the onload event fires. You would have to test what the exact difference between loading from cache and loading from the server is. Alternatively, for a whole lot of items, have the JavaScript record the time first, then record the onload events of everything else as it loads onto the page. This may not be as accurate, though.
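A rough sketch of that timing approach (the 20 ms threshold is a guess; you'd have to calibrate it as described above):

    function probe(url, thresholdMs, report) {
        var img = new Image();
        var start = Date.now();
        img.onload = function () {
            var elapsed = Date.now() - start;
            // Very fast loads were almost certainly served from the local cache.
            report(url, elapsed < thresholdMs ? 'hit' : 'miss', elapsed);
        };
        img.src = url;
    }

    probe('/img/logo.png', 20, function (url, verdict, ms) {
        // ...send the verdict back to the server via AJAX for your stats...
    });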
Imagine that your web application maintains a hit counter for one or multiple pages and that it also aggressively caches those pages for anonymous visitors. This poses the problem that at least the hitcount would be out of date for those visitors because although the hitcounter is accurately maintained on the server even for those visitors, they would see the old cached page for a while.
What if the server continued to serve them the cached page, but passed the updated counter in a non-persistent HTTP cookie to be read by a piece of JavaScript in the page that injects the updated counter into the DOM?
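Concretely, a snippet in the cached page could do something like this (a sketch; the cookie name and element ID are made up):

    // Read the 'hitcount' cookie the server sent alongside the cached response.
    var match = document.cookie.match(/(?:^|;\s*)hitcount=(\d+)/);
    if (match) {
        document.getElementById('hit-counter').textContent = match[1];
    }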
Opinions?
You are never going to keep track of the visitors in this manner. If you are aggressively caching pages, intermediate proxies and browsers are also going to cache your pages. And so the request may not even reach your server for you to track.
The best way to do so would be to use an approach similar to Google Analytics: when the page is loaded, send an AJAX request to the server. This AJAX request increments the current counter value on the server and returns the latest value. The client side can then show the value returned by the server using JavaScript.
This approach allows you to cache as aggressively as you want without losing the ability to keep track of your visitors.
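A minimal sketch of that (the endpoint name is hypothetical):

    var xhr = new XMLHttpRequest();
    xhr.open('GET', '/counter/increment?page=' + encodeURIComponent(location.pathname), true);
    xhr.onload = function () {
        // The server increments the counter and returns the fresh value,
        // which we inject into the (possibly stale) cached page.
        document.getElementById('hit-counter').textContent = xhr.responseText;
    };
    xhr.send();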
You could also fetch the page programmatically (via ASP or PHP) out of the cache yourself and replace the hit counter.
We have a heavily AJAX-dependent application. What are good ways of making sure that the requests to server-side scripts are not coming from standalone programs but from an actual user sitting at a browser?
There aren't any really.
Any request sent through a browser can be faked up by standalone programs.
At the end of the day does it really matter? If you're worried then make sure requests are authenticated and authorised and your authentication process is good (remember Ajax sends browser cookies - so your "normal" authentication will work just fine). Just remember that, of course, standalone programs can authenticate too.
What are good ways of making sure that the requests to server-side scripts are not coming from standalone programs but from an actual user sitting at a browser?
There are no ways. A browser is indistinguishable from a standalone program; a browser can be automated.
You can't trust any input from the client side. If you are relying on client-side co-operation for any security purpose, you're doomed.
There isn't a way to automatically block "non-browser user" requests hitting your server-side scripts, but there are ways to identify which requests have been triggered by your application and which haven't.
This is usually done using something called "crumbs". The basic idea is that the page making the AJAX request should generate (server-side) a unique token, typically a hash of a Unix timestamp + salt + secret. This token and the timestamp are passed as parameters to the AJAX request. The AJAX handler script first checks the token and the validity of the Unix timestamp (e.g. whether the request falls within 5 minutes of the token's timestamp). If the token checks out, you can proceed to fulfill the request. This token generation and checking can often be coded up as an Apache module so that it is triggered automatically and stays separate from the application logic.
Fraudulent scripts won't be able to generate valid tokens (unless they figure out your algorithm) and so you can safely ignore them.
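A sketch of the idea in JavaScript (using Node's crypto module; the secret, the salt handling, and the 5-minute window are placeholders, and an Apache module as suggested above would replace this in practice):

    var crypto = require('crypto');
    var SECRET = 'replace-with-a-real-secret';

    function makeCrumb(timestamp, salt) {
        // Hash of timestamp + salt + secret, as described above.
        return crypto.createHash('sha256')
                     .update(timestamp + ':' + salt + ':' + SECRET)
                     .digest('hex');
    }

    function checkCrumb(token, timestamp, salt) {
        var age = Math.floor(Date.now() / 1000) - timestamp;
        if (age < 0 || age > 300) return false; // outside the 5-minute window
        return token === makeCrumb(timestamp, salt); // must match our own hash
    }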
Keep in mind that storing a token in the session is also another way, but that won't buy any more security than your site's authentication system.
I'm not sure what you are worried about. From where I sit I can see three things your question can be related to:
First, you may want to prevent unauthorized users from making a valid request. This is resolved by using a browser cookie to store a session ID. The session ID needs to be tied to the user, be regenerated every time the user goes through the login process, and must have an inactivity timeout. Any request coming in without a valid session ID is simply rejected.
Second, you may want to prevent a third party from running replay attacks against your site (i.e. sniffing an innocent user's traffic and then sending the same calls over). The easy solution is to go over HTTPS. The SSL layer prevents somebody from replaying any part of the traffic. This comes at a cost on the server side, so you want to be sure that you really cannot accept that risk.
Third, you may want to prevent somebody from using your API (that's what AJAX calls are, in the end) to implement their own client for your site. For this there is very little you can do. You can always look for the appropriate User-Agent, but that's easy to fake and will probably be the first thing somebody trying to use your API thinks of. You can also implement some statistics, for example looking at the average AJAX requests per minute on a per-user basis, and see whether some users are way above the average. It's hard to implement and only useful if you're trying to prevent automated clients that react faster than a human can.
Is Safari a web browser to you?
If it is, note that the same engine is embedded in many applications, for example those using Qt's QtWebKit libraries. So I would say there's no way to recognize it.
A user can forge any request they want, faking headers like User-Agent any way they like...
One question: why would you want to do what you're asking for? What difference does it make to you whether the request comes from a browser or from anything else?
Can't think of one reason you'd call "security" here.
If you still want to do this, for whatever reason, think about making your own application with an embedded browser. It could somehow authenticate itself to the server with every request; then you'd only send valid responses to your application's browser.
User would still be able to reverse engineer the application though.
Interesting question.
What about browsers embedded in applications? Would you mind those?
You can probably think of a way of "proving" that a request comes from a browser, but it will ultimately be heuristic. The line between browser and application is blurry (e.g. embedded browser) and you'd always run the risk of rejecting users from unexpected browsers (or unexpected versions thereof).
As has been mentioned before, there is no way of accomplishing this... But there is one thing worth noting, useful for protecting against CSRF attacks that target the AJAX functionality specifically: set a custom header with the help of the XHR object, and verify that header on the server side.
And if you set a random (one-time-use) token as the value of that header, you can prevent automated attacks.
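For example (a sketch; the header name is your choice, and window.csrfToken stands in for a token the server rendered into the page):

    var xhr = new XMLHttpRequest();
    xhr.open('POST', '/ajax/endpoint', true);
    // Plain <form> posts and simple cross-site requests cannot set custom headers,
    // which is what makes this check useful against CSRF.
    xhr.setRequestHeader('X-CSRF-Token', window.csrfToken);
    xhr.send('action=save');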