I have a Python 2.7 app on Google App Engine. One of the JS files is served via a Python script, not a standard static handler. The app.yaml config is shown below:
- url: /js/foo.js
script: python.js.write_javascript.app
secure: optional
The request for foo.js is part of a code snippet that clients of our service place on their websites, so it can't really be changed. python.js.write_javascript.app basically just reads in a JS template file, substitutes in a few customer-specific values, and prints the result to the browser.
What I'm wondering is: how do we set the correct headers so this request is cached correctly? Without any custom headers, App Engine's default is to tell the browser never to cache this. This is obviously undesirable because it creates unnecessary load on our app.
Ideally, I would like to have browsers make a new request only when the template has been updated. Another option would be to cache per session.
Thanks
Well, it looks like Google handles this automatically. I just print the response with the correct JavaScript content-type headers but without any cache headers, and Google's CDN caches it for me. I'm not sure what the default cache lifetime is, but I saw no increase in instances or cost after implementing this.
It seems Google just takes care of it for me.
Related
I'm building a web application. I'm linking to separate css and js files and I want to manage cache.
If the script js or the style css file have been updated then force reload and replace that file, else get the file from cache.
Is that possible? How to do that?
By default, CSS and JavaScript files are cached in the client's browser. When you update your CSS or JavaScript file, you only need to increment a version number in the reference in your HTML head, like this:
foo.css?ver=002
foo.js?ver=002
This depends a lot on the server, as caching in browsers is based on a set of headers sent by the server, including Cache-Control, Expires, and ETag, and on the way the server handles headers from the client, including If-Modified-Since and If-None-Match. These let the browser ask the server to return the file only if the browser doesn't already have the latest version; if, based on those headers, the server determines that the browser already has the latest version, it can return a 304 Not Modified response with no body.
You can also use "cache buster" query parameters as sia suggested: add a query parameter to the file name, which will be ignored by most servers, but which you can use to indicate that there is a new version of the same file. While the query parameter won't let you control what version is downloaded, it will be part of the key in the browser cache, so when the parameter changes, the browser will download the file again.
There is an excellent rundown of how HTTP caching works over on MDN.
Every few hours (or more often; it depends) I will be updating a JavaScript file that contains a JSON object with notifications that appear on the website. This JSON object is what I will be updating periodically, and so it cannot be cached by browsers.
The javascript will be hosted on a CDN, and this file will be on client websites like:
<script src="//example.com/1.7.1/my_file.js"></script>
How can I possibly prevent browser caching in this type of situation?
I guess the best way would be for all clients to have the same javascript file, and then make an ajax request to pull down the messages? This way the "my_file.js" can be cached, but the ajax response will not be.
You can add a timestamp as a query parameter after the path: my_file.js?time=currentTime
To keep the HTML clean you can add something like this in another local js file
$("head").append('<script type="text/javascript" src="yourPath?time=' + Date.now() + '"></script>');
Or you can use $.getScript, requirejs, etc.
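The timestamp trick above can be wrapped in a tiny helper so it also works for URLs that already carry query parameters. This is a sketch; the parameter name "time" just matches the example above.

```javascript
// Append a cache-busting "time" parameter to a URL. With no explicit
// value, the current timestamp is used, so every page load produces a
// different URL and therefore a fresh request.
function bustCache(url, value) {
  const v = value === undefined ? Date.now() : value;
  const sep = url.indexOf('?') === -1 ? '?' : '&';
  return url + sep + 'time=' + v;
}
```

You could then load the script with `$.getScript(bustCache('my_file.js'))`, keeping the HTML itself clean.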
It really depends on what CDN you use and the features available. Some approaches:
Instruct the CDN to cache the file for around an hour, so it'll automatically fetch a new one every hour (assuming someone is requesting it). This means users won't necessarily get the change immediately, but they will see it within an hour of your change.
Along with that, have the CDN send end users a "no-cache" Cache-Control header. Depending on the CDN, this could simply be what your origin sends, combined with an override on the CDN (or one of the various means of instructing the CDN to cache via headers, such as Surrogate-Control).
Alternatively, the above-mentioned versioning/timestamped URL. The downside of this is that it requires you to update any references to your file to include the version id. If your pages are auto-generated this is fine.
Allow the CDN to cache the file indefinitely, but use a "purge" or "invalidation" feature to force the CDN to fetch the new version the next time it is requested. This mainly applies if you have a reason for the file to stay unchanged for more than a few hours or a day.
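The split between "what the CDN caches" and "what the browser caches" can be expressed as two headers on the origin response. This is a sketch assuming a CDN that honors a Surrogate-Control header (some CDNs, such as Fastly, do; others use their own override mechanisms):

```javascript
// Build origin response headers so that the CDN keeps a copy for
// `cdnSeconds`, while browsers revalidate on every use. The function
// name is illustrative; you'd set these on your real response object.
function cdnCacheHeaders(cdnSeconds) {
  return {
    // For browsers: always check back before reusing a cached copy.
    'Cache-Control': 'no-cache',
    // For the CDN: keep a copy for this long; CDNs that support this
    // header strip it before forwarding the response downstream.
    'Surrogate-Control': 'max-age=' + cdnSeconds
  };
}
```

With `cdnCacheHeaders(3600)`, your origin sees at most one request per edge node per hour, while end users still pick up a purge or expiry promptly.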
In my project there is a public folder and a script inside it: public/worker.js, which contains a piece of code:
alert('foo');
I call this script using a Worker:
new Worker('worker.js');
I launch Meteor and connect to my app. foo is alerted.
If I change the public/worker.js code to anything else:
alert('bar');
The server refreshes the clients, and the client refreshes the page but won't get the new code; it keeps using the old one (alerting foo instead of the new, shiny bar). Clearing the cache and then refreshing fixes the issue. CTRL+F5 does not fix it; it does not seem to work for this kind of script call (at least not on the version of Firefox I tested with).
Why is this happening, exactly?
How can I prevent it?
You should alter the response header for the file. Maybe this gets you going: Explicit HTTP Response Headers for files in Meteor's public directory
The script is cached and the browser does not pull the new version from the server.
We need to edit the header of the requests for the files in the /workers folder, using the following code server-side (I wrapped it in a package with api.use('webapp')):
WebApp.rawConnectHandlers.use('/workers', function(req, res, next) {
res.setHeader('cache-control', 'must-revalidate');
next();
});
Using WebApp.connectHandlers did not work (the callback was never called), so I used rawConnectHandlers instead.
I am not 100% sure it is the best way to go, but it works.
I've not found exactly why, but browsers (at least Chrome) seem to treat worker scripts differently from other JavaScript files with regard to caching on page refresh, even when the headers sent from the server are the same. Refreshing the page makes the browser check for new scripts referenced in script tags, but not for those used as a worker.
The way I've fixed this is to include a version number/build time/MD5 of the file contents in the file name at build time, so it ends up as something like worker.12333.js. The advantage is that if each filename references a file that is essentially immutable, you can set far-future Expires headers... So instead of telling the browser never to cache the worker script, it can cache it forever. https://github.com/felthy/grunt-cachebuster is one such tool that does this for JavaScript included via script tags, but there are probably others.
The issue with this is that there must be some mechanism to tell the JavaScript the updated filename, so it knows to call new Worker('worker.12333.js');. I'm not sure if existing tools handle that, but the way I do it is to use the project build time in seconds as the unique key for all the files:
<html build-time="12333">
...
and then access it via Javascript so it can work out the latest worker script filename. It's not perfect, but it's fairly simple. You can probably come up with other mechanisms depending on your requirements.
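The "read the build-time attribute, derive the worker filename" step could look something like this. It is a sketch of the approach above; the attribute name and filename pattern match the example, and you would adapt them to your own build setup.

```javascript
// Derive the versioned worker filename from the build-time value
// stamped on the <html> element, e.g. "12333" -> "worker.12333.js".
function workerUrlFromBuildTime(buildTime) {
  return 'worker.' + buildTime + '.js';
}

// In the browser (guarded so the sketch also runs outside a DOM
// environment): read the attribute and start the current worker.
if (typeof document !== 'undefined' && typeof Worker !== 'undefined') {
  const buildTime = document.documentElement.getAttribute('build-time');
  new Worker(workerUrlFromBuildTime(buildTime));
}
```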
Is there a way to force the clients of a webpage to reload the cache (i.e. images, javascript, etc.) after an update to the code base has been pushed to the server? We get a lot of help desk calls asking why certain functionality no longer works. A simple hard refresh fixes the problems, as it downloads the newly updated javascript file.
For specifics we are using Glassfish 3.x. and JSF 2.1.x. This would apply to more than just JSF of course.
To describe what behavior I hope is possible:
Website A has two images and two javascript files. A user visits the site and the four files get cached. As far as I'm concerned, there is no need to re-download those files unless the user specifically forces a hard refresh or clears their cache. Once an update to one of the files is pushed to the site, the server could include some sort of metadata in the header informing the client of the update. If the client chooses, the new files would be downloaded.
What I don't want to do is put a meta tag in the header of a page to force nothing from ever being cached... I just want something that tells the client it should get the latest version once something has been updated. I suppose this would just be some sort of versioning on the client side.
Thanks for your time!
The correct way to handle this is with changing the URL convention for your resources. For example, we have it as:
/resources/js/fileName.js
To get the browser to still cache the file, but do it the proper way with versioning, add something to the URL. Adding a value to the querystring can prevent caching in some caches (per RFC 2616 §13.9), so the place to put it is after /resources/.
A reference for querystring caching: http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.9
So for example, your URLs would look like:
/resources/1234/js/fileName.js
So what you could do is use the project's version number (or some value in a properties/config file that you manually change when you want cached files to be reloaded) since this number should change only when the project is modified. So your URL could look like:
/resources/cacheholder${project.version}/js/fileName.js
That should be easy enough.
The problem now is with mapping the URL, since that value in the middle is dynamic. The way we overcame that is with a URL rewriting module that allowed us to filter URLs before they got to our application. The rewrite watched for URLs that looked like:
/resources/cacheholder______/whatever
And removed the cacheholder_______/ part. After the rewrite, it looked like a normal request, and the server would respond with the correct file, without any other specific mapping/logic...the point is that the browser thought it was a new file (even though it really wasn't), so it requested it, and the server figures it out and serves the correct file (even though it's a "weird" URL).
Of course, another option is to add this dynamic string to the filename itself, and then use the rewrite tool to remove it. Either way, the same thing is done - targeting a string of text during rewrite, and removing it. This allows you to fool the browser, but not the server :)
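The rewrite step described above amounts to stripping the dynamic segment before the request reaches the application. A minimal sketch of that mapping (the function name and exact pattern are illustrative; a real deployment would do this in a rewrite module or filter):

```javascript
// Strip the dynamic "cacheholder<version>/" segment so that e.g.
// "/resources/cacheholder1234/js/fileName.js" maps back to the real
// path "/resources/js/fileName.js" on the server.
function stripCacheholder(path) {
  return path.replace(/\/resources\/cacheholder[^/]*\//, '/resources/');
}
```

The browser sees a brand-new URL whenever the version changes, but the server always serves the same underlying file.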
UPDATE:
An alternative that I really like is to set the filename based on the contents, and cache that. For example, that could be done with a hash. Of course, this type of thing isn't something you'd manually do and save to your project (hopefully); it's something your application/framework should handle. For example, in Grails, there's a plugin that "hashes and caches" resources, so that the following occurs:
Every resource is checked
A new file (or mapping to this file) is created, with a name that is the hash of its contents
When adding <script>/<link> tags to your page, the hashed name is used
When the hash-named file is requested, it serves the original resource
The hash-named file is cached "forever"
What's cool about this setup is that you don't have to worry about caching correctly - just set the files to cache forever, and the hashing should take care of files/mappings being available based on content. It also provides the ability for rollbacks/undos to already be cached and loaded quickly.
I use a no-cache parameter for these situations...
I have a constant string value (from a config file) like
$no_cache = "v11";
and in pages, i use assets like
<img src="a.jpg?nc=$no_cache">
and when I update my code, I just change the $no_cache value, and it works like a charm.
I'm using CKEditor, which is a multi-file library: the main JS file calls other JS and CSS files. I'm noticing that after the main file is called, the additional files have a ?t=CODE appended to them, like this, even though the actual files on the server don't have that extra ?t=B49E5BQ at the end:
http://site.com/ckeditor/config.js?t=B49E5BQ
http://site.com/ckeditor/extra.js?t=B49E5BQ
What's the point of this?
P.S. Please feel free to add additional tags, because I'm not sure about this one.
This sort of trailing data is sometimes put into URLs for resource files like scripts/stylesheets so as to prevent caching of resources across re-deployments.
Whenever you change a resource, you change the code in HTML files/templates which require that resource, so that clients re-request the resource from the server the next time they load the page.
I would guess that the URL parameter is added to bypass any caching mechanisms. When a client sees the same URL with a different query parameter, that usually means the client can't use the cached version of the resource (in this case a JS file) and goes to the server to fetch the latest version.
In HTTP, if two URLs are the same in every way except for their query parameters, a client cannot assume that they refer to the same resulting object.
Which means:
http://site.com/ckeditor/config.js?t=B49E5BQ
is not the same as:
http://site.com/ckeditor/config.js?t=1234
It must be there to prevent caching.
I do this occasionally for images and script files. In my case, it's a meaningless argument (usually datetime) that just forces the browser to fetch a new copy every time.
If the parameter keeps changing, those files won't be cached on the client side.
Often this is easier than, say, changing the name of the file to include a version number (jquery-1.6.2.js works nicely, but do you want to rename config.js to config-1.0.js, -2.0, etc. every time you make a change?).
Like all the other answers say, this simply forces the browser to grab the latest version when the querystring (?t=B49E5BQ) is changed. In our case, we simply add the date (?06022011).