Say I have a JavaScript SPA that loads one file, X_version1.js, into the browser, where it is then run. What prevents X_version1.js from accidentally calling a function in X_version0.js that was cached by the browser at an earlier time? Are cached .js scripts never invoked unless they are explicitly referenced, and under what circumstances could that happen?
This doesn't seem far-fetched, because we sometimes use cached versions of jQuery, or whatever, which might not be inside the .js file that was loaded by the most recent server request...
The caching doesn't happen at the method level; it happens at the file level.
So if you have
<script src="X_version0.js"></script>
Then the browser goes "ah I've downloaded that before! I'll just return the cached version." But when you change your source to:
<script src="X_version1.js"></script>
The browser hasn't seen that file before, so it goes and fetches it. At this point nothing in the page says to load version 0, so the browser doesn't.
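To make that concrete, here's a minimal sketch (the function name legacyInit is purely hypothetical). A cached file contributes nothing to the page unless a tag actually loads it:
<!-- Suppose X_version0.js (still in the browser cache, but no longer
     referenced anywhere) once defined: function legacyInit() { ... } -->
<script src="X_version1.js"></script>
<script>
  // X_version0.js was never executed in this page, so nothing it defined
  // exists in the JavaScript environment:
  console.log(typeof legacyInit); // "undefined"
  // legacyInit();                // would throw a ReferenceError
</script>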
Related
I have a web page with multiple embedded JS files, almost all of which are inserted using the normal <script src="/blah.js?v=20200414"> tag. They all have a cache-busting query string, based on the date they were created.
There is one script which is loaded using the following:
<script>
    // after 2 seconds, create a script tag and append it to the body
    window.setTimeout(loadScriptElm, 2000);

    function loadScriptElm()
    {
        var scriptElm = document.createElement("script");
        scriptElm.src = "/special-script.js?v=20200414";
        document.getElementsByTagName("body")[0].appendChild(scriptElm);
    }
</script>
All the scripts are served with a Cache-Control header allowing caching for 900 seconds. The code in all of them runs just fine.
When I initially load the page, all the scripts are retrieved from the server. When I do a refresh (using F5), all the scripts are loaded from browser cache (assuming the cache TTL hasn't expired).
When I do a 'hard refresh' (using Ctrl+F5), all the 'regularly-embedded' scripts are re-fetched from the server (as I would expect), but the special-script.js file is not - it's still retrieved from browser cache. Why doesn't a hard refresh also re-fetch this file?
I can reliably recreate this on Chrome/Brave, but I haven't tried other browsers. It seems like a bug, but maybe it's not... Is this 'expected behavior'? If so, is it documented anywhere?
I have been developing a website for testing new stuff, and I need to figure out the "?v=" thing, but I have no clue how it works. Can someone explain how to use it and how it works?
So what would this look like and how would the file names on the server vary for this:
<script src="assets/js/moticulous.js"></script>
<link rel="stylesheet" href="assets/js/platforms.css"/>
as opposed to this:
<script src="assets/js/moticulous.js?v=1"></script>
<link rel="stylesheet" href="assets/js/platforms.css?v=1"/>
This is added to control caching of js/css/image files. By changing the ?anything=123 value you force the browser/client to download the updated version of the js/css/image file from the server instead of using a stale cached copy.
Read more on: https://css-tricks.com/can-we-prevent-css-caching/
That is a technique used to control caching of script, css and image files.
The browser will download the script file with the ?v=1 parameter (for example, "http://example.com/path/to/script.js?v=1") and cache it to the visitor's disk. The next time the browser visits the page, if the URL is still "http://example.com/path/to/script.js?v=1", the cached version will be loaded.
If you change the ?v=1 to ?v=2, the cached version is no longer valid, because the full URL is no longer the same as what the browser has cached. This results in a new file being downloaded and cached, which forces the recent changes onto every visitor regardless of the cache settings in the server config or the browser.
This technique is often used with a version number (likely why it's a v=) to force a new download of the js when the software version gets updated.
In your backend code, you would replace the =1 part with whatever the current software version is, to make this cache control dynamic. Alternatively, you could increment the version number whenever the asset changes, but that's either less dynamic or more work to automate.
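As a rough sketch of that idea - assuming a Node/Express server with EJS templates, which is purely an assumption here - you can read the version once and pass it into every page render:

// server.js - a sketch, not a definitive implementation
const express = require('express');
const app = express();

// take the version from package.json, assuming it tracks releases
const APP_VERSION = require('./package.json').version;

app.set('view engine', 'ejs');

app.get('/', (req, res) => {
  // views/index.ejs would then contain, for example:
  //   <script src="assets/js/moticulous.js?v=<%= v %>"></script>
  //   <link rel="stylesheet" href="assets/js/platforms.css?v=<%= v %>"/>
  res.render('index', { v: APP_VERSION });
});

app.listen(3000);

Every deploy that bumps the package version automatically changes the query string, so browsers re-fetch the assets without any manual bookkeeping.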
The dummy HTTP GET parameter is passed to prevent caching, since browsers cache .js and .css files. It is usually done to stop browsers from loading an older version of the file from cache after the .css or .js file has changed. Appending a timestamp (as <filename>?<timestamp>) is more popular than appending a version, as it forces the browser to download the files every time the page is viewed: no two requests carry the same timestamp.
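A sketch of the timestamp variant, again assuming a server-side template (EJS syntax here): because Date.now() is evaluated on the server at render time, every page view produces a unique URL and therefore a fresh download:
<script src="assets/js/moticulous.js?<%= Date.now() %>"></script>
The obvious trade-off is that you give up caching entirely, so it is arguably better suited to development than production.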
I am using sw-precache along with sw-toolbox to allow offline browsing of cached pages of an Angular app.
The app is served through a node express server.
One problem we ran into is that index.html sometimes doesn't seem to be updated in the cache, although other assets have been updated on activation of the new service worker.
This leaves users with an outdated index.html that is trying to load a versioned asset that no longer exists, in this case /scripts/a387fbeb.modules.js.
I am not entirely sure what's happening, because the cached index.html seems to have the same hash on different browsers, including ones where it has been correctly updated.
On one browser, the outdated (problematic) index.html (cached with hash 2cdd5371d1201f857054a716570c1564) includes:
<script src="scripts/a387fbeb.modules.js"></script>
in its content. (This file no longer exists in the cache or on the remote.)
On another browser, the updated (good) index.html (cached with the same hash 2cdd5371d1201f857054a716570c1564) includes:
<script src="scripts/cec2b711.modules.js"></script>
These two entries have the same cache hash, yet the content returned to the two browsers is different!
What should I make of this? Does this mean that sw-precache doesn't guarantee atomic cache busting when a new SW activates? How can one protect against this?
In case it helps, this is the generated service-worker.js file from sw-precache.
Note: I realize I can use a networkFirst strategy (at least for index.html) to avoid this. But I'd still like to understand what's going on and figure out a way to use a cacheFirst strategy to get the best performance.
Note 2: I saw in other related questions that one can change the name of the cache to force-bust all the old cache. But this seems to defeat the purpose of sw-precache busting only updated content. Is this the way to go?
Note 3: Even if I hard-reload the browser where the website is broken, the site works, because a hard reload skips the service worker cache - but the cache itself stays wrong. The service worker doesn't seem to re-activate; my guess is that this specific SW has already been activated but failed to bust the cache correctly. Subsequent non-hard-refresh visits still see the broken index.html.
(The answers here are specific to the sw-precache library. The details don't apply to service workers in general, but the concepts about cache maintenance may still apply to a wider audience.)
If the content of index.html is dynamically generated by a server and depends on other resources that are either inlined or referenced via <script> or <link> tags, then you need to specify those dependencies via the dynamicUrlToDependencies option. Here's an example from the app-shell-demo that ships as part of the library:
dynamicUrlToDependencies: {
  '/shell': [
    ...glob.sync(`${BUILD_DIR}/rev/js/**/*.js`),
    ...glob.sync(`${BUILD_DIR}/rev/styles/all*.css`),
    `${SRC_DIR}/views/index.handlebars`
  ]
}
(/shell is used there instead of /index.html, since that's the URL used for accessing the cached App Shell.)
This configuration tells sw-precache that any time any of the local files that match those patterns change, the cache entry for the dynamic page should be updated.
If your index.html isn't being generated dynamically by the server, but instead is updated during build time using something like this approach, then it's important to make sure that the step in your build process that runs sw-precache happens after all the other modifications and replacements have taken place. This means using something like run-sequence to ensure that the service worker generation isn't run in parallel with other tasks.
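For instance, a gulp setup along these lines (the task names are hypothetical) would enforce that ordering:

// gulpfile.js - a sketch assuming gulp 3.x plus the run-sequence package
const gulp = require('gulp');
const runSequence = require('run-sequence');

gulp.task('build', (callback) => {
  // the asset tasks may run in parallel with each other, but
  // 'generate-service-worker' (the task that invokes sw-precache)
  // only starts after every file it will precache has its final content
  runSequence(['scripts', 'styles', 'copy-html'], 'generate-service-worker', callback);
});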
If the above information doesn't help you, feel free to file a bug with more details, including your site's URL.
I have a Java web application, and I'm wondering if the JavaScript files are downloaded with the HTML body, or if the HTML body is loaded first and the browser then requests all the JavaScript files.
The reason for this question is that I want to know whether importing files with jQuery.getScript() would result in poorer performance. I want to import all files using that jQuery function to avoid duplication of JavaScript imports.
The body of the HTML document is retrieved first. After it's been downloaded, the browser checks what resources need to be retrieved and gets those.
You can actually see this happen if you open the Chrome Dev Console, go to the Network tab (make sure caching is disabled and logs are preserved) and refresh a page.
In the resulting waterfall, the first bar is the page itself loading, and the chunk after it is the scripts, a stylesheet, and some image resources.
The HTML document is downloaded first, and only when the browser has finished downloading it can it find out which scripts to fetch.
That said, heavy scripts that don't directly influence the appearance of the HTML body should be loaded at the end of the body rather than in the head, so that they don't block rendering unless necessary.
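For example (the file names here are hypothetical):

<!DOCTYPE html>
<html>
<head>
  <!-- stylesheets stay in the head; they are needed for the first paint -->
  <link rel="stylesheet" href="styles.css"/>
</head>
<body>
  <p>This content can render before the script below is fetched and run.</p>
  <!-- heavy, non-render-critical script loaded at the end of the body -->
  <script src="heavy-widget.js"></script>
</body>
</html>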
I'm wondering if the JavaScript is downloaded with the HTML body during a request
If it's part of that body then yes. If it's in a separate resource then no.
For example, suppose your HTML file has this:
<script type="text/javascript">
    $(function () {
        // some code here
    });
</script>
That code, being part of the HTML file, is included in the HTML resource. The web server doesn't differentiate between the kinds of code in the file; it just serves the response regardless of what's there.
On the other hand, if you have this:
<script type="text/javascript" src="someFile.js"></script>
In that case the code isn't in the same file. The HTML just references a separate resource (someFile.js) which contains the code, so the browser makes a separate request for that resource, resulting in two requests in total.
The HTML document is downloaded first, or at least it starts to download first. While it is parsed, any script includes that the browser finds are downloaded. That means that some scripts may finish loading before the document is completely loaded.
While the document is being downloaded, the browser parses it and displays as much as it can. When the parsing reaches a script include, it stops, and the browser suspends it until the script has been loaded and executed; then the parsing continues. That means that scripts run in the order they appear in the document, and that content after a script include isn't displayed until that script has run.
If you put a call to getScript instead of a script include, the behaviour will change. The method makes an asynchronous request, so the browser will continue parsing the rest of the page while the script loads.
This has some important effects:
The parsing of the page will be completed earlier.
Scripts will no longer run in a specific order; they run in the order in which their loading completes.
If one script depends on another, you have to verify yourself that the first script has actually loaded before using it in the other script.
You can use a combination of script includes and getScript calls to get the best of both: regular script includes for scripts that other scripts depend on, and getScript for scripts that aren't sensitive to those effects, as sketched below.
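A minimal sketch of that check (plugin.js and initPlugin are hypothetical names), using the jqXHR promise that $.getScript returns:

// jQuery itself is loaded via a regular blocking script include,
// so it is guaranteed to exist here; the plugin is loaded lazily.
$.getScript('plugin.js')
  .done(function () {
    // only touch what plugin.js defines once loading has completed
    initPlugin();
  })
  .fail(function (jqxhr, settings, exception) {
    console.error('Could not load plugin.js:', exception);
  });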
Given that my HTML page requests 3 script files:
<script src="a.js"></script>
<script src="b.js"></script>
<script src="c.js"></script>
<link rel="stylesheet" type="text/css" href="d.css"/>
Until now my understanding was that the browser requests a.js and the document is not parsed any further until a.js has been downloaded and executed (the default behaviour without the async attribute on the script tag).
But now, looking at the Chrome developer tools, Timeline tab, with the Loading box checked, I got something along these lines:
Send Request (this is the request for the HTML page itself)
Receive data
Receive data
Receive data (these are packets arriving, if I understand correctly)
Finish Loading (the page is now loaded)
Parse HTML (now the browser starts reading the actual HTML)
Send Request (a.js)
Now, here I would expect the browser to halt everything and wait until a.js is fully downloaded and executed, and only then continue parsing the HTML and reading b.js and the rest.
However, I get this output next:
Send Request b.js (why is this happening? a.js hasn't finished loading yet)
Send Request c.js (again, why?)
Send Request d.css
Only then at some point I get:
Finish Loading a.js
The browser executes a.js, and only then are the other files (if already downloaded) executed, in the order of the script tags in the HTML.
Now my question is: how does the browser know to request b.js, c.js and the CSS file if a.js hasn't been fully downloaded and executed? Is Chrome applying some magic to optimise and speed up the process? What if a.js, when run, modifies or removes the rest of the script tags? Is the request wasted then? Or am I simply reading the output wrong?
A: Because each of the JS files is specified in the HTML. The browser tries to batch up as many downloads as it can to save time. From the moment you right-click on something and select "save link as", it's being downloaded - even though you haven't specified a file name yet. If you then hit cancel instead of specifying a name/location, you have wasted whatever was already downloaded.
Chrome applies a similar policy with regards to resources in a page. If a.js removes any/all subsequent script tags, then yes - their download has been in vain.
The alternative is to have choppy performance.
If there are multiple things to download, browsers will normally download several of them in parallel. The exact number of parallel downloads can vary by browser.
Note: The CSS should be before the JS to get maximum benefit from parallel downloads.
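Applied to the example above, that just means moving the stylesheet link to the top:
<link rel="stylesheet" type="text/css" href="d.css"/>
<script src="a.js"></script>
<script src="b.js"></script>
<script src="c.js"></script>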