Can web workers be restarted by the browser at any time? - javascript

I've read in many places (including here, on Stack Overflow) that web workers can be killed and restarted by the browser at any time. Well, "at any time" probably means "as long as they are doing something", but the point is that they can be killed and restarted by the browser without prior warning, losing any data stored in globalThis.
But for the life of me I cannot find that in the specification, and it worries me because I'm actually using a Web Worker whose proper functioning relies on keeping some data in a global variable to maintain minimal state between calls to the message-handling function.
This web worker works like a charm, and state is kept unless, of course, the page is refreshed. But I'm afraid the app could fail if the browser decides to restart the web worker, and it certainly will fail if that happens.
I've googled this, especially looking for examples and alternatives for rewriting my web worker without the need for that global state, but I haven't found anything relevant.
Can anyone point me to some official information about this?

I've read in many places (including here, on Stack Overflow) that web workers can be killed and restarted by the browser at any time.
Nope, they won't ever restart your (dedicated) Worker in any way.
The browser can kill a Worker if
The main navigable (a.k.a "page") is killed.
The Worker#terminate() method has been called.
The DedicatedWorkerGlobalScope#close() method has been called, and the current task is completed.
The Worker becomes "orphan". This may happen when it stops being a "protected worker", i.e.
when the Worker is a sub-Worker (itself created from a Worker) and its owner has been killed,
or when it has no scheduled tasks, no ongoing network request or database transaction, and no MessagePort or inner Worker objects capable of receiving messages from the outside.
So as long as you keep a reference to your Worker object in your main thread, and don't call one of the closing methods yourself, there is no way the Worker can be killed by the browser.
Note: The rules for SharedWorkers, ServiceWorkers, and Worklets are all different; this answer covers only dedicated Workers, created with new Worker().
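For illustration, here is a minimal sketch (file and variable names are made up) of a dedicated Worker keeping state in its global scope. As long as the page holds its reference to the Worker and neither terminate() nor close() is called, the counter survives between messages:

    // main.js — keep a reference to the Worker for the lifetime of the page
    const counterWorker = new Worker('counter-worker.js');
    counterWorker.onmessage = (e) => console.log('count is now', e.data);
    counterWorker.postMessage('increment'); // logs "count is now 1"
    counterWorker.postMessage('increment'); // logs "count is now 2"

    // counter-worker.js — state lives in the worker's global scope
    let count = 0; // persists between onmessage calls
    self.onmessage = () => {
      count += 1;
      self.postMessage(count);
    };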

Related

Working of Web Worker

I was reading about web workers, and I understood that they run on a separate thread. One doubt I have is whether the web worker spawns a new thread for every request sent to it. For example, if I have 2 js files that share a web worker between them, and I postMessage from both files to the web worker, will two threads be created or a single one?
No, each Worker is a single thread, and they still use the same event loop mechanism as the main execution context; meaning, for example, if your Worker runs into an infinite loop, it will lock up completely and not react to any further messages.
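A quick way to see that single thread in action (hypothetical file names): messages sent after the worker starts spinning are queued but never processed.

    // busy-worker.js
    self.onmessage = (e) => {
      if (e.data === 'spin') {
        while (true) {} // the worker's only thread is now stuck here
      }
      self.postMessage('got ' + e.data);
    };

    // main.js
    const w = new Worker('busy-worker.js');
    w.postMessage('hello');       // answered
    w.postMessage('spin');        // locks the worker up
    w.postMessage('hello again'); // queued, never processed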

What happens to JS ServiceWorker when I close the tab

When you close all tabs using a web worker, the worker is shut down.
Does the same thing happen to a service worker?
There are two relevant aspects to this:
Service worker registration, which is the record held in the browser to say "for this URL, these events should be handled by this script," and
Service worker activation, which is when your worker code is loaded in memory and either handling a request or waiting for one
A service worker remains registered across browser sessions (spec link). So if you exit the browser entirely (or even reboot the computer), the registration persists; the worker will get activated if you go to the relevant scope URL and an event it's registered to handle occurs.
A service worker is activated when it's needed to handle an event, and may be terminated the moment it has no events to handle (spec link). It's up to the host environment (the browser) when and whether it does that. The browser might keep the worker activated if there's no memory pressure or similar (even if all tabs using the worker's scope URL are closed), though that seems unlikely; it might keep it in memory while there's still at least one tab using its scope URL even if the worker is currently idle; or it might terminate the worker the instant it doesn't have any requests to handle. It's up to the host environment to balance the cost of keeping the service worker in memory against the cost of starting it up when it's needed again.
It's a useful mental model to assume that the host will be really aggressive and terminate the worker the instant it isn't actively needed, which is why a previous version of this answer called them "extremely short lived." But whether that's literally true in any given host environment is up to the environment and its optimizations.
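As a rough sketch of the two aspects (the scope and file name are just examples): registration is a one-time, persistent record, while the running worker is started and stopped around it.

    // Registering once records the scope → script mapping in the browser;
    // the record survives closing the tab, closing the browser, or a reboot.
    if ('serviceWorker' in navigator) {
      navigator.serviceWorker.register('/sw.js', { scope: '/' })
        .then((reg) => console.log('registered for scope', reg.scope))
        .catch((err) => console.error('registration failed', err));
    }

    // sw.js — the browser starts (activates) this script when an event arrives
    // for its scope, and may terminate it again once nothing is left to handle.
    self.addEventListener('fetch', (event) => {
      // handle or pass through the request
    });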

Long-running process inside a Service Worker (or something similar)

I have a client-side JS app that uses IndexedDB to store its state. Works fine. However, it's kind of slow because I am frequently reading from and writing to IndexedDB so that the state does not become inconsistent when multiple tabs are open.
My idea was... put all DB access stuff inside a Service Worker, and then I can cache values there in memory without worrying that another tab might have altered the database.
That seems to work fine, except some parts of my application take a long time to run. I can communicate the status (like "X% done") from the Service Worker to my UI. But both Firefox and Chrome seem to kill the worker if it runs for more than 30 seconds, which is way too short for me.
Is there any way around this limitation? If not, any ideas for achieving something similar? A Shared Worker could do it I think, except browser support is poor and I don't anticipate that improving now with the momentum behind Service Workers.
The Google documentation on service workers tells us that using service workers as a memory cache is impossible:
It's terminated when not in use, and restarted when it's next needed, so you cannot rely on global state within a service worker's onfetch and onmessage handlers. If there is information that you need to persist and reuse across restarts, service workers do have access to the IndexedDB API.
My suggestion is to keep using service workers to persist data to the database, and use localStorage to create a shared cache between pages. The tab that is making the change is then responsible for both updating the cache in localStorage and persisting to IndexedDB through the service worker.
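A rough sketch of that split (the key names and message shape are made up): the tab that changes something updates the localStorage cache so other tabs see it immediately, and asks the service worker to persist it to IndexedDB.

    // In the tab making the change (assumes the page is controlled by a service worker)
    function updateState(key, value) {
      // shared cache, visible to every tab on the origin
      localStorage.setItem(key, JSON.stringify(value));
      // ask the service worker to persist it durably
      navigator.serviceWorker.controller?.postMessage({ type: 'persist', key, value });
    }

    // Other tabs can react to the change without touching IndexedDB
    window.addEventListener('storage', (e) => {
      console.log('cache updated in another tab:', e.key, e.newValue);
    });

    // sw.js — handle 'persist' messages by writing to IndexedDB (IndexedDB code omitted)
    self.addEventListener('message', (event) => {
      if (event.data && event.data.type === 'persist') {
        // open IndexedDB and store event.data.value under event.data.key
      }
    });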
I ended up using a Shared Worker. In browsers that don't support Shared Workers, such as Edge and Safari, I fall back to a Web Worker and some hacky code to only let you open the app in one tab at a time.
Since 90% of my users are on Firefox or Chrome anyway, I figure it's not a huge loss.
I also wrote a library to (amongst other things) normalize the APIs used in Shared Workers and Web Workers, so I have the exact same code running in both scenarios; the only difference is whether the worker is initialized with Worker or SharedWorker.
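The page-side normalization amounts to roughly this (a simplified sketch, not that library's actual code): a SharedWorker exposes its MessagePort on .port, while a dedicated Worker acts as its own port, so both can be hidden behind one interface. The worker script itself also differs slightly (a SharedWorker receives connections via onconnect), which a wrapper has to smooth over as well.

    // Hypothetical helper; falls back to a dedicated Worker where SharedWorker is missing
    function connectToWorker(url) {
      if (typeof SharedWorker !== 'undefined') {
        const shared = new SharedWorker(url);
        shared.port.start(); // needed if you use addEventListener instead of onmessage
        return shared.port;  // has postMessage/onmessage, like a Worker
      }
      return new Worker(url); // a dedicated Worker already exposes postMessage/onmessage
    }

    // Usage is identical in both cases
    const conn = connectToWorker('app-worker.js');
    conn.onmessage = (e) => console.log('from worker:', e.data);
    conn.postMessage({ type: 'init' });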
As you said yourself, SharedWorkers seem to be exactly what you need.
I'm not sure why you think the momentum behind implementing ServiceWorkers prevents browsers from supporting SharedWorkers. They seem to be two different kinds of beast.
As far as I understand, ServiceWorkers should be used as a proxy for your requests when your application is offline, and heavy stuff should be done in WebWorkers and SharedWorkers:
https://developer.mozilla.org/en-US/docs/Web/API/Service_Worker_API
https://developer.mozilla.org/en-US/docs/Web/API/SharedWorker

Service workers and page performance

I'm stuck at a wedding reception that I really don't want to be at and I'm driving, so obviously I'm reading about service workers. I'm on my phone so can't play about with anything, but I was wondering whether they're a viable option for improving page performance?
Images are the biggest killer on my site, and I'm half thinking we could use a service worker to cache them to help get page load times down. From what I can tell, the browser still makes the HTTP request; it's just that the response comes from the SW cache rather than the original location. Am I missing something here? Is there therefore any actual benefit to doing this?
While the regular http cache has a lot of overlap with ServiceWorker cache, one thing that the former can't handle very well is the dynamically generated html used in many client-side javascript applications.
Even when all the resources of the app are cache hits, there is still the delay as the javascript is compiled and executed before the app is usable.
Addy Osmani has demonstrated how ServiceWorker can be used to cache the Shell of an app. When the DOM is modified on the client, it is updated in the cache. The next time that URL is requested, the ServiceWorker replies with html that is ready for use before the app has booted.
The other advantage concerns lie-fi: when the network seems to be available, but not enough packets are getting through. ServiceWorkers can afford to have a near-imperceptible timeout, because they can serve immediately from cache and wait for the network response to arrive (if it ever does).
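A simplified fetch handler along those lines (the cache name is arbitrary): answer instantly from the cache when possible, and fall back to the network otherwise, stashing a copy for next time.

    // sw.js — cache-first fetch handler (simplified sketch)
    const CACHE_NAME = 'app-shell-v1';

    self.addEventListener('fetch', (event) => {
      if (event.request.method !== 'GET') return; // only cache GETs
      event.respondWith(
        caches.match(event.request).then((cached) => {
          if (cached) {
            return cached; // served immediately, no waiting on lie-fi
          }
          return fetch(event.request).then((response) => {
            if (response.ok) {
              const copy = response.clone(); // stash a successful response for next time
              caches.open(CACHE_NAME).then((cache) => cache.put(event.request, copy));
            }
            return response;
          });
        })
      );
    });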
Your premise is not quite right.
A service worker is designed to work like a proxy server, handling off-page concerns such as offline capability, push notifications, background synchronization, etc. So in your case you will gain no performance benefit by caching images with a service worker over the traditional browser cache.

web socket connection closed when behind proxy

I have a WebSocket-based chat application (HTML5).
The browser opens a socket connection to a Java-based WebSocket server over wss.
When the browser connects to the server directly (without any proxy), everything works well.
But when the browser is behind an enterprise proxy, the socket connection closes automatically after approximately 2 minutes of inactivity.
The browser console shows "Socket closed".
In my test environment I have a Squid-Dansguardian proxy server.
Important: this behaviour is not observed if the browser is connected without any proxy.
To keep some activity going, I embedded a simple jQuery script that makes an HTTP GET request to another server every 60 seconds. But it did not help. I still get "socket closed" in my browser console after about 2 minutes of inactivity.
Any help or pointers are welcome.
Thanks
This seems to me to be a feature, not a bug.
In production applications there is an issue related with what is known as "half-open" sockets - see this great blog post about it.
It happens that connections are lost abruptly, causing the TCP/IP connection to drop without informing the other party to the connection. This can happen for many different reasons - wifi signals or cellular signals are lost, routers crash, modems disconnect, batteries die, power outages...
The only way to detect if the socket is actually open is to try and send data... BUT, your proxy might not be able to safely send data without interfering with your application's logic*.
After two minutes, your proxy assumes that the connection was lost and closes the socket on its end to save resources and allow new connections to be established.
If your proxy didn't take this precaution, on a long enough timeline all your available resources would be taken by dropped connections that would never close, preventing access to your application.
Two minutes is a lot. On Heroku the proxy timeout is set to 50 seconds (more reasonable). For HTTP connections, these timeouts are often much shorter.
The best option for you is to keep sending websocket data within the 2 minute timeframe.
The Websocket protocol resolves this issue by implementing an internal ping mechanism - use it. These pings should be sent by the server and the browser responds to them with a pong directly (without involving the javascript application).
The Javascript API (at least on the browser) doesn't let you send ping frames (it's a security thing I guess, that prevents people from using browsers for DoS attacks).
A common practice by some developers (which I think is misguided) is to implement a JSON ping message that is either ignored by the server or results in a JSON pong.
Since you are using Java on the server, you have access to the Ping mechanism and I suggest you implement it.
I would also recommend (if you have control of the Proxy) that you lower the timeout to a more reasonable 50 seconds limit.
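If you control only the browser side, the fallback is application-level keepalive traffic over the WebSocket itself (the interval and message shape below are arbitrary). Note this is different from the HTTP GET you tried, which keeps a separate connection busy and does nothing for the idle WebSocket going through the proxy.

    // Client-side keepalive: send a tiny message over the same socket,
    // well inside the proxy's ~2-minute idle window.
    const socket = new WebSocket('wss://chat.example.com/ws'); // hypothetical URL

    let keepalive;
    socket.addEventListener('open', () => {
      keepalive = setInterval(() => {
        if (socket.readyState === WebSocket.OPEN) {
          socket.send(JSON.stringify({ type: 'keepalive' })); // server can simply ignore it
        }
      }, 50 * 1000);
    });

    socket.addEventListener('close', () => clearInterval(keepalive));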
* The situation during production is actually even worse...
Because there is a long chain of intermediaries (home router/modem, NAT, ISP, Gateways, Routers, Load Balancers, Proxies...) it's very likely that your application can send data successfully because it's still "connected" to one of the intermediaries.
This should start a chain reaction that will only reach the application after a while, and again ONLY if it attempts to send data.
This is why Ping frames expect Pong frames to be returned (meaning the chain of connections is intact).
P.S.
You should probably also complain about the Java application not closing the connection after a certain timeout. During production, this oversight might force you to restart your server every so often or experience a DoS situation (all available file handles will be used for the inactive old connections and you won't have room for new connections).
Check squid.conf for the request_timeout value; raising it should help. This will affect more than just web sockets. For instance, in an environment I frequently work in, a perl script is hit to generate various configurations, and execution can take upwards of 5-10 minutes to complete. The timeout value on both our httpd and the squid server had to be raised to compensate for this.
Also look at the connect_timeout value, which defaults to one minute.
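For example (the values are purely illustrative, not recommendations), in squid.conf:

    # squid.conf — example timeout overrides
    request_timeout 10 minutes
    connect_timeout 2 minutes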
