Reasons to move the WebSockets into a Worker Thread [closed] - javascript

I am running some tests with WebSockets.
For the tests I used the Alchemy-Websockets .NET-based server.
The web application opens several windows and is used to monitor different services and systems.
I am especially interested in high-load situations where the server has to send a lot
of events to the client to reflect real-time updates. I want the GUI to be fully responsive and present the data in a grid and a chart in a real-time user experience.
I created the WebSocket in the main window thread, and on every incoming message I add an entry to the array that the grid (SlickGrid) uses for display. To keep the GUI responsive I added a setInterval of 20ms to render the grid updates. Everything is working fine, and very fast.
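For reference, a minimal sketch of that setup (the server URL is illustrative, and grid is assumed to be a SlickGrid instance bound to data):

var data = [];            // the array SlickGrid displays
var dirty = false;
var socket = new WebSocket('ws://localhost:8100'); // illustrative URL

socket.onmessage = function (event) {
    // Only buffer here; never touch the DOM in the message handler.
    data.push(JSON.parse(event.data));
    dirty = true;
};

// Render at most once every 20ms, regardless of the message rate.
setInterval(function () {
    if (dirty) {
        dirty = false;
        grid.updateRowCount();
        grid.render();
    }
}, 20);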
The question is whether moving the WebSocket to a worker thread is desirable or recommended. Reading about worker threads, I saw a recommendation in the use cases to handle I/O in a worker.
I think this makes sense only if the I/O is blocking.
As far as I know, a WebSocket is asynchronous and does not block. I read somewhere that
it is implemented internally by the browser in its own thread, which makes sense.
I am considering moving the WebSocket into a worker, allowing the worker to buffer or aggregate some data before passing it to the main window. In case of a high event rate I see the following approaches:
The main window thread polls the worker periodically (every 20ms or so) and gets the required data.
The worker sends larger chunks of data periodically (a sketch of this approach follows the list).
Every time the WebSocket receives data, forward it to the main thread - but I think this reintroduces the same inherent problems. (This is where I began testing: I created an infinite loop in a worker thread and sent a message to the main thread on every iteration; the GUI froze, which makes sense.)
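A minimal sketch of the second approach (file names, URL, and the 20ms interval are all illustrative):

// worker.js - buffer socket messages and forward them in chunks
var buffer = [];
var socket = new WebSocket('ws://localhost:8100'); // illustrative URL

socket.onmessage = function (event) {
    buffer.push(event.data);
};

setInterval(function () {
    if (buffer.length > 0) {
        postMessage(buffer); // one structured clone per batch, not per message
        buffer = [];
    }
}, 20);

// main.js
var worker = new Worker('worker.js');
worker.onmessage = function (event) {
    // event.data is an array of raw messages; parse and hand them to the grid
};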
Leaving the WebSocket on the main thread is also not ideal: under high load from the server, the GUI will not prioritize the incoming WebSocket message events.
If I gather data in the worker thread, it seems I might miss real-time updates during high load, since the worker is buffering.
Another issue with worker threads seems to be data duplication, since messages are copied between threads. This can be mitigated by the newer transferable objects, though I am not sure how well they are supported across browsers yet.
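For reference, a transfer (rather than a structured clone) looks like this; the payload here is made up:

// Inside the worker: move an ArrayBuffer to the main thread without copying
var samples = new Float64Array(1024);          // hypothetical binary payload
// ... fill `samples` from the socket ...
postMessage(samples.buffer, [samples.buffer]); // second argument = transfer list
// After this call the buffer is neutered (zero-length) inside the worker.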
Why not host the WebSocket on the main window?
So what is the best practice?

There are only two reasons to move a WebSocket into a Worker:
JSON.parse is a blocking operation and can cause FPS loss with large payloads. Cloning the parsed data from the worker adds roughly 1% overhead, but 99% of the parsing is done in a background thread.
Using a SharedWorker to maintain only one connection to the server.
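A minimal sketch of the first point (file name, URL, and render() are illustrative):

// parser-worker.js: the expensive JSON.parse happens off the main thread
var socket = new WebSocket('ws://example.com/feed');
socket.onmessage = function (event) {
    postMessage(JSON.parse(event.data));
};

// Main thread: receives already-parsed objects, pays only the cloning cost
var worker = new Worker('parser-worker.js');
worker.onmessage = function (event) {
    render(event.data); // render() is a placeholder for your UI update
};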

DOM manipulation, and probably drawing on canvas, affect WebSocket messages too. I measure around 0.2ms average latency on localhost, but using the same thread I see 4-9ms spikes at regular intervals. If I don't update the DOM that frequently, or if I move the WebSocket to a worker, they disappear. Chrome is affected the most; Firefox is currently better in this respect. For online gaming this can mean lag or stuttering. For me it just affects TCP latency testing somewhat: I need to increase the interval between messages from 1ms to 100ms to get a clean chart.

Related

How can I cancel a node thread prematurely from my frontend? [closed]

I have a React web app which generates solutions for Rubik's Cubes. When the user makes a query on my site, it starts a long computation process (anywhere from 1 to 240 seconds). Every time a solution is found, the state changes and the user can see the new solution.
However, this app often crashes on mobile for large queries; I believe the page demands too much memory and the browser kills it. Because of this, I want to add a node.js backend to handle the computation.
I would like for the following functionality:
When the user makes a query request, it sends that to the backend which can start computing. Every so often, the frontend can update to show the current tally of solutions.
If the user prematurely wants to cancel the process, they can do so, also killing the backend thread.
How can I set this up? I know I can very easily make HTTP requests to my backend and receive a response when it is done. However, I'm not sure how to accomplish the dynamic updating as well as how to cancel a running thread. I have heard of long polling but I'm not sure if this is the right tool for the job, or if there is a better method.
I would also like this app to support multiple people trying to use it at the same time, and I'm not sure if I need to consider that in the equation as well.
Any help is appreciated!
However, I'm not sure how to accomplish the dynamic updating. I have heard of long polling but I'm not sure if this is the right tool for the job, or if there is a better method.
Three main options:
A webSocket or socket.io connection from client to server; the server can then push updates over it.
Server-sent events (SSE) - another way for the server to push updates to the client; a minimal sketch follows this list.
Client polls the http server on some time interval to get a regular progress report.
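As an illustration of the SSE option (Express on the server, the route, and the function names are assumptions, not part of the question):

// server.js (node.js + Express, as a sketch)
app.get('/progress/:jobId', (req, res) => {
    res.setHeader('Content-Type', 'text/event-stream');
    res.setHeader('Cache-Control', 'no-cache');
    // push a new event each time the solver reports progress
    const send = (count) => res.write('data: ' + JSON.stringify({ count }) + '\n\n');
    // ... call send(n) from your solver's progress callback ...
});

// client (browser)
const source = new EventSource('/progress/' + jobId);
source.onmessage = (event) => {
    const { count } = JSON.parse(event.data);
    setSolutionCount(count); // placeholder for your React state update
};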
as well as how to cancel a running thread
If by "thread" here, you mean a WorkerThread in nodejs, then there are a couple of options:
From your main nodejs process, you can send the thread a message tell it to exit. You would have to program whatever processing you're doing in the thread to be able to respond to incoming messagessto that it will receive that message from the parent and be able to act on it. A solution like this allows for an orderly shut-down by the thread (it can release any resources it may have opened).
You can call worker.terminate() from the parent to proactively just kill the thread.
Either of these options can be triggered by the client sending a particular http request to your server that identifies some sort of ID so the server's main thread can tell which workerThread it should stop.
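A minimal sketch of both options with node's worker_threads module (file names are illustrative):

// main.js
const { Worker } = require('worker_threads');
const worker = new Worker('./solver.js');

worker.postMessage({ type: 'stop' }); // option 1: ask for an orderly exit
// worker.terminate();                // option 2: kill it outright

// solver.js
const { parentPort } = require('worker_threads');
parentPort.on('message', (msg) => {
    if (msg.type === 'stop') {
        // release resources here, then exit cleanly
        process.exit(0);
    }
});
// Note: the 'message' handler only runs when the worker yields to its
// event loop, so long computations must be chunked for option 1 to work.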
I would also like this app to support multiple people trying to use it at the same time, and I'm not sure if I need to consider that in the equation as well.
This means you'll have to program your nodejs server such that each one of these client/thread combinations has some sort of ID such that you can associate one with the other and can have more than one pair in operation at once.
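One way to do that bookkeeping (a sketch; the routes, Express, and the use of crypto.randomUUID() are assumptions):

const crypto = require('crypto');
const { Worker } = require('worker_threads');
const jobs = new Map(); // jobId -> Worker

app.post('/solve', (req, res) => {
    const jobId = crypto.randomUUID();
    const worker = new Worker('./solver.js', { workerData: req.body });
    jobs.set(jobId, worker);
    worker.on('exit', () => jobs.delete(jobId)); // clean up finished jobs
    res.json({ jobId }); // the client uses this id to poll or cancel
});

app.delete('/solve/:jobId', (req, res) => {
    const worker = jobs.get(req.params.jobId);
    if (worker) worker.terminate();
    res.sendStatus(204);
});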

Duplicate websocket subscription in Azure webapp

I have a node.js app running in Azure as a webApp. On startup it connects to an external service using a websocket subscription. Specifically, I'm using the reconnecting-websockets NPM package to wrap the connection and handle disconnects.
The problem I am having is that because there are 2 instances of the app running on Azure (horizontal scaling for failover) I end up with two subscriptions at any one time.
Is there an obvious way to solve this problem?
For extra context, this is a problem for 2 reasons:
I pay for each message received and am over quota
When messages are received I process them and do database updates; these are also being duplicated.
You basically want to have an AppService with potentially multiple instances, but you don't want your application logic to run in parallel - at least you don't want to have two subscriptions. Ideally you don't want to touch your application code.
An easy way to implement this would be to wrap your application into a continuous WebJob, and set its scale property to singleton.
Here is one tutorial on how to set up a nodejs webjob: https://morshemesh.medium.com/continuous-deployment-of-web-jobs-with-node-js-2308f95e63b1
You can then use a settings.job file to control that your webjob only runs on a single instance at any one time. Or you can use the Azure Portal to set the value when you manually deploy the Webjob.
{
"is_singleton": true
}
https://github.com/projectkudu/kudu/wiki/WebJobs-API#set-a-continuous-job-as-singleton
PS: Don't forget to enable Always On. It is also mentioned in the docs. But you probably already need that for your current deployment.
If you don't want your subscription to be duplicated then it stands to reason that you only want one process subscribing to the external websocket connection.
Since you mentioned that received messages trigger database updates, it makes sense for this to be an isolated backend process, given that you have multiple instances running for the frontend server (whether or not there is a separate backend).
Of course, if you want more redundancy, you could use a load balancer distributing messages to any number of instances behind it, perhaps with a persistent queueing system if you feel it's needed.
If you want these messages to be propagated to the client (not clear from the question), this will be a bit more involved. If it's a simple one-way channel, you could consider using SSE, which is a rather simple protocol. If it's bidirectional, I would probably consider running a STOMP server with an intermediary broker (like RabbitMQ) and connecting directly from the client (i.e. the browser, not the server generating the frontend) to the service.
Not sure if you're well versed with Java, but I made some app that you could use for reference in case interested when we had to prepare some internal demos: https://github.com/kimgysen/zwoop-backend/tree/develop/stomp-api/src/main/java/be/zwoop
For all intents and purposes, I'm not sure if all this is worth the hassle for you; it sounds like you're on a tight budget and looking for simple solutions without too much complexity. Have you considered giving up on load balancing the website (is the load really that high?)? I don't have enough background knowledge on your project to judge, but proper caching optimization and initially scaling vertically may be sufficient at the start.
Personally I would start simple and gradually increase complexity when needed.
I'm just throwing ideas at you, hopefully it is helpful in any way to have a few considerations.
Btw, I don't understand why other answers on this question were all deleted (?).

When do you want more/less http requests? [closed]

It seems like, to have your page load fast, you would want a series of small HTTP requests.
If it was one big one, the user might have to wait much longer to see that the page was there at all.
However, I've heard that minimizing your HTTP requests is more efficient. For example, this is why sprites are created for multiple images.
Is there a general guideline for when you want more and when you want fewer?
Multiple requests create overhead from both the connection and the headers.
It's like downloading the contents of an FTP site: one site has a single 1GB blob, another has 1,000,000 files totalling a few MB. On a good connection, the 1GB file could be downloaded in a few minutes, but the other is sure to take all day because the transfer negotiation ironically takes more time than the transfers themselves.
HTTP is a bit more efficient than FTP, but the principle is the same.
What is important is the initial page load, which needs to be small enough to show some content to the user; additional assets can then be loaded outside of the user's view. A page with a thousand tiny images will always benefit from a sprite, because the negotiations would strain not only the connection but potentially the client computer as well.
EDIT 2 (25-08-2017)
Another update here: some time has passed and HTTP2 is (becoming) a real thing. I suggest reading this page for more information about it.
Taken from the second link (at the time of this edit):
It is expected that HTTP/2.0 will:
Substantially and measurably improve end-user perceived latency in most cases, over HTTP/1.1 using TCP.
Address the "head of line blocking" problem in HTTP.
Not require multiple connections to a server to enable parallelism, thus improving its use of TCP, especially regarding congestion control.
Retain the semantics of HTTP/1.1, leveraging existing documentation (see above), including (but not limited to) HTTP methods, status codes, URIs, and where appropriate, header fields.
Clearly define how HTTP/2.0 interacts with HTTP/1.x, especially in intermediaries (both 2->1 and 1->2).
Clearly identify any new extensibility points and policy for their appropriate use.
The goal about not requiring multiple connections (the one originally emphasized) explains how HTTP2 handles requests differently from HTTP1. Whereas HTTP1 will create ~8 (differs per browser) simultaneous (or "parallel") connections to fetch as many resources as possible, HTTP2 re-uses the same connection. This reduces the overall time and network latency spent creating new connections, which in turn speeds up asset delivery. Additionally, your webserver will have an easier time keeping ~8 times fewer connections open. Imagine the gains there :)
HTTP2 is also already quite widely supported in major browsers, caniuse has a table for it :)
EDIT (30-11-2015)
I've recently found this article on the topic of page speed. The post is very thorough and it's an interesting read at worst, so I'd definitely give it a shot.
Original
There are too many answers to this question, but here's my 2 cents.
If you want to build a website you'll need a few basic things in your tool belt, like HTML, CSS, JS - maybe even PHP / Rails / Django (or one of the 10000+ other web frameworks) and MySQL.
The front-end part is basically all that gets sent to the client on every request. The server-side language calculates what needs to be sent, which is how you build your website.
Now when it comes to managing assets (images, CSS, JS) you're diving into HTTP land, since you'll want to make as few requests as possible. The reason for this is that each request carries a penalty (DNS lookup, connection setup).
This penalty however does not dictate your entire website of course. It's all about the balance between the number of requests and readability / maintainability for the programmers building the website.
Some frameworks like rails allow you to combine all your JS and CSS files into a big meta-like JS and CSS file before you deploy your application on your server. This ensures that (unless done otherwise) for instance ALL the JS and ALL the CSS used in the website get sent in one request per file.
Imagine having a popup script and something that fetches articles through AJAX. These will be two different scripts, and when deploying without combining them, each page load that includes the popup and the article script would seem to cost two requests, one for each file.
The reason this is not quite true is that browsers cache whatever they can, whenever they can, because in the end browsers and the people who build websites want the same thing: the best experience for our users!
This means that during the very first request your website ever answers for a client, the browser will cache as much as possible to make consecutive page loads faster.
This is kind of like the browser way of helping websites become faster.
Now when the brilliant browserologists think of something it's more or less our job to make sure it works for the browser. Usually these sorts of things with caching etc are trivial and not hard to implement (thank god for that).
Having a lot of HTTP requests in a page load isn't an end-of-the-world thing, since it'll only slow the first request; but overall, having fewer requests makes this "DNS penalty" appear less often and will give your users more of an instant page load.
There are also other techniques besides file-merging that you can use to your advantage: when including a JavaScript file you can mark it as async or defer, as in the snippet below.
With async, the script is downloaded in the background as soon as possible, regardless of its order of inclusion within the HTML; the HTML parser is paused only to execute the script once it has downloaded.
With defer it's a bit different. It's kind of like async, but the files are executed in the correct order and only after the HTML parser is done.
Something you wouldn't want to be async would be jQuery, for instance: it's the key library for a lot of websites and you'll want to use it in other scripts, so using async and not being sure when it's downloaded and executed is not a good plan.
Something you would want to be async is a Google Analytics script, for instance: it's effectively optional for the end-user and thus should be labelled as not important - no matter how much you care about the stats, your website isn't built for you but by you :)
To get back to requests and blend all this talk about async and defer together: you can have multiple JS files on your page and not have the HTML parser pause to execute them - instead you can mark such a script as defer, and the user's HTML and CSS will load while the JS waits nicely for the HTML parser.
This is not an example of reducing HTTP requests but it is an example of an alternative solution should you have this "one file" that doesn't really belong anywhere except in a separate request.
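For instance (file names are illustrative):

<!-- downloads in parallel, executes as soon as it arrives (order not guaranteed) -->
<script async src="analytics.js"></script>

<!-- downloads in parallel, executes in document order after parsing finishes -->
<script defer src="jquery.js"></script>
<script defer src="app.js"></script>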
You will also never be able to build a perfect website, and neither will http://github.com or http://stackoverflow.com, but it doesn't matter; they are fast enough for our eyes not to see any crazy flashing content, and those things are what truly matter to end-users.
If you are curious about how many requests is normal - don't be. It's different for every website and the purpose of the website, though I agree some things do go over the top sometimes. But it is what it is, and all we have to do is support browsers like they are supporting us - even looking at IE / Edge, since they are also improving (slowly but steadily).
I hope my story made sense to you. I did re-read the post before submitting but couldn't find anything while scouting for irregular typing or other kinds of illogical things.
Good luck!
The HTTP protocol is verbose, so the ratio of header size to payload size makes it more efficient to have a larger payload. On top of that, this is still distributed communication, which makes it inherently slow. You also, usually, have to set up and tear down the TCP connection for each request.
Also, I have found that small requests tend to repeat data between themselves in an attempt to achieve RESTful purity (like including user data in every response).
The only time small requests are useful is when the data may not be needed at all, so you only load it when needed. However, even then it may be more performant to simply retrieve it all in one go.
You always want fewer requests.
The reason we separate JavaScript/CSS code into other files is that we want the browser to cache them so other pages on our website load faster.
If we have a single-page website with no common libraries (like jQuery), it's best to include all the code in your HTML.

Node.js: blocking threads until an event is emitted [closed]

I am trying to make a simple server that has some heavy-load processing. I am using the cluster module to offload this heavy work to threads and let Node work its magic. The problem is: as I don't know how to build a thread pool, I'm afraid I might run into trouble with the pid limit.
The question: provided that I can't change the OS process-identifier limit, how can I create a thread with Node that does not die (a thread that waits for a message, processes it, and then waits for another message) WITHOUT using busy waiting (I want them to block while waiting for a new request)?
I can't quite tell what you're asking for, but node's cluster module is meant to spawn a constant number of workers (usually one per CPU core) to enable multi-core processing of your requests.
Each worker runs in a single thread, consuming one and only one pid. All workers share the same TCP connections, allowing requests to be distributed between them. Each worker processes all the requests dispatched to it asynchronously (all at the same time) in its single thread.
Node.js is designed to utilize all resources of a single CPU core by processing all incoming requests asynchronously, meaning you don't need more than numCPUs workers to utilize all your resources.
So I can't understand your problem with the pid limit.
If you have problems configuring your cluster, see this answer and this blog post.
Summarizing my answer: a properly configured cluster consists of require('os').cpus().length worker processes handling requests, and one master process watching them and respawning dead ones.
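A minimal sketch of that configuration (the port number is illustrative):

var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
    // master: fork one worker per core and respawn any that die
    for (var i = 0; i < numCPUs; i++) cluster.fork();
    cluster.on('exit', function () { cluster.fork(); });
} else {
    // worker: blocks on the event loop (no busy waiting) between requests
    http.createServer(function (req, res) {
        res.end('handled by pid ' + process.pid);
    }).listen(8000);
}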

What are the use-cases for Web Workers? [closed]

I am looking for real-world scenarios for using the Web Workers API.
John Resig (of jQuery fame) has a bunch of interesting examples of using web workers here - games, graphics, crypto.
Another use is Web I/O - in other words, polling URLs in background. That way you don't block the UI waiting for polling results.
Another practical use: in Bespin, they're using Web Workers to do the syntax highlighting, so that it doesn't block your code editing while you're using the app.
From Mozilla: One way workers are useful is to allow your code to perform processor-intensive calculations without blocking the user interface thread.
As a practical example, think of an app which has a large table of #s (this is real world, BTW - taken from an app I programmed ~2 years ago). You can change one # in the table via an input field, and a bunch of other numbers in different columns get re-computed in a fairly intensive process.
The old workflow was: change the #. Go get coffee while JavaScript crunches through changes to the other numbers and the web page is unresponsive for 3 minutes - and that was after I optimized it to hell and back. Come back with coffee. Change a second #. Repeat many times. Click the SAVE button.
The new workflow with workers could be: change the #. Get a status message that something is being recomputed, but you can change other #s. Change more #s. When done changing, wait till the status changes to "all calculations complete, you can now review the final #s and save".
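A rough sketch of that shape of solution (recalc.js, recompute(), and the tableData/statusEl/inputEl variables are placeholders, not from the original app):

// main page: hand the table off to a worker and stay responsive
var worker = new Worker('recalc.js');
worker.onmessage = function (event) {
    tableData = event.data; // the recomputed numbers
    statusEl.textContent = 'All calculations complete';
};
inputEl.addEventListener('change', function () {
    statusEl.textContent = 'Recomputing...';
    worker.postMessage({ changed: inputEl.id, value: inputEl.value, table: tableData });
});

// recalc.js
onmessage = function (event) {
    postMessage(recompute(event.data)); // recompute() = your intensive crunching
};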
I have used them for sending larger amounts of data from the browser to the server. Obviously you can do this with regular AJAX calls, but that takes up one of the precious connections per hostname. Also, if the user does a page transition during this process (e.g. clicks a link), your JavaScript objects from the previous page go away and you can't process callbacks. When a web worker is used, this activity happens out of band, so you have a better guarantee that it will complete.
Another use case:
Compressing/de-compressing files in the background, if you have a lot of images and other media files that are exchanged with the server in compressed format.
