Why is node.js not suitable for heavy CPU apps? - javascript

Node.js servers are very efficient at I/O and at handling large numbers of client connections. But why is node.js not suitable for heavy CPU apps in comparison to a traditional multithreaded server?
I read it here Felix Baumgarten

Node is, despite its asynchronous event model, single-threaded by nature. When you launch a Node process, you are running a single process with a single thread on a single core. So your code will not be executed in parallel; only I/O operations run in parallel, because they are executed asynchronously. As such, long-running CPU tasks will block the whole server and are usually a bad idea.
Given that you start a Node process like that, it is still possible to have multiple Node processes running in parallel. That way you can still benefit from your multi-core architecture, even though a single Node process does not. You just need a load balancer in front that distributes requests among all your Node processes.
Another option would be to have the CPU work in separate processes and make Node interact with those instead of doing the work itself.
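As a rough sketch of that last option (doing the CPU work in a separate process), the snippet below starts a second Node process with child_process.execFile and reads the result from its standard output. The summing loop, the helper name sumInChildProcess and the inline script are made up for illustration; a real application would more likely run a dedicated script or service.

```javascript
const { execFile } = require('node:child_process');

// Illustrative CPU-heavy script that runs in the separate Node process.
// It reads its input from the last command-line argument.
const childScript = `
  const n = Number(process.argv[process.argv.length - 1]);
  let sum = 0;
  for (let i = 1; i <= n; i++) sum += i;
  console.log(sum);
`;

// Hypothetical helper: resolves with the sum 1 + 2 + ... + n,
// computed outside the main process.
function sumInChildProcess(n) {
  return new Promise((resolve, reject) => {
    execFile(process.execPath, ['-e', childScript, String(n)], (err, stdout) => {
      if (err) return reject(err);
      resolve(Number(stdout.trim()));
    });
  });
}

sumInChildProcess(1e7)
  .then((sum) => console.log('sum from child process:', sum))
  .catch(console.error);

// The event loop of this process stays free while the child computes.
console.log('main process is still responsive');
```

Starting a process has a noticeable cost, so this only pays off when the work is heavy enough to outweigh it.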
Related things to read:
Node.js and CPU intensive requests
Understanding the node.js event loop

A simple Node.js server is single-threaded, meaning that any operation that takes a long time to execute will block the rest of your program from running. Node.js apps manage to maintain a high level of concurrency by working as a series of events. When an event handler is waiting for something to happen (such as reading from the database), it tells Node to go ahead and process another event in the meantime. But since a single thread can only execute one instruction at a time, this approach can't save you from a function that needs to keep actively executing for a long time. In a multithreaded architecture, even if one function takes a long time to compute the result, other threads can still process other requests — and as long as you have a core that is not fully used at the time, there's a good chance they can do it about as quickly as if no other requests were running at all.
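To see the blocking effect for yourself, here is a minimal sketch. The busyWork helper is made up; it just spins for a given number of milliseconds to stand in for any long synchronous computation. The timer is scheduled with a delay of 0, but it cannot fire until the loop gives the thread back.

```javascript
// Hypothetical helper: keeps the CPU busy for roughly `ms` milliseconds.
function busyWork(ms) {
  const start = Date.now();
  while (Date.now() - start < ms) {
    // spin: nothing else can run on this thread in the meantime
  }
  return Date.now() - start;
}

const scheduledAt = Date.now();

// Asks to run "as soon as possible"...
setTimeout(() => {
  console.log(`timer fired after ${Date.now() - scheduledAt} ms`);
}, 0);

// ...but the callback has to wait for this synchronous work to finish.
const blockedFor = busyWork(200);
console.log(`main thread was blocked for ${blockedFor} ms`);
```

In a real server the delayed timer would be every other request waiting for its turn.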
In order to deal with this, production Node.js apps that expect to hog a lot of CPU will usually be run in clusters. This means that instead of having several threads in one program's memory space, you run several instances of the same program under the control of one "master" instance. Each process is single-threaded, but since you have several of them, you end up gaining the benefits of multiple threads.
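A minimal sketch of such a cluster, using Node's built-in cluster module, could look like this. The port number and the function name startClusteredServer are assumptions for the example, and cluster.isPrimary needs Node 16 or newer (older versions call it isMaster).

```javascript
const cluster = require('node:cluster');
const http = require('node:http');
const os = require('node:os');

// Hypothetical port for this sketch.
const PORT = 3000;

function startClusteredServer(workerCount = os.cpus().length) {
  if (cluster.isPrimary) {
    // The primary ("master") process only starts and supervises the workers.
    for (let i = 0; i < workerCount; i++) {
      cluster.fork();
    }
    cluster.on('exit', (worker) => {
      console.log(`worker ${worker.process.pid} exited`);
    });
  } else {
    // Each worker is a separate single-threaded process with its own event loop.
    // All workers listen on the same port.
    http
      .createServer((req, res) => {
        res.end(`handled by process ${process.pid}\n`);
      })
      .listen(PORT);
  }
}
```

The function is only defined here. Call startClusteredServer() at the bottom of your entry script to launch it: cluster.fork() runs the same script again in each worker, so that call is what makes the workers reach the else branch.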

Node works very well for asynchronous tasks, because it hands them off to a worker pool. CPU-intensive tasks (where you use the CPU heavily) are a different story. For example, say you have a billion users and you want to sort them by name. That is quite an intensive task, and it is synchronous, so it will block other code from running.
So it's not a good idea to use Node for these kinds of applications. Technically, you can find alternatives to handle those kinds of tasks. The example above is better handled in the database, which then passes the result back to Node.
In the same way, avoid intensive tasks and keep your CPU cool for better performance.

You can have a look at this package, the-computer, which may help you do some CPU-intensive work in a single instance of a node.js app in a simple way.
It is definitely not as effective as raw C++ libs, but it can cover most general computing cases, keeping you in the node.js garden while allowing you to leverage the cores of the CPU.

Node.js runs JavaScript code in a single thread, which means that your code can only do one task at a time. However, Node.js itself is multithreaded and provides hidden threads through the libuv library, which handles I/O operations like reading files from a disk or network requests. Through the use of hidden threads, Node.js provides asynchronous methods that allow your code to make I/O requests without blocking the main thread.
Although Node.js has hidden threads, you cannot use them to offload CPU-intensive tasks, such as complex calculations, image resizing, or video compression. Since JavaScript is single-threaded, a CPU-intensive task blocks the main thread while it runs, and no other code executes until the task completes. Without using other threads, the only way to speed up a CPU-bound task is to increase the processor speed.
💡 Node.js introduced the worker_threads module, which allows you to create threads and execute multiple JavaScript tasks in parallel. Once a thread finishes a task, it sends a message to the main thread that contains the result of the operation so that it can be used with other parts of the code. The advantage of using worker threads is that CPU-bound tasks don’t block the main thread and you can divide and distribute a task to multiple workers to optimize it.
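A minimal sketch of that pattern with the built-in worker_threads module is shown below. To keep it in one file, the worker's code is passed as a string with the eval option; in a real project it would normally live in its own file. The summing loop and the name sumInWorker are just placeholders.

```javascript
const { Worker } = require('node:worker_threads');

// Code that runs inside the worker thread (illustrative CPU-bound loop).
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  let sum = 0;
  for (let i = 1; i <= workerData.n; i++) sum += i;
  // Send the result back to the main thread as a message.
  parentPort.postMessage(sum);
`;

// Hypothetical helper: resolves with 1 + 2 + ... + n, computed off the main thread.
function sumInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { n } });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

sumInWorker(1e7)
  .then((sum) => console.log('sum from worker:', sum))
  .catch(console.error);

// This line runs immediately; the main thread is not blocked by the loop.
console.log('main thread keeps running while the worker computes');
```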
ref: https://www.digitalocean.com/community/tutorials/how-to-use-multithreading-in-node-js

Related

How javascript handles multiple requests being Singlethreaded? [duplicate]

I don't understand several things about nodejs. Every information source says that node.js is more scalable than standard threaded web servers due to the lack of thread locking and context switching, but I wonder, if node.js doesn't use threads, how does it handle concurrent requests in parallel? What does the event I/O model mean?
Your help is much appreciated.
Thanks
Node is completely event-driven. Basically the server consists of one thread processing one event after another.
A new request coming in is one kind of event. The server starts processing it and when there is a blocking IO operation, it does not wait until it completes and instead registers a callback function. The server then immediately starts to process another event (maybe another request). When the IO operation is finished, that is another kind of event, and the server will process it (i.e. continue working on the request) by executing the callback as soon as it has time.
So the server never needs to create additional threads or switch between threads, which means it has very little overhead. If you want to make full use of multiple hardware cores, you just start multiple instances of node.js.
Update
At the lowest level (C++ code, not Javascript), there actually are multiple threads in node.js: there is a pool of IO workers whose job it is to receive the IO interrupts and put the corresponding events into the queue to be processed by the main thread. This prevents the main thread from being interrupted.
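Here is a small sketch of the flow described in this answer. handleRequest is a made-up stand-in for a request handler, and fs.stat plays the role of the blocking IO operation. Each handler registers a callback and returns right away, so the second "request" starts before the IO of the first one has finished.

```javascript
const fs = require('node:fs');

const order = [];

// Record and print each step so the ordering is visible.
function log(step) {
  order.push(step);
  console.log(step);
}

// Hypothetical request handler.
function handleRequest(id) {
  log(`request ${id}: started`);
  // Non-blocking IO: register a callback and return immediately.
  fs.stat('.', (err) => {
    if (err) throw err;
    log(`request ${id}: IO finished, callback runs`);
  });
  log(`request ${id}: handler returned`);
}

handleRequest(1);
handleRequest(2);
```

Both handlers return before either callback runs. The callbacks are executed later by the same thread, once the IO results are in and the thread is free.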
Although this question was answered a long time ago, I'm adding my thoughts on it.
Node.js is a single-threaded JavaScript runtime environment. Basically, its creator Ryan Dahl's concern was that parallel processing using multiple threads is not the right way, or is too complicated.
if Node.js doesn't use threads how does it handle concurrent requests in parallel
Ans: It is wrong to say that it doesn't use threads. Node.js uses threads, but in a smart way: a single thread serves all the HTTP requests, and multiple threads in a thread pool (in libuv) handle any blocking operations.
Libuv: A library to handle asynchronous I/O.
What does event I/O model means?
Ans: The right term is non-blocking I/O. As the official Node.js site says, it almost never blocks. When a request reaches the Node server, the server does not leave it waiting in a queue; it takes the request and starts executing it. If the request involves a blocking operation, that operation is sent to the worker threads and a callback is registered for it. As soon as the operation finishes, the callback is placed in the event queue and processed by the event loop, which then creates the response and sends it to the respective client.
Node JS is a JavaScript runtime environment. Both the Chrome browser and Node JS run on the V8 JavaScript engine. Node JS uses an event-driven, non-blocking I/O model that makes it lightweight and efficient. Node JS applications use a single-threaded event loop architecture to handle concurrent clients. Actually, its main event loop is single-threaded, but most of the I/O work happens on separate threads, because the I/O APIs in Node JS are asynchronous/non-blocking by design, in order to accommodate the main event loop. Consider a scenario where we ask a backend database for the details of user1 and user2 and then print them on the screen/console. The response to each request takes time, but both requests can be carried out independently and at the same time. When 100 people connect at once, rather than using a different thread for each of them, Node loops over those connections and fires off any events your code should know about. If a connection is new, it will tell you. If a connection has sent you data, it will tell you. If a connection isn't doing anything, it will skip over it rather than spending precious CPU time on it. Everything in Node is based on responding to these events. As a result, the CPU stays focused on that one process and doesn't have a bunch of threads competing for attention. A Node.JS application does not buffer the whole response; it simply outputs the data in chunks.
Though it has already been answered, I would like to share my understanding in simple terms.
Node.js uses a library called libuv, which is written in C and uses threads. These threads are called workers, and they take care of the blocking operations for the multiple requests coming from clients.
Parallel processing in Node.js is achieved with the help of 2 concepts:
Asynchronous
Non-blocking IO

Matching users flow Nodejs

So I'm trying to create a system in which users can match with each other based on specific information.
The flow I have in mind is as follows:
user 1 fills in the information and clicks "find"
at the same time, user 2 does the same as user 1
the client sends a request to the server on route /X so the server can push the client onto a (threadsafe) queue
a worker thread pulls from the queue each time and does the matching
meanwhile the user polls route /Y on the server to get their match
the worker thread finds that 2 users match and pushes the match to some (threadsafe) data structure
the next time the user polls the server (on /Y), the user gets the match and is redirected to the conversation
So first of all, is this a good approach?
Also, does using a worker thread and a threadsafe data structure make sense in Javascript (specifically Node.js and Express)? Is there an alternative or a better way to do this kind of thing?
Thanks.
This is a bad approach.
You do not need (and should not use) worker threads for your use case.
On Worker Threads
Worker Threads are isolated instances of Javascript which run as a separate thread. They are intended strictly for performing CPU-intensive work.
vs vanilla Node
But you don't need them, because Node libraries are asynchronous, which means that unless your code really is CPU-intensive, you won't see any benefit from using Worker Threads (in fact there is overhead to using them, so if they aren't needed, your code will run slower).
From the docs: "Workers (threads) are useful for performing CPU-intensive JavaScript operations. They will not help much with I/O-intensive work. Node.js’s built-in asynchronous I/O operations are more efficient than Workers can be."
More on Threadedness
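Your Javascript code runs on a single thread, and works very well that way. You do not need "threadsafe" data structures in ordinary Javascript, because only one piece of your code runs at a time (memory is only shared between threads if you explicitly opt in, e.g. with SharedArrayBuffer).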
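For the matching flow in the question, this means a plain array and a Map are enough. Below is a minimal sketch under some assumptions: users are matched when they share the same topic (a placeholder for whatever "specific information" you match on), the state lives in memory, and requestMatch and getMatch are hypothetical functions you would call from the Express handlers for /X and /Y.

```javascript
// Hypothetical in-memory state; a real app would likely persist this.
const waiting = [];        // users still waiting for a partner
const matches = new Map(); // user id -> id of the matched user

// Would be called from the handler for route /X.
function requestMatch(user) {
  // Placeholder matching rule: same topic.
  const index = waiting.findIndex((other) => other.topic === user.topic);
  if (index === -1) {
    waiting.push(user); // nobody suitable yet, wait in the queue
    return null;
  }
  const [partner] = waiting.splice(index, 1);
  matches.set(user.id, partner.id);
  matches.set(partner.id, user.id);
  return partner.id;
}

// Would be called from the handler for route /Y (polling).
function getMatch(userId) {
  return matches.get(userId) ?? null;
}

requestMatch({ id: 'alice', topic: 'chess' }); // no partner yet
requestMatch({ id: 'bob', topic: 'chess' });   // pairs bob with alice
console.log(getMatch('alice'));                // prints "bob"
```

No locks are needed, because each handler runs to completion before the next one starts.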
If you do have CPU-intensive code
If you're doing expensive regex matching, then you are right to want to run this code in parallel. Worker Threads might not be the best way to do this, though.
Splitting CPU-intensive code into separate programs is often the most flexible solution. It gives you several options:
spawn a new instance of Node to run your CPU-intensive code (on the same server)
run your CPU-intensive code in the cloud on "serverless" services, such as AWS Lambda
turn your CPU-intensive code into a "microservice", essentially a tiny webserver which does any specialized processing and returns the result
Further Reading
How Node's asynchronicity works (big picture)
https://blog.insiderattack.net/event-loop-and-the-big-picture-nodejs-event-loop-part-1-1cb67a182810
What kinds of operations block the event loop and how to avoid it
https://nodejs.org/uk/docs/guides/dont-block-the-event-loop/

Are HTML5 Web Workers threads or processes?

From the Mozilla documentation:
Web Workers is a simple means for web content to run scripts in
background threads.
Considering Javascript is single-threaded, are web workers separate threads or processes? Is there shared memory that classifies them as threads?
They run in background threads, but the API completely abstracts from the implementation, so you may come across a browser that just schedules them to run on the same thread as other events like Node does. Processes are too heavyweight to run background tasks.
Considering Javascript is single-threaded
JavaScript is not single-threaded.
The main part of a JavaScript program runs on an event loop.
Long-running processes (XMLHttpRequest being the classic example) are almost always farmed out to stuff that runs outside the event loop (often on different threads).
Web Workers are just a means to write JavaScript that runs outside the main event loop.
are web workers separate threads or processes? Is there shared memory that classifies them as threads?
That's an implementation detail of the particular JS engine.
As per the MDN:-
The Worker interface spawns real OS-level threads, and mindful programmers may be concerned that concurrency can cause “interesting” effects in your code if you aren't careful.
Reference:- https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers#about_thread_safety
The documentation does not define whether the web worker runs in a separate thread or process (or another similar construct). So, depending on the hardware architecture of the processor on which the program is executed, the Operating System and the implementation of the JavaScript engine used, it may be different.
However, I guess that the essence of this question is: Can the Operating System use multiple CPU cores by using web workers? If so, the answer is: YES!!! Even regardless of the implementation of the JavaScript engine!
As long as the processor has many cores, and the Operating System can make use of them, even if the Web Worker's script is executed within another thread of the same process, these threads will be able to run on different cores because the "process" is a construct of an Operating System and itself can run on several processor cores, just as several processes can run on a single core.
P.S. If you want the code to be executed 100% in another process, delegate it to another service (e.g. running on a different server).

What are workers in NodeJS

I've primarily programmed in other programming languages, but I have been making a webapp using NodeJS and have come across a few things that I can't quite get my head around.
I referred to https://nodejs.org/api/cluster.html#cluster_how_it_works
I found that this explained, well, how NodeJS can cope with large numbers of requests despite Node only being single threaded. However, what confuses me is when it says a port is shared among 'many workers'.
Now if Node is not multithreaded, then what exactly are these workers? In Java, for example, you can have multithreaded applications using CompletableFuture. These cause different threads to take responsibility.
But what is a worker in node if not a thread?
Node can easily handle 10,000 concurrent connections in a single thread (see this answer for details). For some things that are blocking it uses a thread pool but this is transparent to you. Your JavaScript uses a single-threaded event loop in every process.
Keep in mind that nginx, a web server that is known for speed is also single-threaded. Redis, a database that is known for speed is also single-threaded. Multi-threading is good for CPU-bound tasks (when you use one thread per CPU) but for I/O-bound tasks that Node is usually used for, single-threaded event loops work better.
Now, to answer your question - in the context of clusters that the website that you linked to is talking about, a worker is a single process. Every one of those processes still has one single-threaded event loop but there can be many of those processes executing at the same time.
See those answers for more details:
Which would be better for concurrent tasks on node.js? Fibers? Web-workers? or Threads?
what is mean by event loop in node.js ? javascript event loop or libuv event loop?
How many clients can an http-server can handle?

Node.js - single thread, non-blocking?

I am learning Node.js and I have read that Node.js is single threaded and non-blocking.
I have a good background in JavaScript and I do understand the callbacks, but what I don't really understand is how Node.js can be single threaded and run code in the background. Isn't that contradictory?
Because if Node.js is single threaded, it can still only perform one task at a time. So if it runs something in the background, it has to stop the current task to process something in the background, right?
How does that work practically?
What "in the background" really means in terms of NodeJS is that things get put on a todo list for later. Whenever Node is done with what it's doing it picks from the top of the todo list. This is why doing anything that actually IS blocking can wreck your day. Everything that's happening "in the background" (actually just waiting on the todo list) gets stopped until the blocking task is complete.
Lucas explained it well, but I would like to add that it is possible to add "nodes" via some cluster libraries if you want to take advantage of all your processors.
https://www.npmjs.com/package/cluster
https://www.npmjs.com/package/pm2
A tutorial on setting up a cluster: http://blog.carbonfive.com/2014/02/28/taking-advantage-of-multi-processor-environments-in-node-js/
Some hosting providers, like Heroku, give you 'scalability' options.
Also, when you use MongoDB with NodeJS (via Mongoose for example), it creates multiple connections.
NOTE: The advantage of being monothreaded is that you can handle millions of users. With a legacy multithreaded server (Apache), you create a thread for EACH user, so you need really BIG servers to handle thousands of people.
While the JavaScript engine is monothreaded, there are multiple threads "in the background" that deal with all the non-blocking I/O work.
Specifically, libuv has a pool of worker threads waiting on OS events, I/O signals, running C++ code, etc. Size of this pool is determined by the UV_THREADPOOL_SIZE environment variable.
No JavaScript code ever runs "in the background". JavaScript functions (i.e. callbacks) are scheduled to run later on the main event loop, either by other JS functions or directly by the libuv workers. If the loop is blocked, then everything scheduled has to wait for it.
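As a small illustration of the worker pool, crypto.pbkdf2 is one of the built-in functions that runs on libuv's threads (the pool has 4 threads by default). The sketch below submits four hashes at once; the passwords, the salt and the iteration count are arbitrary values picked for the demo, and deriveKey is a made-up helper name.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: wraps the callback API in a promise.
function deriveKey(password) {
  return new Promise((resolve, reject) => {
    // The hashing itself runs on a libuv worker thread, not on the main thread.
    crypto.pbkdf2(password, 'demo-salt', 100000, 32, 'sha256', (err, key) => {
      if (err) return reject(err);
      resolve(key.toString('hex'));
    });
  });
}

const start = Date.now();
const jobs = ['a', 'b', 'c', 'd'].map((password) =>
  deriveKey(password).then(() => {
    // This callback runs on the main event loop once a worker has finished.
    console.log(`hash for "${password}" done after ${Date.now() - start} ms`);
  })
);
Promise.all(jobs).catch(console.error);

// Printed first: submitting the work does not block the main thread.
console.log('all four hashes submitted');
```

On a multi-core machine the four hashes usually finish at about the same time. If you submit more jobs than the pool has threads, the extra ones wait for a free worker, which is where UV_THREADPOOL_SIZE comes in.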
In fact, Node.js is not exactly monothreaded. Node.js uses one "main thread", which is the thread where your script is executed. This main thread must never be blocked, so long-running operations are executed in separate threads. For example, Node.js uses the libuv library, which maintains a pool of threads used to perform I/O.
