Do Timers run on their Own threads in Node.js? - javascript

I am a bit confused. I know JavaScript is a single-threaded language, but while reading about the event loop I learned that for setTimeout or setInterval, JavaScript calls a web API provided by the browser, which spawns a new thread to run the timer. What happens in the Node.js environment? How do timers execute/work there?

No threads are used for timers in node.js.
Timers in node.js work in conjunction with the event loop and don't use a thread. Timers in node.js are stored in a sorted linked list with the next timer to fire at the start of the linked list. Each time through the event loop, it checks to see if the first timer in the linked list has reached its time. If so, it fires that timer. If not, it runs any other events that are waiting in the event loop.
On each subsequent cycle through the event loop, it keeps checking whether it's time for the next timer. When a new timer is added, it is inserted into the linked list in its proper sorted order. When it fires or is cancelled, it is removed from the linked list.
If the event loop has nothing to do, it may sleep for a brief period of time, but it won't sleep past the due time of the next timer.
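The bookkeeping described above can be sketched as a toy model. This is only an illustration of the idea, not Node's actual implementation; the class name ToyTimerList and its methods are made up for this example, and the clock is passed in as a plain number so the behaviour is easy to follow.

```javascript
// Toy model of a sorted timer list (hypothetical names, not Node internals).
class ToyTimerList {
  constructor() {
    this.head = null; // node with the earliest due time
  }

  // Insert a timer at its sorted position (earliest due time first).
  add(dueAt, callback) {
    const node = { dueAt, callback, next: null };
    if (this.head === null || dueAt < this.head.dueAt) {
      node.next = this.head;
      this.head = node;
      return node;
    }
    let cur = this.head;
    while (cur.next !== null && cur.next.dueAt <= dueAt) cur = cur.next;
    node.next = cur.next;
    cur.next = node;
    return node;
  }

  // Unlink a cancelled timer from the list.
  cancel(node) {
    if (this.head === node) {
      this.head = node.next;
      return;
    }
    let cur = this.head;
    while (cur !== null && cur.next !== node) cur = cur.next;
    if (cur !== null) cur.next = node.next;
  }

  // One pass of the loop: fire every timer at the front whose time has come.
  runDue(now) {
    while (this.head !== null && this.head.dueAt <= now) {
      const node = this.head;
      this.head = node.next;
      node.callback();
    }
  }

  // Longest the loop may sleep: time until the first timer, or Infinity if none.
  maxSleep(now) {
    return this.head === null ? Infinity : Math.max(0, this.head.dueAt - now);
  }
}
```

Only the head of the list ever needs to be inspected on each pass, which is why checking timers is cheap even when many are pending.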
Other references on the topic:
How does nodejs manage timers internally
Libuv timer code in nodejs
How many concurrent setTimeouts before performance issues?
Multiple Client Requests in NodeJs
Looking for a solution between setting lots of timers or using a scheduled task queue

Node runs on a single thread but asynchronous work happens elsewhere. For example, libuv provides a pool of 4 threads that it may use, but won't if there's a better option.
The node documentation says:
Node.js runs JavaScript code in the Event Loop (initialization and callbacks), and offers a Worker Pool to handle expensive tasks like file I/O. Node.js scales well, sometimes better than more heavyweight approaches like Apache. The secret to the scalability of Node.js is that it uses a small number of threads to handle many clients. If Node.js can make do with fewer threads, then it can spend more of your system's time and memory working on clients rather than on paying space and time overheads for threads (memory, context-switching). But because Node.js has only a few threads, you must structure your application to use them wisely.
A more detailed look at the event loop

No. Timers are just scheduled on the same thread and will call their callbacks when the time expires.
Depending on what OS you are on and what JavaScript interpreter you use, they will use various APIs, from poll to epoll to kqueue to overlapped I/O on Windows, but in general asynchronous APIs have similar features. So let's ignore platform differences and look at a cross-platform API that exists on all OSes: the POSIX select() system call.
The select function in C looks something like this:
int select(int nfds, fd_set *readfds, fd_set *writefds,
fd_set *exceptfds, struct timeval *timeout);
Where nfds is the highest-numbered file descriptor in any of the three sets, plus 1, readfds is the set of file descriptors (including network sockets) you are waiting to read from, writefds is the set of file descriptors you are waiting to write to, exceptfds is the set of file descriptors you are watching for exceptional conditions (such as out-of-band data on a socket) and timeout is the maximum time the function may block.
This system call blocks. Yes, in non-blocking, asynchronous code there is one blocking system call. The main difference between non-blocking code and blocking threaded code is that the entire program blocks in only one place, the select() function (or whatever equivalent you use).
This function only returns if any of the file descriptors have activity on them or if the timeout expires.
By managing the timeout and calculating its next value before each call, you can implement a function like setTimeout.
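As a rough illustration in JavaScript, the sketch below builds a tiny setTimeout-like function on top of a single blocking call. It assumes Node.js, and it uses Atomics.wait with a timeout purely as a stand-in for select() (Node's real loop blocks inside libuv instead). The names createToyLoop, toySetTimeout and blockFor are hypothetical.

```javascript
// Stand-in for select(): block this thread for up to `ms` milliseconds.
// Nothing ever notifies the cell, so the call always ends by timing out.
const sleepCell = new Int32Array(new SharedArrayBuffer(4));
function blockFor(ms) {
  Atomics.wait(sleepCell, 0, 0, ms);
}

function createToyLoop() {
  const timers = []; // entries of { due, callback }, kept sorted by due time

  function toySetTimeout(callback, delayMs) {
    timers.push({ due: Date.now() + delayMs, callback });
    timers.sort((a, b) => a.due - b.due);
  }

  function run() {
    while (timers.length > 0) {
      // Compute the next timeout from the earliest timer...
      const wait = timers[0].due - Date.now();
      // ...and block in exactly one place until it expires.
      if (wait > 0) blockFor(wait);
      // Fire every timer whose time has come.
      while (timers.length > 0 && timers[0].due <= Date.now()) {
        timers.shift().callback();
      }
    }
  }

  return { toySetTimeout, run };
}

const toyLoop = createToyLoop();
toyLoop.toySetTimeout(() => console.log('second'), 30);
toyLoop.toySetTimeout(() => console.log('first'), 10);
toyLoop.run();
```

A real loop would pass the same computed timeout to select() together with the file descriptor sets, so it wakes up for either I/O or the next timer, whichever comes first.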
I've written much deeper explanations of how this works in answers to the following related questions:
I know that callback function runs asynchronously, but why?
Event Queuing in NodeJS
how node.js server is better than thread based server
Node js architecture and performance
Performance of NodeJS with large amount of callbacks
Does javascript process using an elastic racetrack algorithm
Is there any other way to implement a "listening" function without an infinite while loop?
I recommend you at least browse each of the answers I wrote above because they are almost all non-duplicates. They sometimes overlap but explain different aspects of asynchronous code execution.
The gist of it is that javascript does not execute code in parallel to implement timers. It doesn't need to. Instead it waits in parallel. Once you understand the difference between running code in parallel and waiting (doing nothing) in parallel you will understand how things like node.js achieve high performance and how events work better.

Related

Javascript/Nodejs how readFile is implemented [duplicate]

So I have an understanding of how Node.js works: it has a single listener thread that receives an event and then delegates it to a worker pool. The worker thread notifies the listener once it completes the work, and the listener then returns the response to the caller.
My question is this: if I stand up an HTTP server in Node.js and call sleep on one of my routed path events (such as "/test/sleep"), the whole system comes to a halt. Even the single listener thread. But my understanding was that this code is happening on the worker pool.
Now, by contrast, when I use Mongoose to talk to MongoDB, DB reads are an expensive I/O operation. Node seems to be able to delegate the work to a thread and receive the callback when it completes; the time taken to load from the DB does not seem to block the system.
How does Node.js decide to use a thread pool thread vs the listener thread? Why can't I write event code that sleeps and only blocks a thread pool thread?
Your understanding of how node works isn't correct... but it's a common misconception, because the reality of the situation is actually fairly complex, and typically boiled down to pithy little phrases like "node is single threaded" that over-simplify things.
For the moment, we'll ignore explicit multi-processing/multi-threading through cluster and webworker-threads, and just talk about typical non-threaded node.
Node runs in a single event loop. It's single threaded, and you only ever get that one thread. All of the javascript you write executes in this loop, and if a blocking operation happens in that code, then it will block the entire loop and nothing else will happen until it finishes. This is the typically single threaded nature of node that you hear so much about. But, it's not the whole picture.
Certain functions and modules, usually written in C/C++, support asynchronous I/O. When you call these functions and methods, they internally manage passing the call on to a worker thread. For instance, when you use the fs module to request a file, the fs module passes that call on to a worker thread, and that worker waits for its response, which it then presents back to the event loop that has been churning on without it in the meantime. All of this is abstracted away from you, the node developer, and some of it is abstracted away from the module developers through the use of libuv.
As pointed out by Denis Dollfus in the comments (from this answer to a similar question), the strategy used by libuv to achieve asynchronous I/O is not always a thread pool. Specifically, in the case of the http module, a different strategy appears to be used at this time. For our purposes here it's mainly important to note how the asynchronous context is achieved (by using libuv) and that the thread pool maintained by libuv is one of multiple strategies offered by that library to achieve asynchronicity.
On a mostly related tangent, there is a much deeper analysis of how node achieves asynchronicity, and some related potential problems and how to deal with them, in this excellent article. Most of it expands on what I've written above, but additionally it points out:
Any external module that you include in your project that makes use of native C++ and libuv is likely to use the thread pool (think: database access)
libuv has a default thread pool size of 4, and uses a queue to manage access to the thread pool. The upshot is that if you have 5 long-running DB queries all going at the same time, one of them (and any other asynchronous action that relies on the thread pool) will be waiting for the others to finish before it even gets started
You can mitigate this by increasing the size of the thread pool through the UV_THREADPOOL_SIZE environment variable, so long as you do it before the thread pool is required and created: process.env.UV_THREADPOOL_SIZE = 10;
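The queueing effect can be observed with any API that runs on the thread pool. The sketch below uses crypto.pbkdf2 from Node's crypto module as the long-running task instead of a DB query (a substitution made for the sake of a self-contained example). The helper name startHashes and the password, salt and iteration count are arbitrary choices.

```javascript
const crypto = require('crypto');

// Start `count` password hashes. Each call returns immediately and the
// hashing itself is queued on libuv's thread pool.
function startHashes(count, onEach) {
  const startedAt = Date.now();
  for (let i = 1; i <= count; i++) {
    crypto.pbkdf2('secret', 'salt', 100000, 64, 'sha512', (err) => {
      if (err) throw err;
      onEach(i, Date.now() - startedAt);
    });
  }
}

// With the default pool size, the fifth hash usually has to wait for a free
// thread, so it tends to finish later than the first four. Exact timings
// depend on the machine. Running with, for example,
//   UV_THREADPOOL_SIZE=8 node script.js
// gives all five their own thread.
startHashes(5, (id, ms) => console.log(`hash ${id} finished after ${ms} ms`));
```

Setting the variable on the command line avoids any question of whether the pool has already been created by the time the script assigns process.env.UV_THREADPOOL_SIZE.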
If you want traditional multi-processing or multi-threading in node, you can get it through the built in cluster module or various other modules such as the aforementioned webworker-threads, or you can fake it by implementing some way of chunking up your work and manually using setTimeout or setImmediate or process.nextTick to pause your work and continue it in a later loop to let other processes complete (but that's not recommended).
Please note, if you're writing long running/blocking code in javascript, you're probably making a mistake. Other languages will perform much more efficiently.
So I have an understanding of how Node.js works: it has a single listener thread that receives an event and then delegates it to a worker pool. The worker thread notifies the listener once it completes the work, and the listener then returns the response to the caller.
This is not really accurate. Node.js has only a single "worker" thread that does JavaScript execution. There are threads within node that handle IO processing, but to think of them as "workers" is a misconception. They really just handle IO and a few other details of node's internal implementation, and as a programmer you cannot influence their behavior other than through a few misc parameters such as UV_THREADPOOL_SIZE.
My question is this: if I stand up an HTTP server in Node.js and call sleep on one of my routed path events (such as "/test/sleep"), the whole system comes to a halt. Even the single listener thread. But my understanding was that this code is happening on the worker pool.
There is no sleep mechanism in JavaScript. We could discuss this more concretely if you posted a code snippet of what you think "sleep" means. There's no such function to call to simulate something like time.sleep(30) in python, for example. There's setTimeout but that is fundamentally NOT sleep. setTimeout and setInterval explicitly release, not block, the event loop so other bits of code can execute on the main execution thread. The only thing you can do is busy loop the CPU with in-memory computation, which will indeed starve the main execution thread and render your program unresponsive.
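To make the distinction concrete, here is a small sketch contrasting the two. The function names busyWait, sleep and handleSleepRoute are invented for this example; handleSleepRoute only stands in for whatever a "/test/sleep" route handler would do.

```javascript
// Blocking: spins the CPU, so nothing else can run on this thread meanwhile.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // burn CPU until the deadline passes
  }
}

// Non-blocking: returns a promise at once; the timer fires via the event loop.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// A handler written this way pauses only itself, not the whole server.
async function handleSleepRoute() {
  await sleep(50); // other requests can be served during these 50 ms
  return 'done';
}
```

Calling busyWait(30000) inside a request handler would freeze every other request for 30 seconds, whereas awaiting sleep(30000) delays only that one response.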
How does Node.js decide to use a thread pool thread vs the listener thread? Why can't I write event code that sleeps and only blocks a thread pool thread?
Network IO is always asynchronous. End of story. Disk IO has both synchronous and asynchronous APIs, so there is no "decision". node.js will behave according to the API core functions you call sync vs normal async. For example: fs.readFile vs fs.readFileSync. For child processes, there are also separate child_process.exec and child_process.execSync APIs.
Rule of thumb is always use the asynchronous APIs. The valid reasons to use the sync APIs are for initialization code in a network service before it is listening for connections or in simple scripts that do not accept network requests for build tools and that kind of thing.
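A minimal sketch of the two fs calls side by side may help. The helper name readBothWays and the scratch file name are assumptions made for this example; the fs, os and path calls are the standard Node ones.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Reads the same file asynchronously and synchronously, reporting the order
// in which things happen through the `report` callback.
function readBothWays(filePath, report) {
  fs.readFile(filePath, 'utf8', (err, text) => {
    if (err) throw err;
    report('async callback: ' + text); // runs later, from the event loop
  });
  report('readFile returned, callback still pending');

  const text = fs.readFileSync(filePath, 'utf8'); // blocks until the read is done
  report('readFileSync returned: ' + text);
  return text;
}

// Hypothetical scratch file, used only for this demonstration.
const demoPath = path.join(os.tmpdir(), 'read-both-ways-demo.txt');
fs.writeFileSync(demoPath, 'hello');
readBothWays(demoPath, (message) => console.log(message));
```

The asynchronous callback can only run after the current synchronous code has finished, so its message always comes last here.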
How, when, and by whom the thread pool is used:
First off, when we run Node on a computer, it starts a process, called the node process, alongside the machine's other processes, and it keeps running until it exits or you kill it. The JavaScript in this process runs on our so-called single thread.
Having a single thread makes it easy to block a Node application, but it is also one of the unique features that Node.js brings to the table. So, again, if you run your Node application, it will run in just a single thread, no matter whether 1 or a million users are accessing your application at the same time.
So let's understand exactly what happens in the single thread of Node.js when you start your application. At first the program is initialized, then all the top-level code is executed, meaning all the code that is not inside any callback function (remember that all code inside callback functions is executed under the event loop).
After that, the code of all required modules is executed, then all the callbacks are registered, and finally the event loop starts for your application.
So, as we discussed before, all the callback functions and the code inside them execute under the event loop. In the event loop, the load is distributed across different phases. Anyway, I'm not going to discuss the event loop here.
For the sake of a better understanding of the thread pool, imagine that in the event loop the code inside one callback function executes only after the code inside the previous callback function has finished. Now, if some tasks are too heavy, they would block our Node.js single thread. That's where the thread pool comes in, which, just like the event loop, is provided to Node.js by the libuv library.
So the thread pool is not a part of Node.js itself. It's provided by libuv so that heavy duties can be offloaded to it: libuv executes that work on its own threads and, after execution, returns the results to the event loop.
The thread pool gives us four additional threads that are completely separate from the main single thread. We can configure it up to 128 threads (libuv 1.30.0 raised that maximum to 1024).
All these threads together form the thread pool, and the event loop can then automatically offload heavy tasks to it.
The fun part is all this happens automatically behind the scenes. It's not us developers who decide what goes to the thread pool and what doesn't.
Many tasks go to the thread pool, such as:
-> All operations dealing with files
-> Everything related to cryptography, like hashing passwords
-> All compression stuff
-> DNS lookups
This misunderstanding is merely the difference between pre-emptive multi-tasking and cooperative multitasking...
The sleep turns off the entire carnival because there is really one line to all the rides, and you closed the gate. Think of it as "a JS interpreter and some other things" and ignore the threads...for you, there is only one thread, ...
...so don't block it.

Multithreading javascript

I want to create a real thread which manages some operations in JavaScript.
After some searching, I found 'Web Workers', 'setTimeout' and 'setInterval'.
The problem is that 'Web Workers' don't have access to global variables and therefore can't modify my global arrays directly (or I do not know how).
'setTimeout' is not really what I need.
'setInterval' solves my problem; however, over time my operations could probably take longer, so I am afraid that two intervals might overlap.
Finally, I need an infinite loop which executes a series of operations one after another. Does it exist, or do I have to content myself with 'setInterval'? Is there an alternative with jQuery or something else? If not, can I expect developers to make one available in the near future?
I'm going to assume you're talking about in a web browser.
JavaScript in web browsers has a single main UI thread, and then zero or more web worker threads. Web workers are indeed isolated from the main UI thread (and each other) and so don't have access to globals (other than their own). This is intentional, it makes both implementing the environment and using it dramatically simpler and less error-prone. (Even if that isolation weren't enforced, it's good practice for multi-threaded programming anyway.) You send messages to, and receive messages from, web workers via postMessage and the message event.
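As a sketch of that message-passing pattern, the example below keeps a global array on the main thread and lets a worker compute values for it. It assumes a browser that allows workers created from Blob URLs; the names squareAll, results and startWorker are invented for the example.

```javascript
// The computation the worker performs on the data it is sent.
function squareAll(numbers) {
  return numbers.map((n) => n * n);
}

// Main-thread state. The worker cannot touch this array directly.
const results = [];

function startWorker() {
  // Build the worker from a string so the example fits in one file.
  const source = `
    ${squareAll.toString()}
    self.onmessage = (event) => self.postMessage(squareAll(event.data));
  `;
  const url = URL.createObjectURL(new Blob([source], { type: 'text/javascript' }));
  const worker = new Worker(url);
  worker.onmessage = (event) => {
    results.push(...event.data); // only the main thread mutates `results`
    console.log('results so far:', results);
  };
  return worker;
}

// Browser-only: skipped where no window/Worker exists (for example in Node.js).
if (typeof window !== 'undefined' && typeof Worker !== 'undefined') {
  startWorker().postMessage([1, 2, 3]);
}
```

The worker never sees the results array; it receives a copy of the input and sends back a copy of the output, and the main thread decides what to do with it.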
JavaScript threads (the main UI thread and any web workers) work via a thread-specific task queue (aka "job queue"): Anything that needs to happen on a JavaScript thread (the initial run of the code when a page loads, handling of an event, timer callbacks [more below]) adds a task to the queue. The JavaScript engine runs a loop: Pick up the next task, run it, pick up the next, run it, etc. When there are no tasks, the thread goes quiet waiting for a task to arrive.
setTimeout doesn't create a separate thread, it just schedules a task (a call to a callback) to be added to the task queue for that same thread after a delay (the timeout). Once the timeout occurs, the task is queued, and when the task reaches the front of the queue the thread will handle it.
setInterval does exactly what setTimeout does, but schedules a recurring callback: Once the timeout occurs, it queues the task, then sets up another timeout to queue the task again later. (The rules around the timing are a bit complex.)
If you just want something to recur, forever, at intervals, and you want that thing to have access to global variables in the main UI thread, then you either:
Use setInterval once, which will set up recurring calls back to your code, or
Use setTimeout, and every time you get your callback, use setTimeout again to schedule the next one.
From your description, it sounds as though you may be calling setInterval more than once (for instance, on each callback), which quickly bogs down the thread as you're constantly telling it to do more and more work.
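The second option also addresses the worry about overlapping runs, because the next run is only scheduled once the current one has finished. A minimal sketch, with the helper name repeatWithGap made up for this example:

```javascript
// Run `task` repeatedly, waiting `gapMs` after each run finishes before
// starting the next, so two runs can never overlap.
function repeatWithGap(task, gapMs) {
  let timerId = null;
  let stopped = false;

  function tick() {
    task(); // one full pass of the work
    if (!stopped) {
      timerId = setTimeout(tick, gapMs); // schedule the next pass only now
    }
  }

  timerId = setTimeout(tick, gapMs);

  // Returns a function that cancels any further runs.
  return function stop() {
    stopped = true;
    clearTimeout(timerId);
  };
}
```

For example, const stop = repeatWithGap(updateArrays, 1000) would call a (hypothetical) updateArrays function about once a second, and calling stop() ends the cycle.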
The last thing is easy: web workers start their work when they get a message (onmessage) and sit idle otherwise (that's highly simplified, of course).
Global variables are not good for real multi-threading, and even worse with the reduced model JavaScript offers. You have to rewrite your workers to work standalone with only the information given.
Subworkers have a messaging system which you might be able to make good use of.
But the main problem with JavaScript is: once asynchronous, always asynchronous. There is no way to "join" threads or a "wait4" or something similar. The only thing that can do both is the XMLHttpRequest, so you can do it over a webserver, but I doubt the lag that causes would do any good. BTW: synchronous XMLHttpRequest is deprecated, says Mozilla, which also has a page listing all of the cases where a synchronous request is necessary or at least very useful.

How does JavaScript's Single Threaded Model handle time consuming tasks?

This question is regarding the single-threaded model of JavaScript. I understand that JavaScript is non-blocking in nature because of its ability to add callbacks to the async event queue. But if the callback function does in fact take a long time to complete, won't JavaScript then be blocking everything else during that time as it is single threaded? How does nodejs handle such a problem? And is this an unavoidable problem for developers on the front end? I'm asking this question because I have read that it's generally good practice to keep function tasks as small as possible. Is it really because long tasks in JavaScript will actually block other tasks?
But if the callback function does in fact take a long time to complete, won't JavaScript then be blocking everything else during that time as it is single threaded?
Yes.
How does nodejs handle such a problem?
Node.js handles nothing. How you handle concurrency is up to you and your application. Now, Node.js does have a few tools available to you. The first thing you have to understand is that Node.js is basically V8 (JavaScript engine) with a lightweight library split between JavaScript and native code bolted on. While your JavaScript code is single-threaded by nature, the native code can and does create threads to handle your work.
For example, when you ask Node.js to load a file from disk, your request is passed off to native code where a thread pool is used, and your data is loaded from disk. Once your request is made, your JavaScript code continues on. This is the meaning of "non-blocking" in the context of Node.js. Once that file on disk is loaded, the native code passes it off to the Node.js JavaScript library, which then executes your callback with the appropriate parameters. Your code continued to run while the background work was going on, but when your callback is dealing with that data, other JavaScript code is indeed blocked from running.
This architecture allows you to get much of the benefit of multithreaded code without having to actually write any multithreaded code, keeping your application straightforward.
I'm asking this question cause I have read that its generally good practice to keep function tasks as small as possible. Is it really because long tasks in javascript will actually block other tasks?
My philosophy is always to use what you need. It's true that if a request comes in to your application and you have a lot of JavaScript processing of data that is blocking, other requests will not be processed during this time. Remember though that if you are doing this sort of work, you are likely CPU bound anyway and doing double the work will cause both requests to take longer.
In practice, the majority of web applications are IO bound. They shuffle data from a database, reformat it, and send it out over the network. The part where they handle data is actually not all that time consuming when compared to the amount of time the application is simply waiting to hear back from the upstream data source. It is in these applications where Node.js really shines.
Finally, remember that you can always spawn child processes to better distribute the load. If your application is that rare application where you do 99% of your work load in CPU-bound JavaScript and you have a box with many CPUs and/or cores, split the load across several processes.
Your question is a very large one, so I am just going to focus on one part.
if the callback function does in fact take a long time to complete, won't JavaScript then be blocking everything else during that time as it is single threaded? (...) Is it really because long tasks in javascript will actually block other tasks?
Non-blocking is a beautiful thing XD
The best practices include:
Breaking every function down into its minimum functional form.
Keep callbacks asynchronous (THIS is an excellent post on the use of callbacks)
Avoid stacking operations (like nested loops)
Use setTimeout() to break up potentially blocking code
And many other things. Node.js is the gold standard of non-blocking, so it's worth a look.
setTimeout() is one of the most important functions in non-blocking code
So let's say you make a clock function that looks like this:
function setTime() {
    var date = new Date();
    var time = date.getTime();
    document.getElementById('id').innerHTML = time;
}
while(true){setTime();}
It's quite problematic, because this code will happily loop until the end of time. No other function will ever be called. You want to break up the operation so other things can run.
function startTime() {
    var date = new Date();
    var time = date.getTime();
    document.getElementById('id').innerHTML = time;
    // Pass the function itself; writing startTime() here would call it
    // immediately and recurse forever.
    setTimeout(startTime, 1000);
}
setTimeout() breaks up the loop and executes it every 1-ish seconds. An infinite loop is a bit of an extreme example. The point is that setTimeout() is great at breaking up large operation chains into smaller ones, making everything more manageable.

JavaScript and single-threadedness

I always hear that JavaScript is single-threaded; that when JavaScript is executed, it's all run in the same global mosh pit, all in a single thread.
While that may be true, that single execution thread may spawn new threads, asynchronously receiving data back to the main thread, correct? For example, when an XMLHttpRequest is sent, doesn't the browser create a new thread that performs the HTTP transaction, then invoke callbacks back in the main thread when the XMLHttpRequest returns?
What about timers--setTimeout and setInterval? How do those work?
Is this single-threadedness the result of the language? What has stopped JavaScript from having multi-threaded execution before the new Web Workers draft?
XMLHttpRequest, notably, does not block the current thread. However, its specifics within the runtime are not outlined in any specification. It may run in a separate thread or within the current thread, making use of non-blocking I/O.
setTimeout and setInterval set timers that, when run down to zero, add an item for execution, either a line of code or a function/callback, to the execution stack, starting the JavaScript engine if code execution has stopped. In other words, they tell the JavaScript engine to do something after it has finished doing whatever it's doing currently. To see this in action, set multiple setTimeout(s) within one method and call it.
Your JavaScript itself is single-threaded. It may, however, interact with other threads in the browser (which is frequently written with something like C and C++). This is how asynchronous XHR's work. The browser may create a new thread (or it may re-use an existing one with an event loop.)
Timers and intervals will try to make your JavaScript run later, but if you have a while(1){ ; } running don't expect a timer or interval to interrupt it.
(edit: left something out.)
The single-threadedness is largely a result of the ECMA specification. There are really no language constructs for dealing with multiple threads. It wouldn't be impossible to write a JavaScript interpreter with multiple threads and the tools to interact with them, but no one really does that. Certainly no one will do it in a web browser; it would mess everything up. (If you're doing something server-side like Node.js, you'll see that they have eschewed multithreading in the JavaScript proper in favor of a snazzy event loop, and optional multi-processing.)
See this post for a description of how the javascript event queue works, including how it's related to ajax calls.
The browser certainly uses at least one native OS thread/process to handle the actual interface to the OS to retrieve system events (mouse, keyboard, timers, network events, etc...). Whether there is more than one native OS-level thread is dependent upon the browser implementation and isn't really relevant to Javascript behavior. All events from the outside world go through the javascript event queue and no event is processed until a previous javascript thread of execution is completed and the next event is then pulled from the queue given to the javascript engine.
The browser may have other threads to do the job, but your JavaScript code will still be executed in one thread. Here is how it would work in practice.
In the case of a timeout, the browser will create a separate thread to wait for the timeout to expire, or use some other mechanism to implement the actual timing logic. When the timeout expires, a message will be placed on the main event queue that tells the runtime to execute your handler, and that will happen as soon as the message is picked up by the main thread.
An AJAX request works similarly. Some browser-internal thread may actually connect to the server and wait for the response and, once the response is available, place the appropriate message on the main event queue so the main thread executes the handler.
In all cases your code will get executed by the main thread. This is no different from most other UI systems, except that the browser hides that logic from you. On other platforms you may need to deal with separate threads and ensure execution of handlers on the UI thread.
Putting it more simply than talking in terms of threads, in general (for the browsers I'm aware of) there will only be one block of JavaScript executing at any given time.
When you make an asynchronous Ajax request or call setTimeout or setInterval the browser may manage them in another thread, but the actual JS code in the callbacks will not execute until some point after the currently executing block of code finishes. It just gets queued up.
A simple test to demonstrate this is if you put a piece of relatively long running code after a setTimeout but in the same block:
setTimeout(function () { alert('Timeout!'); }, 5);
alert("After setTimeout; before loop");
for (var i=0, x=0; i < 2000000; i++) { x += i };
alert("After loop");
If you run the above you'll see the "After setTimeout" alert, then there'll be a pause while the loop runs, then you'll see "After loop", and only after that will you see "Timeout!" - even though clearly much longer than 5ms has passed (especially if you take a while to close the first alert).
An often-quoted reason for the single-thread is that it simplifies the browser's job of rendering the page, because you don't get the situation of lots of different threads of JavaScript all trying to update the DOM at the same time.
Javascript is a language designed to be embedded. It can and has been used in programs that execute javascript concurrently on different operating threads. There isn't much demand for an embedded language to explicitly control the creation of new threads of execution, but it could certainly be done by providing a host object with the required capabilities. The WHATWG actually includes a justification for their decision not to push a standard concurrent execution capability for browsers.

In Node.js, If i am writing a long running function should I be using setTimeout

or something else to queue up the rest of my function? And should I use callbacks, or does node handle that automatically?
I imagine that I would need to start my code and, if there are other things that need to occur, I should be giving up my function's control to give other events control. Is this the case? Or can I be stingy and node will cut off my function when I have used enough time?
Thanks.
If your long-running function does a lot of I/O just make sure that you do this in a non-blocking way. This is how node.js achieves concurrency even though it only has a single thread: As soon as any task needs to wait for something, another task gets the CPU.
If your long-running function needs uninterrupted CPU time (or the I/O cannot be made asynchronous), then you probably need to fork out a separate process, because otherwise everyone else will have to wait until you are done.
Or can I be stingy and node will cut off my function when I have used enough time?
No. This is totally cooperative multi-tasking. Node cannot preempt you.
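Because the multitasking is cooperative, a long loop has to give up control voluntarily. One way to do that, when a separate process is not an option, is to process the work in chunks and yield between them with setImmediate. This is a sketch under that assumption; the name processInChunks and its parameters are made up for the example.

```javascript
// Process `items` in slices of `chunkSize`, yielding to the event loop
// between slices so timers and I/O callbacks get a chance to run.
function processInChunks(items, chunkSize, handleItem, done) {
  let index = 0;

  function runChunk() {
    const end = Math.min(index + chunkSize, items.length);
    for (; index < end; index++) {
      handleItem(items[index]);
    }
    if (index < items.length) {
      setImmediate(runChunk); // give up control, continue on a later loop pass
    } else {
      done();
    }
  }

  runChunk(); // the first chunk runs right away
}

processInChunks(
  [1, 2, 3, 4, 5],
  2,
  (n) => console.log('handled', n),
  () => console.log('all done')
);
```

The chunk size is a trade-off: smaller chunks keep the server more responsive, larger chunks finish the job with less scheduling overhead.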
You should put your long-running function, or the code which takes long to execute, into a separate process, because it can, for example, block other incoming requests while this code/function is executing. From the node.js website:
But what about multiple-processor concurrency? Aren't threads
necessary to scale programs to multi-core computers? You can start new
processes via child_process.fork() these other processes will be
scheduled in parallel.
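A minimal sketch of pushing CPU-bound work into another process follows. To keep it in one file it uses child_process.execFile with node -e rather than child_process.fork() (which expects a separate module file); the helper name sumInChild and the toy workload are assumptions made for this example.

```javascript
const { execFile } = require('child_process');

// CPU-bound work, kept as a string so the example needs no second file.
// A real workload would be much heavier than this loop.
const childSource = `
  let sum = 0;
  for (let i = 0; i < 1e7; i++) sum += i;
  process.stdout.write(String(sum));
`;

// Run the loop in a separate node process; the parent's event loop stays free.
function sumInChild(callback) {
  execFile(process.execPath, ['-e', childSource], (err, stdout) => {
    if (err) return callback(err);
    callback(null, Number(stdout));
  });
}

sumInChild((err, sum) => {
  if (err) throw err;
  console.log('child computed', sum);
});
console.log('parent keeps handling events while the child works');
```

With fork() the idea is the same, except that the child is a module of your own and the two processes can exchange messages through process.send and the 'message' event.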
I would suggest reading or watching these articles/presentations in order to get a bigger picture of this topic:
Understanding the node.js event loop
Understanding event loops and writing great code for Node.js
YUI Theater — Tom Hughes-Croucher: “How to Stop Writing Spaghetti Code” (45 min.)
