Explaining Node Callbacks and Single Threading - javascript

Is node's javascript environment single threaded, or does everything happen at the same time? Or (more likely) do neither of those statements explain what's going on with node.
I'm new to node and trying to understand how it processes callbacks. My googling on the subject hasn't proven fruitful, and it seems like there's multiple audiences using terms like "threads, locking, and single threaded" with difference contexts for each audience, and I don't have enough experience with node to correctly parse what I'm reading.
From what I've read, node's javascript execution environment is, like a browser's, single threaded. That is, despite everything being designed around asynchronous callbacks, everything happens in a deterministic order, and there's never two threads modifying the same variable or running statements at the same time. I've also read this means a node programmer-user doesn't have to worry about locking semantics.
If I'm in browser-land and use one of the popular javascript libraries to setup a non-dom-event-handler callback, something like
console.log("Here");
$.each([1,2,3],function(){
console.log("-- inside the callback --");
});
console.log("There");
My output is consistently
Here
-- inside the callback --
-- inside the callback --
-- inside the callback --
There
However, if I do something like this with a callback in node js (running it as a shell script from the command line)
var function example()
{
var fs = require('fs');
console.log("Here");
fs.readdir('/path/to/folder', function(err_read, files){
console.log('-- inside the callback --');
});
console.log("There");
for(var i=0;i<10000;i++)
{
console.log('.');
}
}
example();
console.log("Reached Top");
I consistently (seemingly — see "not much experience" above) get results like this
Here
There
.
. (repeat 10,000 times)
Reached Top
-- inside the callback --
That is, node finishes executing the function example before calling the callback.
Is this deterministic behavior in node? Or are there times where the callback will be called before the example function finishes? Or will it depend on the implementation in the library using the callback?
I understand the idea behind node is to write event based code — but I'm trying to understand what node is really doing, and what behavior can be relied on and what can't.

Is node's javascript environment single threaded, or does everything happen at the same time?
Node has single threaded program execution model, meaning that only one instruction will be executed at any time within one node process. The execution will continue until the program yields the control. This can happen at the end of the program code, or when callback reaches it's end.
In the first case:
console.log("Here");
$.each([1,2,3],function(){
console.log("-- inside the callback --");
});
console.log("There");
they key is the fact that $.each uses callbacks synchronously, so effectively it calls it's callbacks in deterministic order.
Now in the second case, fs.readdir uses callbacks asynchronously - it puts the callback waiting for the event to be triggered (that is, when reading the directory finishes). That can happen at any time. The calling function, however, doesn't yield the control so example will always finish before any of the callbacks is called.
In one sentence:
All the code of the function/global scope is executed before any callbacks defined in that function/global scope.

In this case, you are calling a function that is doing IO and is asynchronous. That is why you are seeing output the way you are.
Because the rest of your code is executing inline - on the single execution thread, it is completed before the IO events are given a chance to make their way onto the event loop.
I would say expect this behavior, but don't depend on it with absolute certainty as it does depend on the implementation of the library using the callback. An (bad, evil) implementor could be making an synchronous request to see if there is any work to be done and, if not, callback immediately. (Hopefully you'll never see this, but…).

I would like to add a bit to #dc5's answer. On it's website they describe node as
Node.js uses an event-driven, non-blocking I/O model
The event driven part is very significant. It usually clears my doubts when I am having trouble understanding the execution model of the node programs.
So taking your example what is happening is :
Here is printed to the console.
An async call is made to the files.
Node attaches a listener to the async call.
Continues down the execution line and There is printed.
Now it could be that the async call is completed before the current piece of code that is being executed but node js completes the task at hand and then returns to service the async call.
So instead of waiting for anything it keeps on executing in line. Which is why node js execution loop is never idle.

Related

NodeJS setInterval() running parallel to loop using async [duplicate]

I am thinking about it and this is what I came up with:
Let's see this code below:
console.clear();
console.log("a");
setTimeout(function(){console.log("b");},1000);
console.log("c");
setTimeout(function(){console.log("d");},0);
A request comes in, and JS engine starts executing the code above step by step. The first two calls are sync calls. But when it comes to setTimeout method, it becomes an async execution. But JS immediately returns from it and continue executing, which is called Non-Blocking or Async. And it continues working on other etc.
The results of this execution is the following:
a c d b
So basically the second setTimeout got finished first and its callback function gets executed earlier than the first one and that makes sense.
We are talking about single-threaded application here. JS Engine keeps executing this and unless it finishes the first request, it won't go to second one. But the good thing is that it won't wait for blocking operations like setTimeout to resolve so it will be faster because it accepts the new incoming requests.
But my questions arise around the following items:
#1: If we are talking about a single-threaded application, then what mechanism processes setTimeouts while the JS engine accepts more requests and executes them? How does the single thread continue working on other requests? What works on setTimeout while other requests keep coming in and get executed.
#2: If these setTimeout functions get executed behind the scenes while more requests are coming in and being executed, what carries out the async executions behind the scenes? What is this thing that we talk about called the EventLoop?
#3: But shouldn't the whole method be put in the EventLoop so that the whole thing gets executed and the callback method gets called? This is what I understand when talking about callback functions:
function downloadFile(filePath, callback)
{
blah.downloadFile(filePath);
callback();
}
But in this case, how does the JS Engine know if it is an async function so that it can put the callback in the EventLoop? Perhaps something like the async keyword in C# or some sort of an attribute which indicates the method JS Engine will take on is an async method and should be treated accordingly.
#4: But an article says quite contrary to what I was guessing on how things might be working:
The Event Loop is a queue of callback functions. When an async
function executes, the callback function is pushed into the queue. The
JavaScript engine doesn't start processing the event loop until the
code after an async function has executed.
#5: And there is this image here which might be helpful but the first explanation in the image is saying exactly the same thing mentioned in question number 4:
So my question here is to get some clarifications about the items listed above?
1: If we are talking about a single-threaded application, then what processes setTimeouts while JS engine accepts more requests and executes them? Isn't that single thread will continue working on other requests? Then who is going to keep working on setTimeout while other requests keep coming and get executed.
There's only 1 thread in the node process that will actually execute your program's JavaScript. However, within node itself, there are actually several threads handling operation of the event loop mechanism, and this includes a pool of IO threads and a handful of others. The key is the number of these threads does not correspond to the number of concurrent connections being handled like they would in a thread-per-connection concurrency model.
Now about "executing setTimeouts", when you invoke setTimeout, all node does is basically update a data structure of functions to be executed at a time in the future. It basically has a bunch of queues of stuff that needs doing and every "tick" of the event loop it selects one, removes it from the queue, and runs it.
A key thing to understand is that node relies on the OS for most of the heavy lifting. So incoming network requests are actually tracked by the OS itself and when node is ready to handle one it just uses a system call to ask the OS for a network request with data ready to be processed. So much of the IO "work" node does is either "Hey OS, got a network connection with data ready to read?" or "Hey OS, any of my outstanding filesystem calls have data ready?". Based upon its internal algorithm and event loop engine design, node will select one "tick" of JavaScript to execute, run it, then repeat the process all over again. That's what is meant by the event loop. Node is basically at all times determining "what's the next little bit of JavaScript I should run?", then running it. This factors in which IO the OS has completed, and things that have been queued up in JavaScript via calls to setTimeout or process.nextTick.
2: If these setTimeout will get executed behind the scenes while more requests are coming and in and being executed, the thing carry out the async executions behind the scenes is that the one we are talking about EventLoop?
No JavaScript gets executed behind the scenes. All the JavaScript in your program runs front and center, one at a time. What happens behind the scenes is the OS handles IO and node waits for that to be ready and node manages its queue of javascript waiting to execute.
3: How can JS Engine know if it is an async function so that it can put it in the EventLoop?
There is a fixed set of functions in node core that are async because they make system calls and node knows which these are because they have to call the OS or C++. Basically all network and filesystem IO as well as child process interactions will be asynchronous and the ONLY way JavaScript can get node to run something asynchronously is by invoking one of the async functions provided by the node core library. Even if you are using an npm package that defines it's own API, in order to yield the event loop, eventually that npm package's code will call one of node core's async functions and that's when node knows the tick is complete and it can start the event loop algorithm again.
4 The Event Loop is a queue of callback functions. When an async function executes, the callback function is pushed into the queue. The JavaScript engine doesn't start processing the event loop until the code after an async function has executed.
Yes, this is true, but it's misleading. The key thing is the normal pattern is:
//Let's say this code is running in tick 1
fs.readFile("/home/barney/colors.txt", function (error, data) {
//The code inside this callback function will absolutely NOT run in tick 1
//It will run in some tick >= 2
});
//This code will absolutely also run in tick 1
//HOWEVER, typically there's not much else to do here,
//so at some point soon after queueing up some async IO, this tick
//will have nothing useful to do so it will just end because the IO result
//is necessary before anything useful can be done
So yes, you could totally block the event loop by just counting Fibonacci numbers synchronously all in memory all in the same tick, and yes that would totally freeze up your program. It's cooperative concurrency. Every tick of JavaScript must yield the event loop within some reasonable amount of time or the overall architecture fails.
Don't think the host process to be single-threaded, they are not. What is single-threaded is the portion of the host process that execute your javascript code.
Except for background workers, but these complicate the scenario...
So, all your js code run in the same thread, and there's no possibility that you get two different portions of your js code to run concurrently (so, you get not concurrency nigthmare to manage).
The js code that is executing is the last code that the host process picked up from the event loop.
In your code you can basically do two things: run synchronous instructions, and schedule functions to be executed in future, when some events happens.
Here is my mental representation (beware: it's just that, I don't know the browser implementation details!) of your example code:
console.clear(); //exec sync
console.log("a"); //exec sync
setTimeout( //schedule inAWhile to be executed at now +1 s
function inAWhile(){
console.log("b");
},1000);
console.log("c"); //exec sync
setTimeout(
function justNow(){ //schedule justNow to be executed just now
console.log("d");
},0);
While your code is running, another thread in the host process keep track of all system events that are occurring (clicks on UI, files read, networks packets received etc.)
When your code completes, it is removed from the event loop, and the host process return to checking it, to see if there are more code to run. The event loop contains two event handler more: one to be executed now (the justNow function), and another within a second (the inAWhile function).
The host process now try to match all events happened to see if there handlers registered for them.
It found that the event that justNow is waiting for has happened, so it start to run its code. When justNow function exit, it check the event loop another time, searhcing for handlers on events. Supposing that 1 s has passed, it run the inAWhile function, and so on....
The Event Loop has one simple job - to monitor the Call Stack, the Callback Queue and Micro task queue. If the Call Stack is empty, the Event Loop will take the first event from the micro task queue then from the callback queue and will push it to the Call Stack, which effectively runs it. Such an iteration is called a tick in the Event Loop.
As most developers know, that Javascript is single threaded, means two statements in javascript can not be executed in parallel which is correct. Execution happens line by line, which means each javascript statements are synchronous and blocking. But there is a way to run your code asynchronously, if you use setTimeout() function, a Web API given by the browser, which makes sure that your code executes after specified time (in millisecond).
Example:
console.log("Start");
setTimeout(function cbT(){
console.log("Set time out");
},5000);
fetch("http://developerstips.com/").then(function cbF(){
console.log("Call back from developerstips");
});
// Millions of line code
// for example it will take 10000 millisecond to execute
console.log("End");
setTimeout takes a callback function as first parameter, and time in millisecond as second parameter.
After the execution of above statement in browser console it will print
Start
End
Call back from developerstips
Set time out
Note: Your asynchronous code runs after all the synchronous code is done executing.
Understand How the code execution line by line
JS engine execute the 1st line and will print "Start" in console
In the 2nd line it sees the setTimeout function named cbT, and JS engine pushes the cbT function to callBack queue.
After this the pointer will directly jump to line no.7 and there it will see promise and JS engine push the cbF function to microtask queue.
Then it will execute Millions of line code and end it will print "End"
After the main thread end of execution the event loop will first check the micro task queue and then call back queue. In our case it takes cbF function from the micro task queue and pushes it into the call stack then it will pick cbT funcion from the call back queue and push into the call stack.
JavaScript is high-level, single-threaded language, interpreted language. This means that it needs an interpreter which converts the JS code to a machine code. interpreter means engine. V8 engines for chrome and webkit for safari. Every engine contains memory, call stack, event loop, timer, web API, events, etc.
Event loop: microtasks and macrotasks
The event loop concept is very simple. There’s an endless loop, where the JavaScript engine waits for tasks, executes them and then sleeps, waiting for more tasks
Tasks are set – the engine handles them – then waits for more tasks (while sleeping and consuming close to zero CPU). It may happen that a task comes while the engine is busy, then it’s enqueued. The tasks form a queue, so-called “macrotask queue”
Microtasks come solely from our code. They are usually created by promises: an execution of .then/catch/finally handler becomes a microtask. Microtasks are used “under the cover” of await as well, as it’s another form of promise handling. Immediately after every macrotask, the engine executes all tasks from microtask queue, prior to running any other macrotasks or rendering or anything else.

Mechanics of Javascript Asynchronous execution

Can someone please explain the mechanics behind the asynchronous behavior of javascript, specifically when it comes to a .subscribe(), or provide a link that would help me understand what's really going on under the hood?
Everything appears to be single-threaded, then I run into situations that make it either multi-threaded, or seem multi-threaded, because I still can't shake this feeling I'm being fooled.
Everything seems to run perfectly synchronously, one instruction at a time, until I call .subscribe(). Then, suddenly, something happens - it returns immediately, and somehow the next outside function runs even though the subscription code has not yet. This feels asynchronous to me, how does the browser keep things straight - if not threads, can someone please explain the mechanics in play that is allowing this to occur?
I also notice something happens with debug stepping, I can't step in to the callback, I can only put a breakpoint within it. In what order should I really expect things to occur?
Your Javascript itself runs single threaded unless you're talking about WebWorkers (in the browser) or WorkerThreads (in nodejs).
But, many functions you call or operations you initiate in Javascript can run separate from that single threaded nature of your Javascript. For example, networking is all non-blocking and asynchronous. That means that when you make a call to fetch() in the browser of http.get() in nodejs, you are initiating an http request and then the mechanics of making that request and receiving the response are done independent of your Javascript. Here are some example steps:
You make a call to fetch() in the browser to make an http request to some other host.
The code behind fetch, sends out the http request and then immediately returns.
Your Javascript continues to execute (whatever you had on the lines of code immediately after the fetch() call will execute.
Sometime later, the response from the fetch() call arrives back on the network interface. Some code (internal to the Javascript environment) and independent from your single thread of Javascript sees that there's incoming data to read from the http response. When that code has gathered the http response, it inserts an event into the Javascript event queue.
When the single thread of Javascript has finished executing whatever event it was executing, it checks the event queue to see if there's anything else to do. At some point, it will find the event that signified the completion of the previous fetch() call and there will be a callback function associated with that event. It will call that callback which will resolve the promise associated with the fetch() call which will cause your own Javascript in the .then() handler for that promise to run and your Javascript will be presented the http response you were waiting for.
Because these non-blocking, asynchronous operations (such as networking, timers, disk I/O, etc...) all work in this same manner and can proceed independent of your Javascript execution, they can run in parallel with your Javascript execution and many of them can be in flight at the same time while your Javascript is doing other things. This may sometimes give the appearance of multiple threads and there likely are some native code threads participating in some of this, though there is still only one thread of your Javascript (as long as we're not talking about WebWorkers or WorkerThreads).
The completion of these asynchronous operations is then synchronized with the single thread of Javascript via the event queue. When they finish, they place an event in the event queue and when the single thread of Javascript finishes what it was last doing, it grabs the next event from the event queue and processes it (calling a Javascript callback).
I also notice something happens with debug stepping, I can't step in to the callback, I can only put a breakpoint within it. In what order should I really expect things to occur?
This is because the callback isn't called synchronously. When you step over the function that you passed a callback to, it executes that function and it returns. Your callback is not yet called. So, the debugger did exactly what you asked it to do. It executed that function and that function returned. The callback has not yet executed. Your are correct that if you want to see the asynchronous callback execute, you need to place a breakpoint in it so when it does get called some time in the future, the debugger will then stop there and let you step further. This is a complication of debugging asynchronous things, but you get used to it after awhile.
Everything appears to be single-threaded, then I run into situations that make it either multi-threaded, or seem multi-threaded, because I still can't shake this feeling I'm being fooled.
This is because there can be many asynchronous operations in flight at the same time as they aren't running your Javascript while they are doing their thing. So, there's still only one thread of Javascript, even though there are multiple asynchronous operations in flight. Here's a code snippet:
console.log("starting...");
fetch(host1).then(result1 => { console.log(result1)});
fetch(host2).then(result2 => { console.log(result2)});
fetch(host3).then(result3 => { console.log(result3)});
fetch(host4).then(result4 => { console.log(result4)});
console.log("everything running...");
If you run this, you will see something like this in the console:
starting...
everything running...
result1
result2
result3
result4
Actually, the 4 results may be in any order relative to one another, but they will be after the first two lines of debug output. In this case, you started four separate fetch() operations and they are running independent of your Javascript. In fact, your Javascript is free to do other things while they run. When they finish and when your Javascript isn't doing anything else, your four .then() handlers will each get called at the appropriate time.
When the networking behind each fetch() operation finishes, it will insert an event in the Javascript event queue and if your Javascript isn't doing anything at the time, the insertion of that event will wake it up and cause it to process that completion event. If it was doing something at the time, then when it finishes that particular piece of Javascript and control returns back to the system, it will see there's an event waiting in the event queue and will process it, calling your Javascript callback associated with that event.
Search for articles or videos on the “JavaScript Event Loop”. You’ll find plenty of them. Probably go through a few of them and it will start to make sense (particularly the single-threaded aspect of it)
I found this one with a quick search, and it does a good, very high-level walkthrough.
The JavaScript Event Loop explained
The main impact of JavaScript single threading is that your code is never pre-emptively interrupted to execute code on another thread.
It may seem that some code is being skipped, but close attention to the syntax will show that what is being “skipped” is just a callback function or closure to be called at a time in the future (when it will be placed into the event queue and processed single-threadedly, itself).
Careful, it can be tricky to follow nested callbacks.
Probably also look into Promises and async /await (‘just syntactic convenience on the event loop, but can really help code readability)

How browser api handles multiple asynchronous functions like setTimeOuts? [duplicate]

I am thinking about it and this is what I came up with:
Let's see this code below:
console.clear();
console.log("a");
setTimeout(function(){console.log("b");},1000);
console.log("c");
setTimeout(function(){console.log("d");},0);
A request comes in, and JS engine starts executing the code above step by step. The first two calls are sync calls. But when it comes to setTimeout method, it becomes an async execution. But JS immediately returns from it and continue executing, which is called Non-Blocking or Async. And it continues working on other etc.
The results of this execution is the following:
a c d b
So basically the second setTimeout got finished first and its callback function gets executed earlier than the first one and that makes sense.
We are talking about single-threaded application here. JS Engine keeps executing this and unless it finishes the first request, it won't go to second one. But the good thing is that it won't wait for blocking operations like setTimeout to resolve so it will be faster because it accepts the new incoming requests.
But my questions arise around the following items:
#1: If we are talking about a single-threaded application, then what mechanism processes setTimeouts while the JS engine accepts more requests and executes them? How does the single thread continue working on other requests? What works on setTimeout while other requests keep coming in and get executed.
#2: If these setTimeout functions get executed behind the scenes while more requests are coming in and being executed, what carries out the async executions behind the scenes? What is this thing that we talk about called the EventLoop?
#3: But shouldn't the whole method be put in the EventLoop so that the whole thing gets executed and the callback method gets called? This is what I understand when talking about callback functions:
function downloadFile(filePath, callback)
{
blah.downloadFile(filePath);
callback();
}
But in this case, how does the JS Engine know if it is an async function so that it can put the callback in the EventLoop? Perhaps something like the async keyword in C# or some sort of an attribute which indicates the method JS Engine will take on is an async method and should be treated accordingly.
#4: But an article says quite contrary to what I was guessing on how things might be working:
The Event Loop is a queue of callback functions. When an async
function executes, the callback function is pushed into the queue. The
JavaScript engine doesn't start processing the event loop until the
code after an async function has executed.
#5: And there is this image here which might be helpful but the first explanation in the image is saying exactly the same thing mentioned in question number 4:
So my question here is to get some clarifications about the items listed above?
1: If we are talking about a single-threaded application, then what processes setTimeouts while JS engine accepts more requests and executes them? Isn't that single thread will continue working on other requests? Then who is going to keep working on setTimeout while other requests keep coming and get executed.
There's only 1 thread in the node process that will actually execute your program's JavaScript. However, within node itself, there are actually several threads handling operation of the event loop mechanism, and this includes a pool of IO threads and a handful of others. The key is the number of these threads does not correspond to the number of concurrent connections being handled like they would in a thread-per-connection concurrency model.
Now about "executing setTimeouts", when you invoke setTimeout, all node does is basically update a data structure of functions to be executed at a time in the future. It basically has a bunch of queues of stuff that needs doing and every "tick" of the event loop it selects one, removes it from the queue, and runs it.
A key thing to understand is that node relies on the OS for most of the heavy lifting. So incoming network requests are actually tracked by the OS itself and when node is ready to handle one it just uses a system call to ask the OS for a network request with data ready to be processed. So much of the IO "work" node does is either "Hey OS, got a network connection with data ready to read?" or "Hey OS, any of my outstanding filesystem calls have data ready?". Based upon its internal algorithm and event loop engine design, node will select one "tick" of JavaScript to execute, run it, then repeat the process all over again. That's what is meant by the event loop. Node is basically at all times determining "what's the next little bit of JavaScript I should run?", then running it. This factors in which IO the OS has completed, and things that have been queued up in JavaScript via calls to setTimeout or process.nextTick.
2: If these setTimeout will get executed behind the scenes while more requests are coming and in and being executed, the thing carry out the async executions behind the scenes is that the one we are talking about EventLoop?
No JavaScript gets executed behind the scenes. All the JavaScript in your program runs front and center, one at a time. What happens behind the scenes is the OS handles IO and node waits for that to be ready and node manages its queue of javascript waiting to execute.
3: How can JS Engine know if it is an async function so that it can put it in the EventLoop?
There is a fixed set of functions in node core that are async because they make system calls and node knows which these are because they have to call the OS or C++. Basically all network and filesystem IO as well as child process interactions will be asynchronous and the ONLY way JavaScript can get node to run something asynchronously is by invoking one of the async functions provided by the node core library. Even if you are using an npm package that defines it's own API, in order to yield the event loop, eventually that npm package's code will call one of node core's async functions and that's when node knows the tick is complete and it can start the event loop algorithm again.
4 The Event Loop is a queue of callback functions. When an async function executes, the callback function is pushed into the queue. The JavaScript engine doesn't start processing the event loop until the code after an async function has executed.
Yes, this is true, but it's misleading. The key thing is the normal pattern is:
//Let's say this code is running in tick 1
fs.readFile("/home/barney/colors.txt", function (error, data) {
//The code inside this callback function will absolutely NOT run in tick 1
//It will run in some tick >= 2
});
//This code will absolutely also run in tick 1
//HOWEVER, typically there's not much else to do here,
//so at some point soon after queueing up some async IO, this tick
//will have nothing useful to do so it will just end because the IO result
//is necessary before anything useful can be done
So yes, you could totally block the event loop by just counting Fibonacci numbers synchronously all in memory all in the same tick, and yes that would totally freeze up your program. It's cooperative concurrency. Every tick of JavaScript must yield the event loop within some reasonable amount of time or the overall architecture fails.
Don't think the host process to be single-threaded, they are not. What is single-threaded is the portion of the host process that execute your javascript code.
Except for background workers, but these complicate the scenario...
So, all your js code run in the same thread, and there's no possibility that you get two different portions of your js code to run concurrently (so, you get not concurrency nigthmare to manage).
The js code that is executing is the last code that the host process picked up from the event loop.
In your code you can basically do two things: run synchronous instructions, and schedule functions to be executed in future, when some events happens.
Here is my mental representation (beware: it's just that, I don't know the browser implementation details!) of your example code:
console.clear(); //exec sync
console.log("a"); //exec sync
setTimeout( //schedule inAWhile to be executed at now +1 s
function inAWhile(){
console.log("b");
},1000);
console.log("c"); //exec sync
setTimeout(
function justNow(){ //schedule justNow to be executed just now
console.log("d");
},0);
While your code is running, another thread in the host process keep track of all system events that are occurring (clicks on UI, files read, networks packets received etc.)
When your code completes, it is removed from the event loop, and the host process return to checking it, to see if there are more code to run. The event loop contains two event handler more: one to be executed now (the justNow function), and another within a second (the inAWhile function).
The host process now try to match all events happened to see if there handlers registered for them.
It found that the event that justNow is waiting for has happened, so it start to run its code. When justNow function exit, it check the event loop another time, searhcing for handlers on events. Supposing that 1 s has passed, it run the inAWhile function, and so on....
The Event Loop has one simple job - to monitor the Call Stack, the Callback Queue and Micro task queue. If the Call Stack is empty, the Event Loop will take the first event from the micro task queue then from the callback queue and will push it to the Call Stack, which effectively runs it. Such an iteration is called a tick in the Event Loop.
As most developers know, that Javascript is single threaded, means two statements in javascript can not be executed in parallel which is correct. Execution happens line by line, which means each javascript statements are synchronous and blocking. But there is a way to run your code asynchronously, if you use setTimeout() function, a Web API given by the browser, which makes sure that your code executes after specified time (in millisecond).
Example:
console.log("Start");
setTimeout(function cbT(){
console.log("Set time out");
},5000);
fetch("http://developerstips.com/").then(function cbF(){
console.log("Call back from developerstips");
});
// Millions of line code
// for example it will take 10000 millisecond to execute
console.log("End");
setTimeout takes a callback function as first parameter, and time in millisecond as second parameter.
After the execution of above statement in browser console it will print
Start
End
Call back from developerstips
Set time out
Note: Your asynchronous code runs after all the synchronous code is done executing.
Understand How the code execution line by line
JS engine execute the 1st line and will print "Start" in console
In the 2nd line it sees the setTimeout function named cbT, and JS engine pushes the cbT function to callBack queue.
After this the pointer will directly jump to line no.7 and there it will see promise and JS engine push the cbF function to microtask queue.
Then it will execute Millions of line code and end it will print "End"
After the main thread end of execution the event loop will first check the micro task queue and then call back queue. In our case it takes cbF function from the micro task queue and pushes it into the call stack then it will pick cbT funcion from the call back queue and push into the call stack.
JavaScript is high-level, single-threaded language, interpreted language. This means that it needs an interpreter which converts the JS code to a machine code. interpreter means engine. V8 engines for chrome and webkit for safari. Every engine contains memory, call stack, event loop, timer, web API, events, etc.
Event loop: microtasks and macrotasks
The event loop concept is very simple. There’s an endless loop, where the JavaScript engine waits for tasks, executes them and then sleeps, waiting for more tasks
Tasks are set – the engine handles them – then waits for more tasks (while sleeping and consuming close to zero CPU). It may happen that a task comes while the engine is busy, then it’s enqueued. The tasks form a queue, so-called “macrotask queue”
Microtasks come solely from our code. They are usually created by promises: an execution of .then/catch/finally handler becomes a microtask. Microtasks are used “under the cover” of await as well, as it’s another form of promise handling. Immediately after every macrotask, the engine executes all tasks from microtask queue, prior to running any other macrotasks or rendering or anything else.

I know that callback function runs asynchronously, but why?

Which part of syntax provides the information that this function should run in other thread and be non-blocking?
Let's consider simple asynchronous I/O in node.js
var fs = require('fs');
var path = process.argv[2];
fs.readFile(path, 'utf8', function(err,data) {
var lines = data.split('\n');
console.log(lines.length-1);
});
What exactly makes the trick that it happens in background? Could anyone explain it precisely or paste a link to some good resource? Everywhere I looked there is plenty of info about what callback is, but nobody explains why it actually works like that.
This is not the specific question about node.js, it's about general concept of callback in each programming language.
EDIT:
Probably the example I provided is not best here. So let's do not consider this node.js code snippet. I'm asking generally - what makes the trick that program keeps executing when encounter callback function. What is in syntax
that makes callback concept a non-blocking one?
Thanks in advance!
There is nothing in the syntax that tells you your callback is executed asynchronously. Callbacks can be asynchronous, such as:
setTimeout(function(){
console.log("this is async");
}, 100);
or it can be synchronous, such as:
an_array.forEach(function(x){
console.log("this is sync");
});
So, how can you know if a function will invoke the callback synchronously or asynchronously? The only reliable way is to read the documentation.
You can also write a test to find out if documentation is not available:
var t = "this is async";
some_function(function(){
t = "this is sync";
});
console.log(t);
How asynchronous code work
Javascript, per se, doesn't have any feature to make functions asynchronous. If you want to write an asynchronous function you have two options:
Use another asynchronous function such as setTimeout or web workers to execute your logic.
Write it in C.
As for how the C coded functions (such as setTimeout) implement asynchronous execution? It all has to do with the event loop (or mostly).
The Event Loop
Inside the web browser there is this piece of code that is used for networking. Originally, the networking code could only download one thing: the HTML page itself. When Mosaic invented the <img> tag the networking code evolved to download multiple resources. Then Netscape implemented progressive rendering of images, they had to make the networking code asynchronous so that they can draw the page before all images are loaded and update each image progressively and individually. This is the origin of the event loop.
In the heart of the browser there is an event loop that evolved from asynchronous networking code. So it's not surprising that it uses an I/O primitive as its core: select() (or something similar such as poll, epoll etc. depending on OS).
The select() function in C allows you to wait for multiple I/O operations in a single thread without needing to spawn additional threads. select() looks something like:
select (max, readlist, writelist, errlist, timeout)
To have it wait for an I/O (from a socket or disk) you'd add the file descriptor to the readlist and it will return when there is data available on any of your I/O channels. Once it returns you can continue processing the data.
The javascript interpreter saves your callback and then calls the select() function. When select() returns the interpreter figures out which callback is associated with which I/O channel and then calls it.
Conveniently, select() also allows you to specify a timeout value. By carefully managing the timeout passed to select() you can cause callbacks to be called at some time in the future. This is how setTimeout and setInterval are implemented. The interpreter keeps a list of all timeouts and calculates what it needs to pass as timeout to select(). Then when select() returns in addition to finding out if there are any callbacks that needs to be called due to an I/O operation the interpreter also checks for any expired timeouts that needs to be called.
So select() alone covers almost all the functionality necessary to implement asynchronous functions. But modern browsers also have web workers. In the case of web workers the browser spawns threads to execute javascript code asynchronously. To communicate back to the main thread the workers must still interact with the event loop (the select() function).
Node.js also spawns threads when dealing with file/disk I/O. When the I/O operation completes it communicates back with the main event loop to cause the appropriate callbacks to execute.
Hopefully this answers your question. I've always wanted to write this answer but was to busy to do so previously. If you want to know more about non-blocking I/O programming in C I suggest you take a read this: http://www.gnu.org/software/libc/manual/html_node/Waiting-for-I_002fO.html
For more information see also:
Is nodejs representing Reactor or Proactor design pattern?
Performance of NodeJS with large amount of callbacks
First of all, if something is not Async, it means it's blocking. So the javascript runner stops on that line until that function is over (that's what a readFileSync would do).
As we all know, fs is a IO library, so that kind of things take time (tell the hardware to read some files is not something done right away), so it makes a lot of sense that anything that does not require only the CPU, it's async, because it takes time, and does not need to freeze the rest of the code for waiting another piece of hardware (while the CPU is idle).
I hope this solves your doubts.
A callback is not necessarily asynchronous. Execution depends entirely on how fs.readFile decides to treat the function parameter.
In JavaScript, you can execute a function asynchronously using for example setTimeout.
Discussion and resources:
How does node.js implement non-blocking I/O?
Concurrency model and Event Loop
Wikipedia:
There are two types of callbacks, differing in how they control data flow at runtime: blocking callbacks (also known as synchronous callbacks or just callbacks) and deferred callbacks (also known as asynchronous callbacks).

Node JS Code Execution

I'm getting slightly annoyed now after reading many different articles on how to write proper code in node js.
I would just like some clarification on whether I'm correct or not with each of these statements:
Code is executed synchronously
A for loop or while loop etc. will be executed asynchronously
By doing the code below isn't proper asynchronous:
L
function doSomething(callback) {
// doing some code
// much much code
callback();
}
The reason that people are saying that this won't work properly is that code is executed asynchronously, so the callback won't be fired off at the end of the code it will all be executed at once.
So for example if you were filling some object by doing some stuff and you wanted to post that full object back through the callback it doesn't work because it will all be executed at the same time.
No, for and while loops are synchronous in Node.js.
Whether your doSomething function example will work or not all depends on whether you're making any calls to asynchronous functions before calling callback(). If you're making asynchronous calls, then you'd need to postpone calling callback() until those asynchronous calls have completed. If you're only making synchronous calls, then you don't need to use a callback and your function can synchronously return the result.
Code in node.js isn't standardly asynchronous. Calls that you place one after another will still be executed one after another. A for or while loop will still block the main code in this way.
However, you can pass around functions which will be executed later, through process.nextTick(callback). This method will add a function to the event queue, which node.js takes care of internally. This will only be executed after any other blocking code has been extecuted. An example:
function doWork(callback) {
//Do some work here, for instance a for loop that might take a while
callback(workReslt);
}
function workDone(workResult) {
//Process the result here
}
function startWork() {
process.nextTick(doWork(workDone));
}
This will execute doWork() when the server passes through the event loop again. This way, the startWork() function will not block the main code.
However, most modules in node.js already implement this. Database and file access is usually non-blocking, it will simply fire a callback when it is done with it's work.
If you really have to perform heavy async calculations (or perhaps some IO that doesn't already have a module for it), you can look into the async module, which has a nice tutorial here: http://justinklemm.com/node-js-async-tutorial/
As for coding conventions, think of it this way. If you know the action is going to take time, you should make it asynchronous (either through the method I mentioned first or the async module). Actions like these are usually related to I/O, which usually also means there is already an asynchronous module for it. However, most cases I have come across up to now, have never required this.
Sources:
http://howtonode.org/understanding-process-next-tick

Categories

Resources