Why Node.js setImmediate executes after I/O callbacks? - javascript

As a new member, I'm unable to comment on existing topics, which is why I had to create a new one. This way I can also state the problem clearly, so hopefully you can help me.
I have read quite a lot about the Node.js Event Loop, and I have shaped my understanding of it based on the following materials:
Node.js Event Loop
What the heck is the event loop anyway?
Why setImmediate() execute before fs.readFile() in Nodejs Event Loop's works?
(Please feel free to suggest other materials which are informative and accurate)
The third link in particular has given me a better understanding. But keeping that in mind, I'm unable to understand the Event Loop behavior for the following code:
var fs = require('fs');
var pos = 0;
fs.stat(__filename, function() {
console.log(++pos + " FIRST STAT");
});
fs.stat(__filename, function() {
console.log(++pos + " LAST STAT");
});
setImmediate(function() {
console.log(++pos + " IMMEDIATE")
})
console.log(++pos + "LOGGER");
Surprisingly, for me the output is as follows:
LOGGER
FIRST STAT
LAST STAT
IMMEDIATE
screenshot of my terminal, showing output as well as node version
screenshot of output from online code compiler rextester.com
Keeping the Event Loop diagram in mind, I guess the flow should be as follows:
1. The interpreter first starts the two stat operations.
2. The interpreter enqueues the setImmediate callback (event) in the setImmediate queue.
3. The call stack logs the logger.
4. All event queues before the I/O poll phase are empty, so the Event Loop (EL) moves on.
5. In the I/O polling phase, the EL collects the events and enqueues both fs.stat callbacks in the "run completed I/O handlers" phase.
6. The EL reaches the check phase and runs the setImmediate callback.
7. This round of the EL ends, and the second round starts.
8. In "run completed I/O handlers", the EL runs both callbacks (their order is non-deterministic).
Question 1: Which part of my analysis/prediction is wrong?
Question 2: At which point, does Event Loop start working? Does it start from the beginning of the app (i.e. stage 1)? or does it start once the whole code is read by interpreter, all sync tasks are done within Call Stack, and Call Stack needs more task, i.e. between stage 3-4?
Thanks in advance,

setImmediate = execute without waiting for any I/O
The documentation at https://nodejs.org/docs/v8.9.3/api/timers.html#timers_setimmediate_callback_args says:
Schedules the "immediate" execution of the callback after I/O events' callbacks. Returns an Immediate for use with clearImmed
Steps:
1. The callback for the first stat is queued in the I/O queue.
2. The callback for the last stat is queued in the I/O queue.
3. The callback for the immediate is queued in the immediates queue.
4. LOGGER is printed.
5. If the I/O operations (from 1 and 2) have finished, their callbacks are marked as ready to execute.
6. The ready callbacks are executed one by one (first timers, then I/O, finally immediates). In your case:
FIRST STAT
LAST STAT
IMMEDIATE
If the I/O has not finished by step 5, IMMEDIATE is executed before FIRST STAT and LAST STAT.
See also: https://jsblog.insiderattack.net/timers-immediates-and-process-nexttick-nodejs-event-loop-part-2-2c53fd511bb3#f3dd
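The only ordering that holds on every run is that the synchronous log comes first; how the two stat callbacks and the immediate interleave depends on whether the I/O has already completed when the loop reaches the poll phase. A minimal sketch that records the order instead of assuming it (the helper name runOrderingDemo is made up for illustration, and fs.stat('.') stands in for fs.stat(__filename) so the sketch does not depend on a particular file):

```javascript
const fs = require('fs');

// Hypothetical helper: resolves with the order in which the four log points fired.
function runOrderingDemo() {
  return new Promise((resolve) => {
    const order = [];
    const record = (label) => {
      order.push(label);
      if (order.length === 4) resolve(order);
    };
    fs.stat('.', () => record('FIRST STAT'));
    fs.stat('.', () => record('LAST STAT'));
    setImmediate(() => record('IMMEDIATE'));
    record('LOGGER'); // synchronous, so it is always recorded first
  });
}

runOrderingDemo().then((order) => console.log(order.join(' -> ')));
```

Running it several times is a quick way to see whether the position of IMMEDIATE relative to the stat callbacks is stable on a given machine and Node version.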

Related

DOMContentLoaded event and task queue

I heard that there are three queues which hold tasks in the Event Loop Processing Model.
MacroTaskQueue : this queue holds the callback functions of setTimeout, setInterval, etc.
MicroTaskQueue : this queue holds the callback functions of promises, MutationObserver, etc.
AnimationFrameQueue : this queue holds the callback functions of requestAnimationFrame.
So, what I'm wondering is:
Who fires the DOMContentLoaded event?
Where is the callback function of DOMContentLoaded queued? MacroTaskQueue or MicroTaskQueue?
Finally,
var a = 10;
console.log(a);
setTimeout(function b() { console.log('im b'); }, 1000);
in this code,
var a = 10;
console.log(a);
is this code also queued in the MacroTaskQueue or the MicroTaskQueue?
Or is only b queued in the MacroTaskQueue after (min) 1000ms?
I'm in a black hole. Help me please :D
What you call the "MacroTaskQueue" is actually made of several task-queues, where tasks are being queued. (Note that the specs only use multiple task-sources, there could actually be a single task-queue). At the beginning of the event-loop processing, the browser will choose from which task queue it will pick the next "main" task to execute. It's important to understand that these tasks may very well not imply any JavaScript execution at all, JS is only a small part of what a browser does.
The microtask-queue will be visited and emptied several times during a single event-loop iteration. For instance every time that the JS call stack has been emptied (i.e after almost every JS callback execution) and if it wasn't enough there are fixed "Perform a microtask checkpoint" points in the event-loop processing model.
While similar to a queue, the animation frame callbacks are actually stored in an ordered map, not in a queue per se, this allows to "queue" new callbacks from one of these callbacks without it being dequeued immediately after. More importantly, a lot of other callbacks are also executed at the same time, e.g the scroll events, resize events, Web animation steps + events, ResizeObserver callbacks, etc. But this "update the rendering" step happens only once in a while, generally at the monitor refresh rate.
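The microtask checkpoint that follows each callback can be observed directly. A small sketch, assuming a browser or Node.js 11+ (older Node.js versions drained microtasks only after all due timers had run); the function name is hypothetical:

```javascript
// Hypothetical helper: two timer tasks, the first of which queues a microtask.
function microtaskBetweenTasks() {
  return new Promise((resolve) => {
    const order = [];
    setTimeout(() => {
      order.push('task 1');
      // Queued while task 1 runs; drained as soon as its call stack empties.
      queueMicrotask(() => order.push('microtask from task 1'));
    }, 0);
    setTimeout(() => {
      order.push('task 2');
      resolve(order);
    }, 0);
  });
}

// Recorded order: task 1, microtask from task 1, task 2
microtaskBetweenTasks().then((order) => console.log(order));
```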
But, that's not saying much about DOMContentLoaded.
Who fires DOMContentLoaded event ?
This event is fired as part of the Document parsing steps, in the "the end" section. The browser first has to queue a task on the DOM manipulation task-source. This task will then eventually get selected by the browser as part of the first step of the event-loop. Once this task's steps are executed, the event is fired and dispatched on the Document. It is only as part of this "dispatch an event" algorithm that the browser will invoke and inner-invoke until it calls our listener's callback.
Note that this Document parsing step is in itself quite interesting as a task since this is the most obvious place where you will have multiple microtask-checkpoints interleaved inside the "main" task (at each <script> for instance).
Where the callback function of DOMContentLoaded is queued ?
The callback function is not queued; it is conceptually stored in the EventTarget's event listener list. In practice, it's stored in memory; since here the EventTarget is a DOM object (Document), it's probably attached to this DOM object, though this is an implementation detail on which the specs have little to say, as it is transparent to us web-devs.
MacroTaskQueue or MicroTaskQueue?
As I hope you now understand better, neither. Task queues and the microtask-queue only store tasks and microtasks, not callbacks. The callbacks are stored elsewhere, depending on what kind of callbacks they are (e.g timers and events are stored in different "conceptual" places), and some task or microtask's steps will then call them.
is this code also queued in MacroTaskQueue or MicroTaskQueue?
That depends where this script has been parsed from. If it's inline in a classic <script> tag, then that would be the special parsing task we already talked about. If it's from a <script src="url.js">, then it will be part of a task queued from fetch a classic script, but it can also be part of a microtask, e.g if after an await in a module script, or you can even force it to be if you want:
queueMicrotask(() => {
console.log("in microtask");
eval(document.querySelector("[type=myscript]").textContent);
console.log("still in microtask");
});
console.log("in parsing task");
<script type="myscript">
var a = 10;
console.log(a);
setTimeout(function b() { console.log('im b'); }, 1000);
</script>
And it is even theoretically possible per the specs for a microtask to become a "macro-"task, though apparently no browser implements this anymore.
All this to say, while I personally find all this stuff fascinating, as a web-dev you shouldn't block yourself on it.

nodejs: setTimeout and IO(POLL) is showing inconsistency

I have this piece of code:
// * Run this snippet of code multiple times
const fs = require('fs');
setTimeout(() => {
console.log('timer');
});
fs.readFile('', 'utf-8',(err, data) => {
console.log('io');
});
setImmediate(() => {
console.log('check');
});
On running the above code multiple times, I get different outputs.
Result 1
Sometimes I get
timer
io
check
Result 2
and other times I get
io
check
timer
Can anyone please clarify what is going on here? I was expecting Result 1.
That has to do with how the event loop picks up things to do when more than one thing is available. Here you can find a good explanation.
If you really want to set that order, try async/await or calling fs.readFile() inside the setTimeout() callback.
Based on the Node.js event loop documentation: a setTimeout is called, which sets a minimum amount of time to wait until the function is executed. Next you start an asynchronous file read. Then the polling phase of the event loop begins. The polling phase has a queue of callback functions to complete. If the file read is not complete, its callback function (the console.log) will not yet be in the polling queue. The polling phase will instead wait for the timer to meet the minimum threshold time and then loop back to the timer callback execution phase. Since your setTimeout is set to zero ms, the polling phase will usually exit and complete the timer callback (console.log) first and then go back to polling for the file read to finish. This is why your setTimeout sometimes completes first, if the underlying operating system delays the file read, and the file read sometimes completes first, if the operating system delays the setTimeout. setImmediate will always happen immediately after the polling (I/O) phase.
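One ordering that does not depend on timing: when both a zero-delay timer and an immediate are scheduled from inside an I/O callback, the check phase comes right after the poll phase, so the immediate fires first. A minimal sketch of that case (the function name is made up for illustration, and fs.stat on the current directory is just a stand-in for any completed I/O):

```javascript
const fs = require('fs');

// Hypothetical helper: schedules a timer and an immediate from an I/O callback.
function orderInsideIoCallback() {
  return new Promise((resolve) => {
    const order = [];
    fs.stat('.', () => {
      // We are now in the poll phase; the check phase is next.
      setTimeout(() => {
        order.push('timer');
        resolve(order);
      }, 0);
      setImmediate(() => order.push('check'));
    });
  });
}

// Recorded order: check, timer
orderInsideIoCallback().then((order) => console.log(order.join(', ')));
```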

Does the nodejs (libuv) event loop execute all the callbacks in one phase (queue) before moving to the next or run in a round robin fashion?

I am studying about the event loop provided by libuv in Node. I came across the following blog by Deepal Jayasekara and also saw the explanations of Bert Belder and Daniel Khan on youtube.
There is one point that I am not clear with- As per my understanding, the event loop processes all the items of one phase before moving on to another. So if that is the case, I should be able to block the event loop if the setTimeout phase is constantly getting callbacks added to it.
However, when I tried to replicate that- it doesn't happen. The following is the code:
var http = require('http');
http.createServer(function (req, res) {
res.writeHead(200, {'Content-Type': 'text/plain'});
res.write('Hello World!');
console.log("Response sent");
res.end();
}).listen(8081);
setInterval(() => {
console.log("Entering for loop");
// Long running loop that allows more callbacks to get added to the setTimeout phase before this callback's processing completes
for (let i = 0; i < 7777777777; i++);
console.log("Exiting for loop");
}, 0);
The event loop seems to run in a round robin fashion. It first executes the callbacks that were added before i send a request to the server, then processes the request and then continues with the callbacks. It feels like a single queue is running.
From the little that I understood, there isn't a single queue and all the expired timer callbacks should get executed first before moving to the next phase. Hence the above snippet should not be able to return the Hello World response.
What could be the possible explanation for this?
Thanks.
If you look in libuv itself, you find that the operative part of running timers in the event loop is the function uv__run_timers().
void uv__run_timers(uv_loop_t* loop) {
struct heap_node* heap_node;
uv_timer_t* handle;
for (;;) {
heap_node = heap_min(timer_heap(loop));
if (heap_node == NULL)
break;
handle = container_of(heap_node, uv_timer_t, heap_node);
if (handle->timeout > loop->time)
break;
uv_timer_stop(handle);
uv_timer_again(handle);
handle->timer_cb(handle);
}
}
The way it works is the event loop sets a time mark at the current time and then it processes all timers that are due by that time one after another without updating the loop time. So, this will fire all the timers that are already past their time, but won't fire any new timers that come due while it's processing the ones that were already due.
This leads to a bit fairer scheduling as it runs all timers that are due, then goes and runs the rest of the types of events in the event loop, then comes back to do any more timers that are due again. This will NOT process any timers that are not due at the start of this event loop cycle, but come due while it's processing other timers. Thus, you see the behavior you asked about.
The above function is called from the main part of the event loop with this code:
int uv_run(uv_loop_t *loop, uv_run_mode mode) {
DWORD timeout;
int r;
int ran_pending;
r = uv__loop_alive(loop);
if (!r)
uv_update_time(loop);
while (r != 0 && loop->stop_flag == 0) {
uv_update_time(loop); /* establish loop time */
uv__run_timers(loop); /* process only timers due by that loop time */
ran_pending = uv_process_reqs(loop);
uv_idle_invoke(loop);
uv_prepare_invoke(loop);
/* .... more code here */
}
Note the call to uv_update_time(loop) right before calling uv__run_timers(). That sets the loop time that uv__run_timers() compares each timer against. Here's the code for uv_update_time():
void uv_update_time(uv_loop_t* loop) {
uint64_t new_time = uv__hrtime(1000);
assert(new_time >= loop->time);
loop->time = new_time;
}
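The same idea can be modelled in a few lines of JavaScript. This is a toy sketch, not libuv's code: a sorted array stands in for the min-heap, and the field names (name, timeout, repeat) are invented for illustration. A repeating timer is re-armed relative to the frozen loop time, so it cannot fire twice in one pass.

```javascript
// Toy model of one timers pass: fire everything due at loopTime, nothing later.
function runDueTimers(timers, loopTime) {
  const fired = [];
  for (;;) {
    // Stand-in for heap_min(): bring the earliest timer to the front.
    timers.sort((a, b) => a.timeout - b.timeout);
    const next = timers[0];
    if (next === undefined || next.timeout > loopTime) break;
    timers.shift();
    if (next.repeat > 0) {
      // Re-arm relative to the (unchanged) loop time, so it is not due again yet.
      timers.push({ ...next, timeout: loopTime + next.repeat });
    }
    fired.push(next.name);
  }
  return fired;
}

const timers = [
  { name: 'interval', timeout: 5, repeat: 1 },
  { name: 'late', timeout: 12, repeat: 0 },
  { name: 'early', timeout: 3, repeat: 0 },
];
const fired = runDueTimers(timers, 10);
console.log(fired); // [ 'early', 'interval' ]
console.log(timers.map((t) => t.name + '@' + t.timeout)); // [ 'interval@11', 'late@12' ]
```

Because the comparison is always against the loop time captured before the pass, the pass is bounded by the timers that were already due, which is what lets the other phases run in between.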
from the docs,
when the event loop enters a
given phase, it will perform any operations specific to that phase,
then execute callbacks in that phase's queue until the queue has been
exhausted or the maximum number of callbacks has executed. When the
queue has been exhausted or the callback limit is reached, the event
loop will move to the next phase, and so on.
Also from the docs,
When delay is larger than 2147483647 or less than 1, the delay will be set to 1
Now, when you run your snippet, the following things happen:
1. Script execution begins and callbacks are registered to their specific phases. Also, as the docs suggest, the setInterval delay is implicitly converted to 1 ms.
2. After 1 ms, your setInterval callback is executed; it blocks the event loop until all iterations are completed. Meanwhile, the event loop is not notified of any incoming request, at least until the loop terminates.
3. Once all iterations are completed, and since the interval's next timeout is 1 ms away, the poll phase executes your HTTP request callback, if any.
4. Back to step 2.
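The cycle above can be checked without an HTTP server by letting a setImmediate callback play the role of "other work". This is a sketch under assumptions: the function name is hypothetical, the busy-wait is shortened to 5 ms, and the expected order reflects how current Node.js versions re-arm interval timers.

```javascript
// Hypothetical helper: a blocking zero-delay interval plus one immediate.
function intervalVersusImmediate() {
  return new Promise((resolve) => {
    const order = [];
    let ticks = 0;
    const handle = setInterval(() => {
      ticks += 1;
      order.push('interval tick ' + ticks);
      if (ticks === 1) {
        // Other work that becomes ready while the interval callback is blocking.
        setImmediate(() => order.push('immediate'));
      }
      // Block for longer than the 1 ms interval, so the timer is overdue again.
      const end = Date.now() + 5;
      while (Date.now() < end) { /* busy-wait */ }
      if (ticks === 2) {
        clearInterval(handle);
        resolve(order);
      }
    }, 0);
  });
}

// Recorded order: interval tick 1, immediate, interval tick 2
intervalVersusImmediate().then((order) => console.log(order.join(', ')));
```

Even though the interval is already overdue when its first callback returns, the loop moves on to the later phases before coming back to timers, which is the same reason the HTTP response in the question still gets sent.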

Nodejs Event Loop - interaction with top-level code

hoping for a little confirmation on understanding of node.js execution model. I understand that when node.js process starts, this is the sequence of executions:
(from Jonas Schmedtmann's Udemy node.js course)
With the main takeaway being that top-level code is always executed first before any callbacks.
Then, in the event-loop, this is the sequence of the 'phases':
After some digging, I also confirmed why a setTimeout and a setImmediate called in the main module has 'arbitrary' execution order, but when called from the I/O phase, the setImmediate will always execute first, based on this post: https://github.com/nodejs/help/issues/392#issuecomment-274032320.
(Reason: assuming the timer threshold has already passed, since we are currently in the I/O phase, and the next phase after that is the check-handles phase where setImmediate callbacks are executed, immediate always executes before timer.)
Now, when timer and immediate callbacks are scheduled from a phase such that the next phase is the due-timers phase (such as from the main module), if the top-level code took long enough that the timer is due, the timer callback will always execute first, correct? I've tested this with the following code, and it seems to be true (every time I've run it, the timer executes first, even though it has a full second delay compared to the immediate callback):
setTimeout(() => {
console.log('timer completed');
}, 1000);
setImmediate(() => {
console.log('immediate completed');
});
for (let i = 0; i < 5000; i++) {
console.log(`top-level code: ${i}`);
}
So here is my question: shouldn't an I/O operation callback also be executed before the immediate's callback due to the event-loop, assuming that the top-level code takes long enough that the I/O operation completes by the time we start the event-loop?
However, this code below suggests otherwise, as the execution order is always: top-levels->timer->immediate->io
Even though based on the model above I should be expecting: top-levels->timer->io->immediate (?)
const fs = require('fs');
setTimeout(() => {
console.log('timer completed');
}, 1000);
fs.readFile('test-file.txt', 'utf-8', () => {
console.log('io completed');
});
setImmediate(() => {
console.log('immediate completed');
});
for (let i = 0; i < 5000; i++) {
console.log(`top-level code: ${i}`);
}
Thank you!
I might be a little late in answering this question, and it is possible you've already figured this one out, @M.Lee. But here goes the answer to your question:
While the top-level code is executing, the code in your example is not running in the event loop. As the first image in your question shows, the event loop starts ticking only after the top-level code has been executed. So, when it comes to the top-level code, Node does not follow the same order that it follows during an event loop tick.
In this particular case, the I/O callback is executed last simply because of the contents of this particular file. (BTW, I had to do some research by looking at Jonas' Node course to understand what this file contained: it just contains the line "Node.js is the best!" 1 million times.) Reading a file that large takes longer than the timer and the immediate need to fire.
Also a side note here is that you're using the asynchronous readFile function instead of the readFileSync.
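The timer-versus-immediate observation from the question can be reproduced with a busy-wait in place of the 5000 console.log calls. A minimal sketch, assuming it is called from the top level of a CommonJS script started with node (called from inside an I/O or timer callback, the order can differ); the function name and the 10 ms / 50 ms values are arbitrary:

```javascript
// Hypothetical helper: schedule both callbacks, then block the thread.
function timerVersusImmediate(blockMs) {
  return new Promise((resolve) => {
    const order = [];
    const record = (label) => {
      order.push(label);
      if (order.length === 2) resolve(order);
    };
    setTimeout(() => record('timer'), 10);
    setImmediate(() => record('immediate'));
    // Block long enough that the 10 ms timer is already due when the loop starts.
    const end = Date.now() + blockMs;
    while (Date.now() < end) { /* busy-wait */ }
  });
}

timerVersusImmediate(50).then((order) => console.log(order.join(' -> ')));
```

When run as the main module, the first phase the loop enters is the timers phase, so the overdue timer is expected to be recorded before the immediate.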

Does Javascript event queue have priority?

Recently I have read some documents about setTimeout and setInterval. I have learned that JavaScript is single-threaded and only executes one piece of code at a time. If an event happens in the meantime, it is pushed into the event queue and blocked until an appropriate time. I want to know: when many events are blocked waiting to execute at the same time, do these events have different priorities, so that high-priority events execute before low-priority ones? Or is it just a FIFO queue?
setTimeout(fn1, 10);
$(document).click(fn2); // the click will happen at 6ms
$.ajax({ success: fn3 }); // async request, it takes 7ms
var end = Date.now() + 18;
while (Date.now() < end) {
// busy loop, will run 18ms
}
In the above code, the setTimeout callback fn1 will be due at 10ms, the click event handler fn2 at 6ms, and the ajax callback fn3 at 7ms, but all three functions will be blocked until the loop finishes. At 18ms the loop finishes, so in what order will these functions be invoked: (fn1,fn2,fn3) or (fn2,fn3,fn1)?
Work scheduled for the main JavaScript thread is processed FIFO. This includes callbacks from various async tasks, such as setTimeout and ajax completions, and event handlers. The only exception is that in the main popular environments (browsers and Node), the resolution callback of a native Promise jumps the queue (more accurately, goes in a different higher-priority queue), see my answer here for details on that.
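A short sketch of that exception, assuming a browser or Node.js environment (the function name is made up for illustration): the promise reaction is scheduled after the timer, yet it runs first, because it goes through the microtask queue.

```javascript
// Hypothetical helper: a timer callback and a promise reaction race.
function queueOrder() {
  return new Promise((resolve) => {
    const order = [];
    setTimeout(() => {
      order.push('timeout');
      resolve(order);
    }, 0);
    Promise.resolve().then(() => order.push('promise'));
    order.push('sync');
  });
}

// Recorded order: sync, promise, timeout
queueOrder().then((order) => console.log(order.join(', ')));
```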
But leaving aside native promise resolution callbacks:
but all the three functions will be blocked until the for loop finish. At 18ms, the for loop finished, so what order does these functions will be invoked. (fn1,fn2,fn3) or (fn2,fn3,fn1)
The time you give setTimeout is approximate, because when that time comes due the JavaScript UI thread may be busy doing something else (as you know); there's also a minimum time required by the (newish) spec, but the degree to which it's enforced varies by implementation. Similarly, you can't guarantee that the click event will be queued at 6ms, or that the ajax completion will occur at exactly 7ms.
If that code started, and the browser did the 10ms precisely, and the click event was queued exactly 6ms in, and the ajax request completed at exactly 7ms, then the order would be: fn2 (the click handler), fn3 (the ajax completion), fn1 (the setTimeout), because that's the order in which they'd be queued.
But note that these are extremely tight timings. In practice, I expect the order the callbacks were queued would be effectively random, because the timing of the click would vary, the timing of the ajax would vary, etc.
I think this is a better example:
var start = +new Date();
// Queue a timed callback after 20ms
setTimeout(function() {
display("20ms callback");
}, 20);
// Queue a timed callback after 30ms
setTimeout(function() {
display("30ms callback");
}, 30);
// Queue a timed callback after 10ms
setTimeout(function() {
display("10ms callback");
}, 10);
// Busy-wait 40ms
display("Start of busy-wait");
var stop = +new Date() + 40;
while (+new Date() < stop) {
// Busy-wait
}
display("End of busy-wait");
function display(msg) {
var p = document.createElement('p');
var elapsed = String(+new Date() - start);
p.innerHTML = "+" + "00000".substr(elapsed.length - 5) + elapsed + ": " + msg;
document.body.appendChild(p);
}
The order of output will be the two loop messages followed by the 10ms callback, 20ms callback, and 30ms callback, because that's the order in which the callbacks are queued for servicing by the main JavaScript thread. For instance:
+00001: Start of busy-wait
+00041: End of busy-wait
+00043: 10ms callback
+00044: 20ms callback
+00044: 30ms callback
...where the + numbers indicate milliseconds since the script started.
Does javascript event queue have priority?
Sort of. The event loop is actually composed of one or more event queues. In each queue, events are handled in a FIFO order.
It's up to the browser to decide how many queues to have and what form of prioritisation to give them. There's no Javascript interface to individual event queues or to send events to a particular queue.
https://www.w3.org/TR/2014/REC-html5-20141028/webappapis.html#event-loops
Each task is defined as coming from a specific task source. All the tasks from one particular task source and destined to a particular event loop (e.g. the callbacks generated by timers of a Document, the events fired for mouse movements over that Document, the tasks queued for the parser of that Document) must always be added to the same task queue, but tasks from different task sources may be placed in different task queues.
For example, a user agent could have one task queue for mouse and key events (the user interaction task source), and another for everything else. The user agent could then give keyboard and mouse events preference over other tasks three quarters of the time, keeping the interface responsive but not starving other task queues, and never processing events from any one task source out of order.
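The spec's example can be made concrete with a toy scheduler. This is purely illustrative: real browsers choose their own policy, and the function name, the queue contents and the exact "three turns out of four" rule below are assumptions. What it shows is that giving one queue preference never reorders tasks within a queue.

```javascript
// Toy scheduler: prefer the input queue on three turns out of four.
function drain(inputQueue, otherQueue) {
  const processed = [];
  let turn = 0;
  while (inputQueue.length > 0 || otherQueue.length > 0) {
    const preferInput = turn % 4 !== 3;
    const source =
      (preferInput && inputQueue.length > 0) || otherQueue.length === 0
        ? inputQueue
        : otherQueue;
    processed.push(source.shift()); // always FIFO within a single queue
    turn += 1;
  }
  return processed;
}

const processed = drain(['key1', 'key2', 'key3', 'key4'], ['timer1', 'timer2']);
console.log(processed.join(', '));
// key1, key2, key3, timer1, key4, timer2
```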
FIFO. There really isn't anything more to say. You don't get to schedule it.
This can be a bit of a pain when you're looking at multiple timeout operations that could conceivably happen at the same time. That said, if you're using Asynchronous behaviors you shouldn't be depending on how they get scheduled.
