nodejs: setTimeout and IO(POLL) is showing inconsistency - javascript

I have this piece of code:
// * Run this snippet of code multiple times
const fs = require('fs');
setTimeout(() => {
  console.log('timer');
});
fs.readFile('', 'utf-8', (err, data) => {
  console.log('io');
});
setImmediate(() => {
  console.log('check');
});
When I run the code above several times, I get different outputs.
Result 1
Sometimes I get:
timer
io
check
Result 2
and other times I get:
io
check
timer
Can anyone please clarify what is going on here? I was expecting Result 1.

That has to do with how the event loop picks up things to do when more than one thing is available. Here you can find a good explanation.
If you really want to set that order, try async/await or calling fs.readFile() inside the setTimeout() callback.

Based on the Node.js event loop documentation: first, setTimeout is called, which sets a minimum amount of time to wait before the function is executed. Next, you start an asynchronous file read. Then the poll phase of the event loop begins. The poll phase has a queue of callback functions to run. If the file read is not complete, its callback function (the console.log) will not yet be in the poll queue. The poll phase will instead wait for the timer to reach its minimum threshold and then loop back to the timers phase. Since your setTimeout is set to zero ms, the poll phase will usually exit, the timer callback (console.log) will run first, and the loop will then go back to polling for the file read to finish. This is why your setTimeout sometimes completes first (if the operating system delays the file read) and the file read sometimes completes first (if the operating system delays the timer). setImmediate callbacks always run in the check phase, immediately after the poll (I/O) phase.

Related

Does the nodejs (libuv) event loop execute all the callbacks in one phase (queue) before moving to the next or run in a round robin fashion?

I am studying about the event loop provided by libuv in Node. I came across the following blog by Deepal Jayasekara and also saw the explanations of Bert Belder and Daniel Khan on youtube.
There is one point that I am not clear with- As per my understanding, the event loop processes all the items of one phase before moving on to another. So if that is the case, I should be able to block the event loop if the setTimeout phase is constantly getting callbacks added to it.
However, when I tried to replicate that- it doesn't happen. The following is the code:
var http = require('http');
http.createServer(function (req, res) {
res.writeHead(200, {'Content-Type': 'text/plain'});
res.write('Hello World!');
console.log("Response sent");
res.end();
}).listen(8081);
setInterval(() => {
console.log("Entering for loop");
// Long running loop that allows more callbacks to get added to the setTimeout phase before this callback's processing completes
for (let i = 0; i < 7777777777; i++);
console.log("Exiting for loop");
}, 0);
The event loop seems to run in a round-robin fashion. It first executes the callbacks that were added before I sent a request to the server, then processes the request, and then continues with the callbacks. It feels like a single queue is running.
From the little that I understood, there isn't a single queue and all the expired timer callbacks should get executed first before moving to the next phase. Hence the above snippet should not be able to return the Hello World response.
What could be the possible explanation for this?
Thanks.
If you look in libuv itself, you find that the operative part of running timers in the event loop is the function uv__run_timers().
void uv__run_timers(uv_loop_t* loop) {
struct heap_node* heap_node;
uv_timer_t* handle;
for (;;) {
heap_node = heap_min(timer_heap(loop));
if (heap_node == NULL)
break;
handle = container_of(heap_node, uv_timer_t, heap_node);
if (handle->timeout > loop->time)
break;
uv_timer_stop(handle);
uv_timer_again(handle);
handle->timer_cb(handle);
}
}
The way it works is the event loop sets a time mark at the current time and then it processes all timers that are due by that time one after another without updating the loop time. So, this will fire all the timers that are already past their time, but won't fire any new timers that come due while it's processing the ones that were already due.
This leads to a bit fairer scheduling as it runs all timers that are due, then goes and runs the rest of the types of events in the event loop, then comes back to do any more timers that are due again. This will NOT process any timers that are not due at the start of this event loop cycle, but come due while it's processing other timers. Thus, you see the behavior you asked about.
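This can be observed from JavaScript as well. The following is a rough sketch, not a benchmark: interleaveDemo is a made-up name, and the 5 ms busy-wait is an arbitrary value chosen so that the zero-delay interval is already overdue again whenever its callback returns.

```javascript
// Hypothetical demo: resolves with the event order after three interval ticks.
function interleaveDemo() {
  return new Promise((resolve) => {
    const log = [];
    let ticks = 0;
    const interval = setInterval(() => {
      ticks += 1;
      const n = ticks;
      log.push(`interval ${n}`);
      // Queue work for the check phase of this same loop iteration.
      setImmediate(() => {
        log.push(`immediate ${n}`);
        if (n === 3) resolve(log);
      });
      // Busy-wait about 5 ms so the interval is overdue again on return.
      const end = Date.now() + 5;
      while (Date.now() < end) {}
      if (n === 3) clearInterval(interval);
    }, 0);
  });
}

interleaveDemo().then((log) => console.log(log.join('\n')));
```

Even though the interval is overdue every time its callback finishes, each immediate should appear in the log before the next interval tick, which matches the "run what is due, then move on to the other phases" behavior described above.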
The above function is called from the main part of the event loop with this code:
int uv_run(uv_loop_t *loop, uv_run_mode mode) {
DWORD timeout;
int r;
int ran_pending;
r = uv__loop_alive(loop);
if (!r)
uv_update_time(loop);
while (r != 0 && loop->stop_flag == 0) {
uv_update_time(loop); /* establish loop time */
uv__run_timers(loop); /* process only timers due by that loop time */
ran_pending = uv_process_reqs(loop);
uv_idle_invoke(loop);
uv_prepare_invoke(loop);
.... more code here
}
Note the call to uv_update_time(loop) right before calling uv__run_timers(). That sets the loop time (loop->time) that uv__run_timers() compares each timer against. Here's the code for uv_update_time():
void uv_update_time(uv_loop_t* loop) {
uint64_t new_time = uv__hrtime(1000);
assert(new_time >= loop->time);
loop->time = new_time;
}
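To make the "frozen loop time" idea concrete, here is a toy model in JavaScript. It is not libuv's code: runTimersPhase and the timer objects are invented for illustration, and a sorted array stands in for the min-heap.

```javascript
// Toy model of one timers phase: loopTime stays fixed for the whole phase.
function runTimersPhase(timers, loopTime) {
  const fired = [];
  for (;;) {
    timers.sort((a, b) => a.timeout - b.timeout); // stand-in for heap_min()
    const next = timers[0];
    if (next === undefined || next.timeout > loopTime) break;
    timers.shift();
    fired.push(next.name);
    if (next.repeat > 0) {
      // Re-arm relative to the frozen loop time, so it is not due again yet.
      timers.push({ ...next, timeout: loopTime + next.repeat });
    }
  }
  return fired;
}

const timers = [
  { name: 'a', timeout: 5, repeat: 1 },
  { name: 'b', timeout: 8, repeat: 0 },
  { name: 'c', timeout: 12, repeat: 0 },
];
console.log(runTimersPhase(timers, 10)); // [ 'a', 'b' ]
```

Timer a repeats, but its new due time (11) is measured from the loop time that was fixed before the phase began (10), so the phase ends and the rest of the loop gets a turn before a fires again.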
from the docs,
when the event loop enters a
given phase, it will perform any operations specific to that phase,
then execute callbacks in that phase's queue until the queue has been
exhausted or the maximum number of callbacks has executed. When the
queue has been exhausted or the callback limit is reached, the event
loop will move to the next phase, and so on.
Also from the docs,
When delay is larger than 2147483647 or less than 1, the delay will be set to 1
Now, when you run your snippet, the following things happen:
1. Script execution begins and callbacks are registered to specific phases. Also, as the docs suggest, the setInterval delay of 0 is implicitly converted to 1 ms.
2. After 1 ms, your setInterval callback is executed. It blocks the event loop until all iterations are completed. Meanwhile, the event loop is not notified of any incoming request, at least until the loop terminates.
3. Once all iterations are completed, and while the interval waits for its next 1 ms timeout, the poll phase executes your HTTP request callback, if any.
4. Back to step 2.

Nodejs Event Loop - interaction with top-level code

hoping for a little confirmation on understanding of node.js execution model. I understand that when node.js process starts, this is the sequence of executions:
(from Jonas Schmedtmann's Udemy node.js course)
With the main takeaway being that top-level code is always executed first before any callbacks.
Then, in the event-loop, this is the sequence of the 'phases':
After some digging, I also confirmed why a setTimeout and a setImmediate called in the main module has 'arbitrary' execution order, but when called from the I/O phase, the setImmediate will always execute first, based on this post: https://github.com/nodejs/help/issues/392#issuecomment-274032320.
(Reason: assuming the timer threshold has already passed, since we are currently in the I/O phase, and the next phase after that is the check-handles phase where setImmediate callbacks are executed, immediate always executes before timer.)
Now, when timer and immediate callbacks are called from a phase such that the next phase is the due-timers phase (such as from the main module), if the top-level code took long enough that the timer is due, the timer callback will always execute first, correct? I've tested this with the following code, and it seems to be true (every time I've run it, the timer executes first, even though it has a full second of delay compared to the immediate callback):
setTimeout(() => {
console.log('timer completed');
}, 1000);
setImmediate(() => {
console.log('immediate completed');
});
for (let i = 0; i < 5000; i++) {
console.log(`top-level code: ${i}`);
}
So here is my question: shouldn't an I/O operation callback also be executed before the immediate's callback due to the event-loop, assuming that the top-level code takes long enough that the I/O operation completes by the time we start the event-loop?
However, this code below suggests otherwise, as the execution order is always: top-levels->timer->immediate->io
Even though based on the model above I should be expecting: top-levels->timer->io->immediate (?)
const fs = require('fs');
setTimeout(() => {
  console.log('timer completed');
}, 1000);
fs.readFile('test-file.txt', 'utf-8', () => {
  console.log('io completed');
});
setImmediate(() => {
  console.log('immediate completed');
});
for (let i = 0; i < 5000; i++) {
  console.log(`top-level code: ${i}`);
}
Thank you!
I might be a little late in answering this question, and it is possible you've already figured this one out, @M.Lee. But here goes the answer to your question:
During the top-level code execution the code you're running in your example is not running in the event loop. Like the first image from your question shows, event loop will start ticking after the top level code is already executed. So, when it comes to the top-level code, Node does not follow the same order that it follows during an event loop tick.
In this particular case, the I/O callback is executed last simply because of the contents of this particular file. (BTW, I had to look at Jonas' Node course to find out what this file contains: it is just the line "Node.js is the best!" repeated 1 million times.) Presumably, reading a file of that size takes longer than the top-level loop, so the read has not completed by the time the event loop first reaches the poll phase.
Also a side note here is that you're using the asynchronous readFile function instead of the readFileSync.
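To illustrate that side note, here is a small sketch contrasting the two calls. The scratch file name is made up for this example; it is written to the OS temp directory and removed at the end.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical scratch file, created only for this demonstration.
const file = path.join(os.tmpdir(), `readfile-demo-${process.pid}.txt`);
fs.writeFileSync(file, 'Node.js is the best!\n');

const order = [];
setImmediate(() => order.push('immediate'));

// Synchronous read: the data is available before the next statement runs.
fs.readFileSync(file, 'utf-8');
order.push('sync read');

// Asynchronous read: its callback can only run once the event loop is ticking.
fs.readFile(file, 'utf-8', () => {
  order.push('async read');
  console.log(order.join(' -> '));
  fs.unlinkSync(file); // clean up the scratch file
});
order.push('top-level done');
```

The synchronous read finishes as part of the top-level code, before any callback can run, whereas the asynchronous read's callback has to wait for the event loop and for the read itself to complete.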

Why Node.js setImmediate executes after I/O callbacks?

As new member, I'm unable to comment on topics, that's why I had to create a new topic. But in this way, I can clarify the problem, so hopefully you guys can help me.
I have read quite a lot about Node.js Event Loop. And I have shaped my understanding of it based on following materials:
Node.js Event Loop
What the heck is the event loop anyway?
Why setImmediate() execute before fs.readFile() in Nodejs Event Loop's works?
(Please feel free to suggest other materials which are informative and accurate)
Especially the third link, has given me a better understanding. But keeping that in mind, I'm unable to understand Event Loop behavior for the following code:
var fs = require('fs');
var pos = 0;
fs.stat(__filename, function() {
console.log(++pos + " FIRST STAT");
});
fs.stat(__filename, function() {
console.log(++pos + " LAST STAT");
});
setImmediate(function() {
console.log(++pos + " IMMEDIATE")
})
console.log(++pos + "LOGGER");
Surprisingly, for me output is as follow:
LOGGER
FIRST STAT
LAST STAT
IMMEDIATE
screenshot of my terminal, showing output as well as node version
screenshot of output from online code compiler rextester.com
Keeping the Event Loop Diagram in mind, I guess flow should be as follow:
The interpreter first starts two stat operations.
The interpreter enqueues the setImmediate callback (event) in the setImmediate queue.
The call stack logs the logger.
All event queues before the I/O poll phase are empty, so the Event Loop (EL) moves on.
In the I/O polling phase, the EL collects the events and enqueues both fs.stat callbacks in the "run completed I/O handlers" phase.
The EL checks the check phase and runs the setImmediate callback.
This round of the EL ends, and the second round starts.
In "run completed I/O handlers", the EL runs both callbacks (their order is non-deterministic).
Question 1: Which part of my analysis/prediction is wrong?
Question 2: At which point, does Event Loop start working? Does it start from the beginning of the app (i.e. stage 1)? or does it start once the whole code is read by interpreter, all sync tasks are done within Call Stack, and Call Stack needs more task, i.e. between stage 3-4?
Thanks in advance,
setImmediate = execute without waiting for any I/O
The documentation at https://nodejs.org/docs/v8.9.3/api/timers.html#timers_setimmediate_callback_args says:
Schedules the "immediate" execution of the callback after I/O events' callbacks. Returns an Immediate for use with clearImmed
Steps:
callback for First stat is queued in I/O queue
callback for Last stat is queued in I/O queue
callback for immediate is queued in Immediates queue
LOGGER
If the I/O operations (in 1 and 2) are finished, the callbacks in 1 and/or 2 are marked as ready to execute
Execute the ready callbacks one by one (first timers, then I/O, finally immediates). In your case:
First stat
Last stat
IMMEDIATE
If the I/O has not finished by step 5, then IMMEDIATE is executed before FIRST STAT and LAST STAT.
See also: https://jsblog.insiderattack.net/timers-immediates-and-process-nexttick-nodejs-event-loop-part-2-2c53fd511bb3#f3dd

Very fast endless loop without blocking I/O

Is there a faster alternative to window.requestAnimationFrame() for endless loops that don't block I/O?
What I'm doing in the loop isn't related to animation so I don't care when the next frame is ready, and I have read that window.requestAnimationFrame() is capped by the monitor's refresh rate or at least waits until a frame can be drawn.
I have tried the following as well:
function myLoop() {
// stuff in loop
setTimeout(myLoop, 4);
}
(The 4 is because that is the minimum interval in setTimeout and smaller values will still default to 4.) However, I need better resolution than this.
Is there anything with even better performance out there?
I basically need a non-blocking version of while(true).
Two things that will run sooner than that setTimeout:
process.nextTick callbacks (NodeJS-specific):
The process.nextTick() method adds the callback to the "next tick queue". Once the current turn of the event loop turn runs to completion, all callbacks currently in the next tick queue will be called.
This is not a simple alias to setTimeout(fn, 0). It is much more efficient. It runs before any additional I/O events (including timers) fire in subsequent ticks of the event loop.
Promise settlement notifications
So those might be tools for your toolbelt: use one or both of them, mixed with setTimeout, to achieve the balance you want.
Details:
As you probably know, a given JavaScript thread runs on the basis of a task queue (the spec calls it a job queue); and as you probably know, there's one main default UI thread in browsers and NodeJS runs a single thread.
But in fact, there are at least two task queues in modern implementations: The main one we all think of (where setTimeout and event handlers put their tasks), and the "microtask" queue where certain async operations are placed during the processing of a main task (or "macrotask"). Those microtasks are processed as soon as the macrotask completes, before the next macrotask in the main queue — even if that next macrotask was queued before the microtasks were.
nextTick callbacks and promise settlement notifications are both microtasks. So scheduling either schedules an async callback, but one which will happen before the next main task.
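A quick way to check that ordering in Node is a sketch like the one below. The function name schedulingOrder is made up for this example, and the relative order of the nextTick and promise callbacks is something it observes rather than something the text above promises.

```javascript
// Hypothetical helper: resolves with the order in which the callbacks ran.
function schedulingOrder() {
  return new Promise((resolve) => {
    const order = [];
    // Main-queue task: runs only after the pending nextTick/promise callbacks.
    setTimeout(() => {
      order.push('timeout');
      resolve(order);
    }, 0);
    Promise.resolve().then(() => order.push('promise'));
    process.nextTick(() => order.push('nextTick'));
    order.push('sync');
  });
}

schedulingOrder().then((order) => console.log(order.join(' -> ')));
```

The synchronous entry comes first and the timeout comes last, even though the timeout was scheduled before the other two callbacks.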
We can see that in the browser with setInterval and a promise resolution chain:
let counter = 0;
// setInterval schedules macrotasks
let timer = setInterval(() => {
$("#ticker").text(++counter);
}, 100);
// Interrupt it
$("#hog").on("click", function() {
let x = 300000;
// Queue a single microtask at the start
Promise.resolve().then(() => console.log(Date.now(), "Begin"));
// `next` schedules 300k microtasks (promise settlement
// notifications), which jump ahead of the next task in the main
// task queue; then we add one at the end to say we're done
next().then(() => console.log(Date.now(), "End"));
function next() {
if (--x > 0) {
if (x === 150000) {
// In the middle; queue one in the middle
Promise.resolve().then(function() {
console.log(Date.now(), "Middle");
});
}
return Promise.resolve().then(next);
} else {
return 0;
}
}
});
$("#stop").on("click", function() {
clearInterval(timer);
});
<div id="ticker"> </div>
<div><input id="stop" type="button" value="Stop"></div>
<div><input id="hog" type="button" value="Hog"></div>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
When you run that and click the Hog button, note how the counter display freezes, then keeps going again. That's because of the 300,000 microtasks that get scheduled ahead of it. Also note the timestamps on the three log messages we write (they don't appear in the snippet console until a macrotask displays them, but the timestamps show us when they were logged).
So basically, you could schedule a bunch of microtasks, and periodically let those run out and run the next macrotask.
Note: I've used setInterval for the browser example in the snippet, but setInterval, specifically, may not be a good choice for a similar experiment using NodeJS, as NodeJS's setInterval is a bit different from the one in browsers and has some surprising timing characteristics.
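As a rough sketch of that idea (run a batch as microtasks, then yield with one macrotask), something like the following could serve as a starting point. The name runInBatches and the batch size of 100 are arbitrary choices for illustration, not a recommendation.

```javascript
// Hypothetical helper: calls step() `total` times. Within a batch it chains
// microtasks; between batches it yields with setTimeout so that timers and
// I/O get a turn.
function runInBatches(step, total, batchSize) {
  return new Promise((resolve) => {
    let done = 0;
    function next() {
      if (done >= total) return resolve(done);
      step(done);
      done += 1;
      if (done % batchSize === 0) {
        setTimeout(next, 0); // yield to the main task queue
      } else {
        Promise.resolve().then(next); // stay in the microtask queue
      }
    }
    next();
  });
}

let timerFired = false;
setTimeout(() => { timerFired = true; }, 0);
runInBatches(() => {}, 1000, 100).then((count) => {
  console.log(count, timerFired); // 1000 true
});
```

The unrelated timer gets to fire while the loop is still running, because each batch boundary hands control back to the main task queue. A larger batch size means fewer yields and more throughput, at the cost of longer stretches in which nothing else can run.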
There are some libraries that work like cron tasks, e.g., https://www.npmjs.com/package/node-cron
I think using cron would be easier and more flexible.

Will the setTimeout callback on the job queue be run or cleared?

Suppose the callback for a setTimeout invocation has been added to the job queue (for example, it is next on the job queue), and clearTimeout is called on the current tick of the event loop with the id of the original setTimeout invocation. Will the setTimeout callback on the job queue be run?
Or does the runtime magically remove the callback from the job queue?
No, it won't run; it will be queued and then subsequently aborted. The specification goes through a number of steps when you call setTimeout, one of which (after the minimum timeout, plus any user-agent padded timeouts, etc.) is eventually:
Queue the task task.
This appears to happen regardless of whether or not the handle that was returned in step 10 has been cleared - ie a call to setTimeout will always result in something being enqueued.
When you call clearTimeout, it:
must clear the entry identified as handle from the list of active timers
ie it doesn't directly affect the process already kicked off in the call to setTimeout. Note however that further up that process, task has been defined as:
Let task be a task that runs the following substeps:
If the entry for handle in the list of active timers has been cleared, then abort this task's substeps.
So when the task begins executing, it will first check if the handle has been cleared.
No, it won't be run.
I don't know which source should be used to back it up officially, but it's at least easy to try for yourself.
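// Busy-wait helper: blocks the thread for roughly ms milliseconds.
function sleep(ms) {
  var end = Date.now() + ms;
  while (Date.now() < end) {}
}
function f() {
  var t1 = setTimeout(function() { console.log("YES"); }, 2000);
  sleep(3000); // the timer is already overdue when this returns
  clearTimeout(t1);
  console.log("NO"); // "YES" is never logged
}
f();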
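A variant without blocking the thread: two timers are given the same delay, and the first callback cancels the second while it is already due. This is just a sketch; the variable names are arbitrary.

```javascript
const ran = [];
let second; // assigned below so that the first callback can cancel it

setTimeout(() => {
  ran.push('first');
  clearTimeout(second); // the second timer is already due at this point
}, 0);
second = setTimeout(() => ran.push('second'), 0);

// Report a little later, once both timers have had their chance to run.
setTimeout(() => console.log(ran), 50); // prints [ 'first' ]
```

The second callback never runs, even though its timeout had already elapsed by the time clearTimeout was called.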
