I am testing the code below.
// create the web server
http.createServer(app).listen(app.get('port'), function(){
    console.log('Express server listening on port ' + app.get('port'));
});
function compute() {
    var x = 1111111112;
    while (x > 0) x--; // takes about 6 seconds
    setImmediate(compute); // then do it again
}
compute();
Let's say we have a task compute() which needs to run almost continuously and does some CPU-intensive calculations. If we also wanted to handle other events, like serving HTTP requests, in the same Node process, I find it impossible to use process.nextTick() to share CPU time on the JS thread, because process.nextTick()'s observer in the event loop is the idle observer.
The observer priorities are listed here:
idle observer // process.nextTick() has the highest priority
I/O observer // second is the I/O observer, which covers web requests and other I/O operations
check observer // setImmediate()'s observer is the check observer
So I use setImmediate() to cut the whole computation into pieces of work, so that the other observers (idle and I/O) can handle their events, such as a request, before JavaScript continues running compute().
The result is strange: when a request comes in, it is handled, but not all at once. Since I set compute()'s running time to about 6 seconds, why doesn't my browser get the result until many more seconds have passed (far more than 6)?
So I decreased the computing time to about 50 ms, which is much smaller than before:
// create the web server
http.createServer(app).listen(app.get('port'), function(){
    console.log('Express server listening on port ' + app.get('port'));
});
function compute() {
    var x = 11111112; // decreased to reduce compute()'s running time
    while (x > 0) x--; // takes about 50 ms
    setImmediate(compute); // then do it again
}
compute();
Then everything works quickly.
What puzzles me is this: the request should be handled after at most 6 seconds (one whole compute() pass), so why does it take so much longer?
I always thought the running model was like this:
Compute for 6 seconds.
If the event loop finds an I/O observer (a web request) waiting, handle the request and respond to the client browser.
Then the event loop continues its own work, i.e. running compute().
Loop the above again.
But it seems Node.js cuts the request handling itself into many pieces, which results in the request taking such a long time?
Am I wrong? Any help will be appreciated.
This is the fundamental concept of node.js. Node.js is event based. Why is this so important? Because it explains how Node can be asynchronous and have non-blocking I/O.
Whenever a task starts, it goes into the event loop queue; the Node.js interpreter hands the request to the event loop, and the event loop checks whether the task is blocking I/O or not. If it is a blocking I/O request, the event loop sends it to the thread pool, which is managed by the library Node.js is built on (libuv). The thread pool handles tasks like networking, DB operations, filesystem access, and others. The event loop always runs on the main thread, because Node.js runs your JavaScript on a single thread, while the I/O operations run on the thread pool. When a task completes in the thread pool, its callback is passed back to the queue, and the main thread pulls it from the queue and processes it.
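A small sketch of that division of labor (the file path and log ordering comments are my assumptions): the fs.readFile() call is handed to libuv's thread pool, while the main thread keeps running JavaScript.
const fs = require('fs');

fs.readFile(__filename, () => {
    console.log('3: file read complete (callback queued by the thread pool)');
});

setImmediate(() => {
    console.log('2: setImmediate runs on the main thread, typically before the read finishes');
});

console.log('1: synchronous code always finishes first');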
Related
This question might be a duplicate, but I am still not getting the answer. I am fairly new to Node.js, so I might need some help. Many have said that Node.js is perfectly free to run incoming requests asynchronously, but the code below shows that if multiple requests hit the same endpoint, say /test3, the callback function will:
Print "test3"
Call setTimeout() to avoid blocking the event loop
Wait for 5 seconds and send a response of "test3" to the client
My question here is: if client 1 and client 2 call the /test3 endpoint at the same time, with the assumption that client 1 hits the endpoint first, does client 2 have to wait for client 1 to finish before its request is processed by the event loop?
Can anybody tell me whether it is possible for multiple clients to call a single endpoint and be served concurrently, not sequentially, something like a one-thread-per-connection analogy?
Of course, if I were to call another endpoint such as /test1 or /test2 while the code for /test3 is still executing, I would still get the response from /test2, which is "test2", immediately.
app.get("/test1", (req, res) => {
console.log("test1");
setTimeout(() => res.send("test1"), 5000);
});
app.get("/test2", async (req, res, next) => {
console.log("test2");
res.send("test2");
});
app.get("/test3", (req, res) => {
console.log("test3");
setTimeout(() => res.send("test3"), 5000);
});
For those who have visited: it has nothing to do with blocking of the event loop.
I have found something interesting; the answer to the question can be found here.
When I was using Chrome, the requests kept getting blocked after the first request. However, with Safari, I was able to hit the endpoint concurrently. For more details, look at the following link below.
GET requests from Chrome browser are blocking the API to receive further requests in NODEJS
Run your application in a cluster. Look up PM2.
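For example, a minimal sketch using Node's built-in cluster module (PM2 automates roughly this; the ./app entry point is hypothetical):
const cluster = require('cluster');
const os = require('os');

if (cluster.isMaster) {
    // Fork one worker per CPU core; each worker gets its own event loop,
    // so a slow request in one worker doesn't block the others.
    os.cpus().forEach(() => cluster.fork());
} else {
    require('./app'); // hypothetical module that creates and starts the Express app
}
With PM2 the equivalent is roughly pm2 start app.js -i max.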
This question needs more details to be answered properly and is arguably opinion-based, but since it rests on a strawman argument I will answer it.
First of all, we need to define "run concurrently". The phrase is ambiguous: if we take the literal meaning, then in strict theory nothing runs concurrently, because
a CPU core can only carry out one instruction at a time.
The speed at which the CPU can carry out instructions is called the clock speed. This is controlled by a clock: with every tick of the clock, the CPU fetches and executes one instruction. The clock speed is measured in cycles per second, and 1 cycle per second is 1 hertz. This means that a CPU with a clock speed of 2 gigahertz (GHz) can carry out two billion cycles per second.
As for a CPU running multiple tasks "concurrently": yes, you're right that nowadays computers, and even cell phones, come with multiple cores, which means the number of tasks truly running at the same time depends on the number of cores. But any expert (such as this Associate Staff Engineer, a.k.a. me) will tell you that you will very rarely find a server provisioned with more than one core. Why would you spend 500 USD on a multi-core server when you can spawn a whole bunch of small instances (nano, or whatever option is available in the free tier) with Kubernetes?
Another thing: why would you configure Node to be in charge of the routing? Let Apache and/or nginx worry about that.
As you mentioned, there is this thing called the event loop, which is a fancy name for a FIFO queue data structure.
So in other words: no, neither Node.js nor any other programming language out there will run your handlers truly concurrently on a single core;
but it definitely depends on your infrastructure.
Does node js execute multiple commands in parallel or execute one command (and finish it!) and then execute the second command?
For example if multiple async functions use the same Stack, and they push & pop "together", can I get strange behaviour?
Node.js runs your main Javascript (excluding manually created Worker Threads for now) as a single thread. So, it's only ever executing one piece of your Javascript at a time.
But, when a server request contains asynchronous operations, what happens in that request handler is that it starts the asynchronous operation and then returns control back to the interpreter. The asynchronous operation runs on its own (usually in native code). While all that is happening, the JS interpreter is free to go back to the event loop and pick up the next event waiting to be run. If that's another incoming request for your server, it will grab that request and start running it. When it hits an asynchronous operation and returns back to the interpreter, the interpreter then goes back to the event loop for the next event waiting to run. That could either be another incoming request or it could be one of the previous asynchronous operations that is now ready to run its callback.
In this way, node.js makes forward progress on multiple requests at a time that involve asynchronous operations (such as networking, database requests, file system operations, etc...) while only ever running one piece of your Javascript at a time.
Starting with node v10.5, nodejs has Worker Threads. These are not yet automatically used by the system in normal service of networking requests, but you can create your own Worker Threads and run some amount of Javascript in a truly parallel thread. This probably isn't needed for code that is primarily I/O bound, because the asynchronous nature of I/O in Javascript already gives it plenty of parallelism. But, if you have CPU-intensive operations (heavy crypto, image analysis, video compression, etc... done in Javascript), Worker Threads may definitely be worth adding for those particular tasks.
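For illustration, a minimal Worker Threads sketch (my assumption: Node v10.5+ with worker_threads available; on 10.x it sat behind the --experimental-worker flag). The parent offloads a CPU-heavy loop to a worker so the event loop stays free:
const { Worker, isMainThread, parentPort } = require('worker_threads');

if (isMainThread) {
    const worker = new Worker(__filename); // re-run this same file as a worker
    worker.on('message', (result) => console.log('worker result:', result));
    // ... the main thread keeps serving requests here
} else {
    let x = 0;
    for (let i = 0; i < 1e8; i++) x += i; // CPU-intensive work, off the main thread
    parentPort.postMessage(x);
}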
To show you an example, let's look at two request handlers, one that reads a file from disk and one that fetches some data from a network endpoint.
app.get("/getFileData", (req, res) => {
fs.readFile("myFile.html", function(err, html) {
if (err) {
console.log(err);
res.sendStatus(500);
} else {
res.type('html').send(html);
}
})
});
app.get("/getNetworkData", (req, res) => {
got("http://somesite.com/somepath").then(result => {
res.json(result);
}).catch(err => {
console.log(err);
res.sendStatus(500);
});
});
In the /getFileData request, here's the sequence of events:
Client sends request for http://somesite.com/getFileData
Incoming network event is processed by the OS
When node.js gets to the event loop, it sees an event for an incoming TCP connection on the port its http server is listening on, and calls a callback to process that request
The http library in node.js gets that request, parses it, and notifies the observers of that request, one of which will be the Express framework
The Express framework matches up that request with the above request handler and calls the request handler
That request handler starts to execute and calls fs.readFile("myfile.html", ...). Because that is asynchronous, calling the function just initiates the process (carrying out the first steps), registers its completion callback, and then immediately returns.
At this point, you can see from that /getFileData request handler that after it calls fs.readFile(), the request handler just returns. Until the callback is called, it has nothing else to do.
This returns control back to the nodejs event loop where nodejs can pick out the next event waiting to run and execute it.
In the /getNetworkData request, here's the sequence of events
Steps 1-5 are the same as above.
6. The request handler starts to execute and calls got("http://somesite.com/somepath"). That initiates a request to that endpoint and then immediately returns a promise. Then, the .then() and .catch() handlers are registered to monitor that promise.
7. At this point, you can see from that /getNetworkData request handler that after it calls got().then().catch(), the request handler just returns. Until the promise is resolved or rejected, it has nothing else to do.
8. This returns control back to the nodejs event loop where nodejs can pick out the next event waiting to run and execute it.
Now, sometime in the future, fs.readFile("myFile.html", ...) completes. At this point, some internal sub-system (that may use other native code threads) inserts a completion event in the node.js event loop.
When node.js gets back to the event loop, it will see that event and run the completion callback associated with the fs.readFile() operation. That will trigger the rest of the logic in that request handler to run.
Then, sometime in the future the network request from got("http://somesite.com/somepath") will complete and that will trigger an event in the event loop to call the completion callback for that network operation. That callback will resolve or reject the promise which will trigger the .then() or .catch() callbacks to be called and the second request will execute the rest of its logic.
Hopefully, you can see from these examples how request handlers initiate an asynchronous operation, then return control back to the interpreter, where the interpreter can then pull the next event from the event loop and run it. Then, as asynchronous operations complete, other things are inserted into the event loop, causing further progress on each request handler until eventually they are done with their work. So, multiple sections of code can be making progress without more than one piece of code ever running at the same time. It's essentially cooperative multi-tasking, where the time slicing between operations occurs at the boundaries of asynchronous operations, rather than the automatic pre-emptive time slicing of a fully threaded system.
Nodejs gets a number of advantages from this type of multi-tasking as it's a lot, lot lower overhead (cooperative task switching is a lot more efficient than time-sliced automatic task switching) and it also doesn't have most of the usual thread synchronization issues that true multi-threaded systems do which can make them a lot more complicated to code and/or more prone to difficult bugs.
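To make the cooperative interleaving visible, here is a tiny self-contained sketch (the handleRequest name is made up) that stands in for two overlapping requests:
function handleRequest(id) {
    console.log('request ' + id + ': start');
    setTimeout(() => {
        console.log('request ' + id + ': async work done');
    }, 100);
    console.log('request ' + id + ': handler returns');
}

handleRequest(1);
handleRequest(2);
// Output:
// request 1: start
// request 1: handler returns
// request 2: start
// request 2: handler returns
// request 1: async work done
// request 2: async work done
Both handlers start and return before either async result arrives; the callbacks are then run one at a time from the event loop.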
I'm creating a Node.js web processor. Its processing takes about 1 minute. I POST a job to my server and poll its status with GET.
This is my simplified code:
const express = require('express');

// Configure Express
const app = express();
app.listen(8080);

// Start the long-running process
app.post('/clean', async function(req, res, next) {
    // start process
    let result = await worker.process(data);
    // send result when finished
    res.send(result);
});

// Reply with the status when asked
app.get('/clean', async function(req, res, next) {
    res.send(worker.status);
});
The problem is that the server works so hard on the POST /clean processing that GET /clean requests are not replied to in time.
All GET /clean requests are answered only after the worker finishes its task and frees the processor to respond.
In other words, the application is unable to respond while under load.
How can I get around this situation?
Because node.js runs your Javascript single-threaded (only one piece of Javascript ever running at once) and does not time-slice, as long as your worker.process() is running its synchronous code, no other requests can be processed by your server. This is why worker.process() has to finish before any of the http requests that arrived while it was running get serviced. The node.js event loop is busy until worker.process() is done, so it can't service any other events (like incoming http requests).
These are some of the ways to work around that:
Cluster your app with the built-in cluster module so that you have a bunch of processes that can either work on worker.process() code or handle incoming http requests.
When it's time to call worker.process(), fire up a new node.js process, run the processing there, and communicate back the result with standard interprocess communication. Then, your main node.js process stays ready to handle incoming http requests near-instantly as they arrive.
Create a work queue of a group of additional node.js processes that run jobs that are put in the queue and configure these processes to be able to run your worker.process() code from the queue. This is a variation of #2 that bounds the number of processes and serializes the work into a queue (better controlled than #2).
Rework the way worker.process() does its work so that it can do a few ms of work at a time, then return back to the message loop so other events can run (like incoming http requests), and then resume its work afterwards for a few more ms at a time. This usually requires building some sort of stateful object that can do a little bit of work each time it is called, but it is often a pain to program effectively.
Note that #1, #2 and #3 all require that the work be done in other processes. That means that the process.status() will need to get the status from those other processes. So, you will either need some sort of interprocess way of communicating with the other processes or you will need to store the status as you go in some storage that is accessible from all processes (such as redis) so it can just be retrieved from there.
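As a rough sketch of option #2 (the worker-child.js script and its message format are hypothetical, not from the question):
const { fork } = require('child_process');

app.post('/clean', (req, res) => {
    const child = fork('./worker-child.js'); // hypothetical script holding the heavy code
    child.send({ data: req.body });          // hand the job over the built-in IPC channel
    child.on('message', (result) => res.send(result)); // reply once the child reports back
});

// worker-child.js (hypothetical) would look roughly like:
// process.on('message', (job) => {
//     const result = worker.process(job.data); // CPU-heavy, but in its own process
//     process.send(result);
// });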
There's no working around the single-threaded nature of JS short of converting your service to a cluster of processes or to use something experimental like Worker Threads.
If neither of these options work for you, you'll need to yield up the processing thread periodically to give other tasks the ability to work on things:
function workPart1() {
    // Do a bunch of stuff
    setTimeout(workPart2, 10);
}

function workPart2() {
    // More stuff
    setTimeout(workPart3, 10); // etc.
}
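A variant of the same idea: setImmediate() also yields back to the event loop, letting pending I/O events (such as incoming http requests) run, without the extra 10 ms pause that setTimeout adds:
function workPart1() {
    // Do a chunk of work, then yield so pending I/O callbacks can run
    setImmediate(workPart2);
}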
I have started reading a lot about Node.js lately, and one thing I cannot clearly understand, from a differentiation perspective, is the real difference between how I/O is handled by an asynchronous call versus a synchronous call.
As I understand it, in a multi-threaded synchronous environment, if I/O is started, the running thread is preempted and moves back to a waiting state. So essentially this is the same as what happens with a Node.js asynchronous I/O call. In Node.js, when I/O is called, the I/O operation is moved out of the currently running thread and sent to an event demultiplexer for completion and notification. As soon as the I/O is complete, the callback is pushed to the event queue for further processing.
So the only difference I see is that in Node.js we save memory (due to the call stacks owned by each thread) and CPU (saved because there is no context switching). If I just assume that I have enough memory to buy, does saving CPU due to context switching alone make the huge performance difference?
If my understanding above is not correct, how is I/O handling any different between a Java thread and Node.js with respect to keeping the CPU busy and not wasting CPU cycles? Are we saving only context-switching CPU cycles with Node.js, or is there more to it?
Based on the responses, I would like to add another scenario:
Request A and Request B come to a J2EE server at the same time. Each request takes 10 ms to complete in this multi-threaded environment. Of those 10 ms, 5 ms are spent executing code to compute some logic, and 5 ms are spent on I/O, pulling a large dataset from a DBMS. The call to the DBMS is the last line of the code, after which the response is sent to the client.
If the same application were converted to a Node.js application, this is what might happen:
Request A comes in; 5 ms is used for processing the request.
The DBMS call is made from the code, but it's non-blocking, so a callback is pushed to the event queue.
After 5 ms, Request B is served; request B also takes 5 ms for processing and is likewise pushed to the event queue for I/O completion.
The event loop runs, picks up the callback handler for request A, and sends the response to the client. So the response is sent after 10 ms, because Req A and Req B both took 5 ms of synchronous code-block processing.
Now, where is the time saved in such a scenario, apart from context switching and creating 2 threads? Req A and Req B both took 10 ms anyway with Node.js?
As I understand it, in a multi-threaded synchronous environment, if I/O is started, the running thread is preempted and moves back to a waiting state. So essentially this is the same as what happens with a Node.js asynchronous I/O call.
Nope. In Node.js an asynchronous I/O call is non-blocking I/O, which means that once the thread has made an I/O call it doesn't wait for the I/O to complete but moves on to the next statement/task to execute.
Once the I/O completes, it picks up the next task from the event loop queue and eventually executes the callback handler which was given to it when the I/O call was made.
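To make the difference concrete, here is a minimal sketch (the file path is arbitrary):
const fs = require('fs');

// Blocking: the thread waits here until the whole file has been read.
const data = fs.readFileSync('/etc/hosts');
console.log('sync read done');

// Non-blocking: the call returns immediately; the callback is picked up
// from the event loop queue once the I/O completes on the thread pool.
fs.readFile('/etc/hosts', (err, contents) => {
    console.log('async read done');
});
console.log('this line runs before the async read completes');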
If I just assume that I have enough memory to buy, does saving CPU due to context switching alone make the huge performance difference?
Apart from that, the savings also come from these two things:
Not having to wait for the I/O to complete before taking on other work.
Not having to create threads: threads are a limited resource, so the system's capacity is no longer bounded by how many threads it can create.
Apart from context switching and creating 2 threads, Req A and Req B both took 10 ms anyway with Node.js?
You are discounting one thing here: the thread receives the two requests one after the other, a specific interval apart. So if one thread is going to be busy for 10 seconds, it will take a new thread to execute the second request. Extrapolate this to thousands of requests, and your OS has to create thousands of threads to deal with that many concurrent users. Refer to this analogy.
Consider this simple example:
var BinaryServer = require('../../').BinaryServer;
var fs = require('fs');

// Start Binary.js server
var server = BinaryServer({port: 9000});

// Wait for new user connections
server.on('connection', function(client){
    // Stream a flower image!
    var file = fs.createReadStream(__dirname + '/flower.png');
    client.send(file);
    sleep_routine(5); // in seconds; stands in for a synchronous, CPU-heavy operation
});
When a client connects to the server, I block the event loop for about 5 seconds (imagine that time is spent on some complex operations). What is expected to happen if another client connects in the meantime? One thing that I read about Node.js is non-blocking I/O. But in this case the second client only receives the flower after the first client's sleep, right?
One thing that I read about Node.js is non-blocking I/O. But in this case the second client only receives the flower after the first client's sleep, right?
That's correct, assuming that you are doing blocking synchronous operations for five seconds straight. If you do any file system IO, or any IO for that matter, or use a setTimeout, then the other client will get its opportunity to use the thread and receive the flower image. So, if you're doing really heavy, CPU-intensive processing, you have a few choices:
Fire it off in a separate process that runs asynchronously, e.g. using the built-in child_process module
Keep track of how long you've been processing, and every 100 ms or so give up the thread by saving your state and using setTimeout to continue processing where you left off
Have multiple node processes already running, so that if one is busy there is another that can serve the second user (e.g. behind a load balancer, or using the cluster module)
I would recommend a combination of 1 and 3 if this is ever a problem; but so much of node can be made asynchronous that it rarely is. Even things like computing password hashes can be done asynchronously.
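For example, Node's built-in crypto.pbkdf2() runs its key derivation on libuv's thread pool, so even a deliberately expensive hash doesn't block the event loop (the parameters below are illustrative):
const crypto = require('crypto');

crypto.pbkdf2('password', 'some-salt', 100000, 64, 'sha512', (err, key) => {
    if (err) throw err;
    console.log('hash ready:', key.toString('hex'));
});
console.log('the event loop stays free while the hash is computed');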
No - the two requests will be handled independently. So if the first request had to wait 5 seconds, and for some reason the second request only took 2 seconds, the second would return before the first.
In practice your connection handler would be called a second time before the first one had finished. But since each connection has its own state, you would normally be unaware of that.