I am new to Node.js (and JS), so can you explain (or link to an explanation of) how to write a simple Node.js service that runs permanently?
I want to write a service that sends a request every second to an external API and stores the results in a DB.
So, does Node.js have some simple module for running a JS method (or methods) over and over again?
Or do I just have to write a while loop and do it there?
setInterval() is what you would use to run something every second in node.js. This will call a callback every NN milliseconds where you pass the value of NN.
var interval = setInterval(function() {
// execute your request here
}, 1000);
If you want this to run forever, you will need to also handle the situation where the remote server you are contacting is off-line or having issues and is not as responsive as normal (for example, it may take more than a second for a troublesome request to timeout or fail). It might be safer to repeatedly use setTimeout() to schedule the next request 1 second after one finishes processing.
function runRequest() {
issueRequest(..., function(err, data) {
// process request results here
// schedule next request
setTimeout(runRequest, 1000);
})
}
// start the repeated requests
runRequest();
In this code block, issueRequest() is just a placeholder for whatever networking operation you are doing (I personally would probably use the request() module and use request.get()).
Because your request processing is asynchronous, this will not actually be recursion and will not cause a stack buildup.
FYI, a node.js process with an active incoming server or an active timer or an in-process networking operation will not exit (it will keep running) so as long as you always have a timer running or a network request running, your node.js process will keep running.
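A minimal sketch of the setTimeout() approach with error handling added, so that one failed request does not end the loop. The names pollForever, doRequest and saveResult are hypothetical: doRequest stands in for whatever promise-returning call you use to reach the remote API, and saveResult for whatever writes to your DB.

```javascript
// Sketch: run a request, wait for it to settle, then schedule the next one.
// `doRequest` and `saveResult` are hypothetical, caller-supplied functions
// that return promises.
function pollForever(doRequest, saveResult, delayMs = 1000) {
  let timer = null;
  let stopped = false;

  async function runOnce() {
    try {
      const data = await doRequest();
      await saveResult(data);
    } catch (err) {
      // A failed request must not end the loop; log it and carry on.
      console.error("request failed:", err.message);
    } finally {
      // Schedule the next run only after this one has fully finished.
      if (!stopped) timer = setTimeout(runOnce, delayMs);
    }
  }

  runOnce();

  // Return a function that stops the loop (and lets the process exit).
  return function stop() {
    stopped = true;
    clearTimeout(timer);
  };
}
```

Because the next run is scheduled in the finally block, a slow or failing request delays the following one instead of piling up behind it.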
I think we need some help here. Thanks in advance.
I have been doing programming in .Net for desktop applications and have used Timer objects to wait for a task to complete before the task results are shown in a data grid. Recently, we switched over to NodeJs and find it pretty interesting. We could design a small application that executes some tasks using PowerShell scripts and returns the data to the client browser. However, I would have to run a Timer in the client browser (when someone clicks a button) to check whether the file that the Timer receives from the server contains "ENDOFDATA". Once the Timer sees ENDOFDATA, it triggers another function to populate a DIV with the data that was received from the server.
Is this the right way to get the data from a server? We really don't want to block EventLoop. We run PowerShell scripts on NodeJS to collect users from Active Directory and then send the data back to the client browser. The PowerShell scripts are executed as a Job so EventLoop is not blocked.
Here is an example of the code at NodeJs:
In the code below, can we insert something that won't block the EventLoop but still responds once the task is completed? As you can see, we would like to send the ADUsers.CSV file to the client browser once GetUsers.PS1 has finished executing. Since GetUsers.PS1 takes about five minutes to complete, the EventLoop is blocked and the server can no longer accept any other requests.
app.post("/LoadDomUsers", (request, response) => {
//we check if the request is an AJAX one and if accepts JSON
if (request.xhr || request.accepts("json, html") === "json") {
var ThisAD = request.body.ThisAD
console.log(ThisAD);
ps.addCommand("./public/ps/GetUsers.PS1", [{
name: 'AllParaNow',
value: ScriptPara
}])
ps.addCommand(`$rc = gc ` + __dirname + "/public/TestData/AD/ADUsers.CSV");
ps.addCommand(`$rc`);
ps.invoke().then((output) => {
response.send({ message: output });
console.log(output);
});
}
});
Thank you.
The way you describe your problem isn't that clear. I had to read some of the comments in your initial question just to be sure I understood the issue. Honestly, you could just utilize various CSV NPM packages to read and write from your active directory with NodeJS.
I/O is non-blocking in NodeJS, so you're not actually blocking the EventLoop. You can have multiple I/O requests in flight at once: NodeJS hands each one off to the operating system or to its internal thread pool and continues execution on the main thread. When an I/O operation completes, its callback is queued and then run on the main thread with the resulting data. After you get the I/O data, you just send it back to the client through the response object. There should be no timers needed.
So is the issue once the powershell script runs, you have to wait for that initial script to complete before being able to handle pending requests? I'm still a bit unclear...
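To illustrate the point about non-blocking I/O, here is a hedged sketch that launches an external script through the built-in child_process module and waits for it with a promise. It assumes the script can be started this way instead of through the ps object in the question; runScript is a hypothetical helper, and the "pwsh" executable name in the commented usage is an assumption about the machine.

```javascript
const { execFile } = require("child_process");

// Sketch: run an external command and resolve with its stdout.
// The event loop stays free while the child process runs.
function runScript(command, args) {
  return new Promise((resolve, reject) => {
    execFile(command, args, { maxBuffer: 10 * 1024 * 1024 }, (err, stdout) => {
      if (err) reject(err);
      else resolve(stdout);
    });
  });
}

// Hypothetical use inside the route from the question (assumes PowerShell is
// installed as "pwsh" and that the script prints its result to stdout):
//
// app.post("/LoadDomUsers", async (request, response) => {
//   try {
//     const output = await runScript("pwsh", ["-File", "./public/ps/GetUsers.PS1"]);
//     response.send({ message: output });
//   } catch (err) {
//     response.status(500).send({ error: err.message });
//   }
// });
```

While the script runs, the server can keep accepting other requests; the handler only resumes when the child process has finished.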
Does node js execute multiple commands in parallel or execute one command (and finish it!) and then execute the second command?
For example if multiple async functions use the same Stack, and they push & pop "together", can I get strange behaviour?
Node.js runs your main Javascript (excluding manually created Worker Threads for now) as a single thread. So, it's only ever executing one piece of your Javascript at a time.
But, when a server request contains asynchronous operations, what happens in that request handler is that it starts the asynchronous operation and then returns control back to the interpreter. The asynchronous operation runs on its own (usually in native code). While all that is happening, the JS interpreter is free to go back to the event loop and pick up the next event waiting to be run. If that's another incoming request for your server, it will grab that request and start running it. When it hits an asynchronous operation and returns back to the interpreter, the interpreter then goes back to the event loop for the next event waiting to run. That could either be another incoming request or it could be one of the previous asynchronous operations that is now ready to run its callback.
In this way, node.js makes forward progress on multiple requests at a time that involve asynchronous operations (such as networking, database requests, file system operations, etc...) while only ever running one piece of your Javascript at a time.
Starting with node v10.5, nodejs has Worker Threads. These are not automatically used by the system yet in normal service of networking requests, but you can create your own Worker Threads and run some amount of Javascript in a truly parallel thread. This probably isn't needed for code that is primarily I/O bound because the asynchronous nature of I/O in Javascript already gives it plenty of parallelism. But, if you had CPU-intensive operations (heavy crypto, image analysis, video compression, etc... that was done in Javascript), Worker Threads may definitely be worth adding for those particular tasks.
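As a rough illustration of running CPU-bound Javascript in a Worker Thread, here is a small sketch. The function name sumInWorker and the summing loop are made up for the example; the worker source is passed as a string (with the eval option) only so that the sketch fits in one file.

```javascript
const { Worker } = require("worker_threads");

// Source for the worker, inlined as a string for the sake of the example.
// The loop is a stand-in for real CPU-intensive work.
const workerSource = `
  const { parentPort, workerData } = require("worker_threads");
  let sum = 0;
  for (let i = 1; i <= workerData.n; i++) sum += i;
  parentPort.postMessage(sum);
`;

// Runs the loop on a separate thread and resolves with the result.
function sumInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { n } });
    worker.once("message", resolve);
    worker.once("error", reject);
    worker.once("exit", (code) => {
      if (code !== 0) reject(new Error(`worker exited with code ${code}`));
    });
  });
}
```

The main thread only waits on a promise here, so it remains free to handle other events while the worker computes.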
To show you an example, let's look at two request handlers, one that reads a file from disk and one that fetches some data from a network endpoint.
app.get("/getFileData", (req, res) => {
fs.readFile("myFile.html", function(err, html) {
if (err) {
console.log(err);
res.sendStatus(500);
} else {
res.type('html').send(html);
}
})
});
app.get("/getNetworkData", (req, res) => {
got("http://somesite.com/somepath").then(result => {
res.json(result);
}).catch(err => {
console.log(err);
res.sendStatus(500);
});
});
In the /getFileData request, here's the sequence of events:
1. Client sends request for http://somesite.com/getFileData
2. Incoming network event is processed by the OS
3. When node.js gets to the event loop, it sees an event for an incoming TCP connection on the port its http server is listening on and calls a callback to process that request
4. The http library in node.js gets that request, parses it, and notifies the observers of that request, one of which will be the Express framework
5. The Express framework matches up that request with the above request handler and calls the request handler
6. That request handler starts to execute and calls fs.readFile("myFile.html", ...). Because that is asynchronous, calling the function just initiates the process (carrying out the first steps), registers its completion callback and then it immediately returns.
7. At this point, you can see from that /getFileData request handler that after it calls fs.readFile(), the request handler just returns. Until the callback is called, it has nothing else to do.
8. This returns control back to the nodejs event loop where nodejs can pick out the next event waiting to run and execute it.
In the /getNetworkData request, here's the sequence of events:
Steps 1-5 are the same as above.
6. The request handler starts to execute and calls got("http://somesite.com/somepath"). That initiates a request to that endpoint and then immediately returns a promise. Then, the .then() and .catch() handlers are registered to monitor that promise.
7. At this point, you can see from that /getNetworkData request handler that after it calls got().then().catch(), the request handler just returns. Until the promise is resolved or rejected, it has nothing else to do.
8. This returns control back to the nodejs event loop where nodejs can pick out the next event waiting to run and execute it.
Now, sometime in the future, fs.readFile("myFile.html", ...) completes. At this point, some internal sub-system (that may use other native code threads) inserts a completion event in the node.js event loop.
When node.js gets back to the event loop, it will see that event and run the completion callback associated with the fs.readFile() operation. That will trigger the rest of the logic in that request handler to run.
Then, sometime in the future the network request from got("http://somesite.com/somepath") will complete and that will trigger an event in the event loop to call the completion callback for that network operation. That callback will resolve or reject the promise which will trigger the .then() or .catch() callbacks to be called and the second request will execute the rest of its logic.
Hopefully, you can see from these examples how request handlers initiate an asynchronous operation, then return control back to the interpreter where the interpreter can then pull the next event from the event loop and run it. Then, as asynchronous operations complete, other things are inserted into the event loop causing further progress to run on each request handler until eventually they are done with their work. So, multiple sections of code can be making progress without more than one piece of code ever running at the same time. It's essentially cooperative multi-tasking where the time slicing between operations occurs at the boundaries of asynchronous operations, rather than an automatic pre-emptive time slicing in a fully threaded system.
Nodejs gets a number of advantages from this type of multi-tasking as it's a lot, lot lower overhead (cooperative task switching is a lot more efficient than time-sliced automatic task switching) and it also doesn't have most of the usual thread synchronization issues that true multi-threaded systems do which can make them a lot more complicated to code and/or more prone to difficult bugs.
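The interleaving described above can be seen in a small simulation. This is only a sketch: the two "handlers" are plain functions, and timers stand in for the file and network operations.

```javascript
// Sketch: two pretend request handlers. Each starts an asynchronous
// operation (a timer standing in for I/O) and returns immediately.
function simulateTwoRequests() {
  const log = [];

  function handler(name, ioMs) {
    log.push(`${name}: handler started`);
    const done = new Promise((resolve) => {
      setTimeout(() => {
        log.push(`${name}: callback ran`);
        resolve();
      }, ioMs);
    });
    log.push(`${name}: handler returned`);
    return done;
  }

  // Request A's "I/O" is slow (50 ms), request B's is fast (10 ms).
  return Promise.all([handler("A", 50), handler("B", 10)]).then(() => log);
}

// Usage: simulateTwoRequests().then((log) => console.log(log.join("\n")));
```

Both handlers start and return before either callback runs, and B's callback runs before A's even though A was started first. Only one piece of Javascript runs at any moment.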
I'm creating a nodejs web processor. Its processing takes about 1 minute. I POST to my server and get the status by using GET.
This is my simplified code:
// Configure Express
const app = express();
app.listen(8080);
// Console
app.post('/clean', async function(req, res, next) {
// start process
let result = await worker.process(data);
// Send result when finish
res.send(result);
});
// reply with the status when asked
app.get('/clean', async function(req, res, next) {
res.send(worker.status);
});
The problem is that the server is working so hard in the POST /clean process that GET /clean requests are not answered in time.
All GET /clean requests are answered only after the worker finishes its task and frees the processor to respond to them.
In other words, the application is unable to respond during the workload.
How can I get around this situation?
Because node.js runs your Javascript as single threaded (only one piece of Javascript ever running at once) and does not time slice, as long as your worker.process() is running its synchronous code, no other requests can be processed by your server. This is why worker.process() has to finish before any of the http requests that arrived while it was running get serviced. The node.js event loop is busy until worker.process() is done so it can't service any other events (like incoming http requests).
These are some of the ways to work around that:
1. Cluster your app with the built-in cluster module so that you have a bunch of processes that can either work on worker.process() code or handle incoming http requests.
2. When it's time to call worker.process(), fire up a new node.js process, run the processing there and communicate back the result with standard interprocess communication. Then, your main node.js process stays ready to handle incoming http requests near instantly as they arrive.
3. Create a work queue of a group of additional node.js processes that run jobs that are put in the queue and configure these processes to be able to run your worker.process() code from the queue. This is a variation of #2 that bounds the number of processes and serializes the work into a queue (better controlled than #2).
4. Rework the way worker.process() does its work so that it can do a few ms of work at a time, then return back to the message loop so other events can run (like incoming http requests) and then resume its work afterwards for a few more ms at a time. This usually requires building some sort of stateful object that can do a little bit of work at a time each time it is called, but is often a pain to program effectively.
Note that #1, #2 and #3 all require that the work be done in other processes. That means that the GET handler that returns worker.status will need to get the status from those other processes. So, you will either need some sort of interprocess way of communicating with the other processes or you will need to store the status as you go in some storage that is accessible from all processes (such as redis) so it can just be retrieved from there.
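A hedged sketch of option #2, with the status reported back over the built-in IPC channel. Everything named here (runJobInChildProcess, the message shapes, the five-step loop) is invented for illustration, and the child's source is inlined as a string only to keep the example in one file; in a real project it would normally live in its own module.

```javascript
const { spawn } = require("child_process");

// Source of the child process. It does five steps of pretend CPU-heavy
// work and reports progress to the parent after each step.
const childSource = `
  const total = 5;
  let value = 0;
  for (let step = 1; step <= total; step++) {
    for (let i = 0; i < 1e6; i++) value += i % 7; // stand-in for real work
    process.send({ type: "status", done: step, total });
  }
  process.send({ type: "result", value });
`;

// Runs the job in a separate Node.js process. `status` is updated as
// progress messages arrive, so a GET handler could return it at any time.
function runJobInChildProcess(status) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ["-e", childSource], {
      // The fourth entry asks Node.js to open an IPC channel to the child.
      stdio: ["ignore", "inherit", "inherit", "ipc"],
    });
    child.on("message", (msg) => {
      if (msg.type === "status") {
        status.done = msg.done;
        status.total = msg.total;
      } else if (msg.type === "result") {
        resolve(msg.value);
      }
    });
    child.on("error", reject);
    child.on("exit", (code) => {
      if (code !== 0) reject(new Error(`child exited with code ${code}`));
    });
  });
}
```

In the question's terms, the POST handler would await runJobInChildProcess(status) and the GET handler would simply send the status object, which the main process can do immediately because the heavy loop runs elsewhere.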
There's no working around the single-threaded nature of JS short of converting your service to a cluster of processes or using Worker Threads (experimental when first introduced in Node v10.5, stable in later releases).
If neither of these options work for you, you'll need to yield up the processing thread periodically to give other tasks the ability to work on things:
function workPart1() {
// Do a bunch of stuff
setTimeout(workPart2, 10);
}
function workPart2() {
// More stuff
setTimeout(workPart3, 10); // etc.
}
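The same yielding idea can be written as a loop instead of a chain of named functions. This is a sketch under assumptions: processInChunks, handleItem and the status object are hypothetical names, and setImmediate is used in place of setTimeout to hand control back to the event loop between chunks.

```javascript
// Sketch: process `items` in small chunks, yielding to the event loop
// between chunks so that other events (such as HTTP requests) can run.
// `status` is a plain object that a status endpoint could read.
async function processInChunks(items, handleItem, status, chunkSize = 100) {
  const results = [];
  status.total = items.length;
  status.done = 0;
  for (let start = 0; start < items.length; start += chunkSize) {
    for (const item of items.slice(start, start + chunkSize)) {
      results.push(handleItem(item));
      status.done++;
    }
    // Give up the thread; resume on a later turn of the event loop.
    await new Promise((resolve) => setImmediate(resolve));
  }
  return results;
}
```

The chunk size controls the trade-off: smaller chunks keep the server more responsive, larger chunks finish the job with less scheduling overhead.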
I am new to Node and I am writing my very first node server. It should answer to a simple get request with a simple page after calling a backend rest service.
I am using express to manage the request and the axios package to make the backend request. The problem is that the server is blocking the event loop and I have problems understanding how to make the call to the backend asynchronous.
As of now the frontend server can only manage one request at a time!! I expected that if the backend service takes 10 seconds to answer every time, the frontend server could answer two concurrent requests in 10 seconds and not in 20 seconds.
Where am I wrong?
Here is an extract of the frontend node code:
app.get('/', function(req, res) {
//Making the call to the backend service. This should be asynchronous...
axios.post(env.get("BACKEND_SERVICE"),
{ "user": "some kind of input"})
.then(function(response){
//do something with the data returned from the backend...
res.render('homepage');
})
});
And here is an extract of the backend node code:
app.post('/api/getTypes', jsonParser, function (req, res) {
console.log("> API request for 'api/getTypes' SLEEP");
var now = new Date().getTime();
while(new Date().getTime() < now + 10000){ /* do nothing */ }
console.log("> API request for 'api/getTypes' WAKE-UP");
res.json({"types":"1"});
});
The problem is your busy-wait ties up the backend server such that it can't even begin to process the second request.
I assume you're trying to simulate the process of getting the types taking a while. Odds are what you're going to be doing to get the types will be async and I/O-bound (reading files, querying a database, etc.). To simulate that, just use setTimeout:
app.post('/api/getTypes', jsonParser, function (req, res) {
console.log("> API request for 'api/getTypes' SLEEP");
setTimeout(function() {
console.log("> API request for 'api/getTypes' WAKE-UP");
res.json({"types":"1"});
}, 10000);
});
That avoids hogging the backend server's only thread, leaving it free to start overlapping handling for the second (third, fourth, ...) request.
This is one of the key principles of Node: Don't do things synchronously if you can avoid it. :-) That's why the API is so async-oriented.
If you do find at some point that you have heavy CPU-burning crunching you need to do to process a request, you might spin it off as a child process of the server rather than doing it in the server process. Node is single-threaded by design, achieving very high throughput via an emphasis on asynchronous I/O. Which works great for most of what you need to do...until it doesn't. :-)
Re your comment:
The backend process will be written in another technology other than node, it will call a DB and it could take a while. I wrote that simple node rest service to simulate that. What I would like to understand is how the frontend server will react if the backend takes time to process the requests.
There's a big difference between taking time to process the requests and tying up the only server thread busy-waiting (or doing massive CPU-heavy work). Your busy-wait models doing massive CPU-heavy work, but if getting the types is going to be external to Node, you won't be busy-waiting on it, you'll be queuing a callback for an asynchronous completion (waiting for I/O from a child process, or I/O from a socket connected to a third server process, or waiting on I/O from the DB, etc.). So the setTimeout example above is a better model for what you'll really be doing.
The busy-wait keeps the front-end from completing because it goes like this:
                           Backend
Time    Frontend           Queue                       Backend
−−−−    −−−−−−−−−−         −−−−−−−−−−−−−−−−−−−−−       −−−−−−−−−−−−−−−−−−−−−
0 sec   Request #1 −−−−−−> Receive request #1 −−−−−>   Pick up job for request #1
0 sec   Request #2 −−−−−−> Receive request #2
                                                       Busy wait 10 seconds
10 sec  Got #1 back <−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−− Send response #1
                                              −−−−−>   Pick up job for request #2
                                                       Busy wait 10 seconds
20 sec  Got #2 back <−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−− Send response #2
So even though the front-end isn't busy-waiting, it sees 20 seconds go by because the backend busy-waits (unable to do anything else) for 10 seconds for each request.
But that's not what your real setup will do, unless the other technology you're using is also single-threaded. (If it is, you may want to have more than one of them run in parallel.)
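The difference between the two backend behaviours can be measured without any HTTP at all. The following is a simulation only: busyHandler, timerHandler and timeRequests are hypothetical names, and setImmediate is used to mimic requests being delivered one at a time from the event queue.

```javascript
// Handler that blocks the thread, like the while loop in the question.
function busyHandler(ms, respond) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* do nothing */ }
  respond();
}

// Handler that waits without blocking, like the setTimeout version.
function timerHandler(ms, respond) {
  setTimeout(respond, ms);
}

// Delivers `count` simulated requests to `handler` and resolves with the
// total time in milliseconds until every response has been sent.
function timeRequests(handler, count, ms) {
  return new Promise((resolve) => {
    const started = Date.now();
    let pending = count;
    for (let i = 0; i < count; i++) {
      // Each request is queued as its own event, as the event loop would do.
      setImmediate(() =>
        handler(ms, () => {
          if (--pending === 0) resolve(Date.now() - started);
        })
      );
    }
  });
}

// Usage:
// timeRequests(busyHandler, 2, 100).then((t) => console.log("busy:", t, "ms"));
// timeRequests(timerHandler, 2, 100).then((t) => console.log("timer:", t, "ms"));
```

With two requests of 100 ms each, the busy-wait version should take roughly 200 ms in total and the timer version roughly 100 ms, which is the same 20 seconds versus 10 seconds effect at a smaller scale.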
Consider this simple example:
var BinaryServer = require('../../').BinaryServer;
var fs = require('fs');
// Start Binary.js server
var server = BinaryServer({port: 9000});
// Wait for new user connections
server.on('connection', function(client){
// Stream a flower image!
var file = fs.createReadStream(__dirname + '/flower.png');
client.send(file);
sleep_routine(5);//in seconds
});
When a client connects to the server I block the event loop for about 5 seconds (imagine that time is spent on some complex operations). What is expected to happen if another client connects in the meantime? One thing that I read about NodeJS is non-blocking I/O. But in this case the second client only receives the flower after the sleeping of the first, right?
One thing that I read about NodeJS is non-blocking I/O. But in this case the second client only receive the flower after the sleeping of the first, right?
That's correct, assuming that you are doing blocking synchronous operations for five seconds straight. If you do any file system IO, or any IO for that matter, or use a setTimeout, then the other client will get their opportunity to use the thread and get the flower image. So, if you're doing really heavy cpu intensive processing, you have a few choices:
1. Fire it off in a separate process that runs asynchronously, e.g. using the built-in child_process module
2. Keep track of how long you've been processing for and every 100ms or so give up the thread by saving your state, and then using setTimeout to continue processing where you left off
3. Have multiple node processes already running, so that if one is busy there is another that can serve the second user (e.g. behind a load balancer, or using the cluster module)
I would recommend a combination of 1 and 3 if this is ever a problem; but so much of node can be made asynchronous that it rarely is. Even things like computing password hashes can be done asynchronously.
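As one example of the password hashing point, the built-in crypto module offers an asynchronous crypto.pbkdf2() alongside the blocking crypto.pbkdf2Sync(). The sketch below wraps it in a promise; hashPassword is a hypothetical helper, and the iteration count, key length and digest are illustrative values rather than a recommendation.

```javascript
const crypto = require("crypto");
const { promisify } = require("util");

const pbkdf2 = promisify(crypto.pbkdf2);

// Derives a key from a password without blocking the event loop; the
// expensive work runs on a background thread managed by Node.js.
async function hashPassword(password, salt = crypto.randomBytes(16)) {
  // 100000 iterations, 64 bytes and SHA-512 are example values only.
  const key = await pbkdf2(password, salt, 100000, 64, "sha512");
  return { salt: salt.toString("hex"), hash: key.toString("hex") };
}

// Usage:
// hashPassword("correct horse").then(({ salt, hash }) => console.log(salt, hash));
```

While the hash is being computed, the server thread is free to send the flower image to the next client.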
No - the two requests will be handled independently. So if the first request had to wait 5 seconds, and for some reason the second request only took 2 seconds, the second would return before the first.
In practice, your server's connection handler would be called a second time before the first call had finished. But since each call has its own state, you would normally be unaware of that.