I have a large array (say, over 1,000,000 elements) that I would like to sort asynchronously so that it doesn't block the execution of the rest of my program.
I'm fairly new to JavaScript, so I was wondering if this would work.
var sortFunction = function (arr) {
  return new Promise(function (resolve, reject) {
    arr.sort();
    resolve(arr);
  });
};
sortFunction(hugeArray).then(function (arr) {
  // do something
});
This is possible, but experimental. A promise alone won't do it: arr.sort() would still run on the main thread and block it. You need to offload the work to a separate thread, using a web worker together with the sparsely supported but useful SharedArrayBuffer.
You have to use both: a web worker on its own won't help, because serializing the array to pass it to the worker would be too expensive. Sharing the memory through a SharedArrayBuffer avoids the copy entirely.
Here is an example gist on how to perform this.
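As a rough illustration of the idea (not the gist referred to above), here is a minimal sketch that assumes Node.js and its worker_threads module rather than a browser web worker. The helper name sortInWorker and the inline worker source are made up for this example.

```javascript
const { Worker } = require('node:worker_threads');

// Worker source kept inline (eval: true) so the sketch fits in a single file.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  // View over the same memory the main thread sees; nothing is copied.
  const view = new Float64Array(workerData);
  view.sort();
  parentPort.postMessage('done');
`;

// Hypothetical helper: sorts the numbers stored in a SharedArrayBuffer off the main thread.
function sortInWorker(sharedBuffer) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: sharedBuffer });
    worker.once('message', () => resolve(new Float64Array(sharedBuffer)));
    worker.once('error', reject);
  });
}

// Small input for demonstration; a real use case would hold millions of numbers.
const shared = new SharedArrayBuffer(5 * Float64Array.BYTES_PER_ELEMENT);
new Float64Array(shared).set([5, 3, 1, 4, 2]);

const sorted = sortInWorker(shared);
sorted.then((view) => console.log(Array.from(view)));
```

This only works for numeric data that fits in a typed array; an array of arbitrary objects cannot be placed in a SharedArrayBuffer.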
Related
How do the NodeJS built in functions achieve their asynchronicity?
Am I able to write my own custom asynchronous functions that execute outside of the main thread? Or do I have to leverage the built in functions?
Just a side note: "true asynchronous" doesn't really mean anything. But we can assume you mean parallelism?
Now, depending on what you're doing, you might find there is little to no benefit in using threads in Node. Take Node's file system API, for example: as long as you don't use the sync versions, it automatically runs multiple requests in parallel, because Node just passes these requests to worker threads.
That's why saying Node is single-threaded is actually incorrect; only the JS engine is. You can even prove this by looking at the number of threads a Node.js process uses in your process monitor of choice.
So then you might ask: why do we have worker threads in Node? Well, the V8 JS engine that Node uses is pretty fast these days, so let's say you wanted to calculate pi to a million digits using JS. You could do this in the main thread, splitting the work into chunks so that it doesn't block. But it would be a shame not to use the extra CPU cores that modern PCs have, and a worker lets the main thread keep doing other things while pi is being calculated inside another thread.
So what about file I/O in Node, would it benefit from being in a worker thread? Well, this depends on what you do with the result of the file I/O. If you were just reading and then writing blocks of data, then no, there would be no benefit. But if, say, you were reading files and then doing some heavy calculations on them with JavaScript (e.g. some custom image compression), then again a worker thread would help.
So in a nutshell, worker threads are great when you need to use JavaScript for heavy calculations. Using them for simple I/O may in fact slow things down, due to IPC overheads.
You don't mention in your question what you're trying to run in parallel, so it's hard to say if doing so would be of benefit.
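To make the "heavy calculation" case concrete, here is a small sketch using Node's worker_threads. The prime-counting task and the name countPrimesInWorker are assumptions chosen for illustration; any CPU-bound JavaScript would do.

```javascript
const { Worker } = require('node:worker_threads');

// Hypothetical CPU-heavy task: count primes below a limit by trial division.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  let count = 0;
  for (let n = 2; n < workerData; n++) {
    let isPrime = true;
    for (let d = 2; d * d <= n; d++) {
      if (n % d === 0) { isPrime = false; break; }
    }
    if (isPrime) count++;
  }
  parentPort.postMessage(count);
`;

function countPrimesInWorker(limit) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: limit });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

// The main thread is not blocked: this timer can keep firing while the worker computes.
let ticks = 0;
const timer = setInterval(() => { ticks++; }, 10);

const primeCount = countPrimesInWorker(1000000).then((count) => {
  clearInterval(timer);
  console.log(`primes found: ${count}, timer ticks meanwhile: ${ticks}`);
  return count;
});
```

Running the same loop directly on the main thread would give the same count, but the timer would not fire until the loop had finished.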
JavaScript is single-threaded. If you want to create a thread, you can use worker_threads: https://nodejs.org/api/worker_threads.html.
But you may have heard about async functions and promises in JavaScript. An async function returns a promise by default, and promises are NOT threads. You can create an async function like this:
async function toto() {
return 0;
}
toto().then((d) => console.log(d));
console.log('hello');
This displays hello, then 0.
But remember that the .then() callback is not running in parallel; because it belongs to a promise, it is just executed later, on the same thread.
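A short sketch of what "executed later, not in parallel" means in practice: CPU-heavy work inside an async function still runs on the calling thread. The busyWait helper is made up for this illustration.

```javascript
// Busy-wait for roughly `ms` milliseconds to simulate CPU-heavy work.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* spin */ }
}

const order = [];

async function heavy() {
  order.push('heavy start');
  busyWait(50); // runs synchronously, on the calling thread
  order.push('heavy end');
  return 0;
}

const finished = heavy().then((d) => {
  order.push(`then ${d}`);
  return order;
});
order.push('after call'); // only reached once heavy() has finished its busy loop

finished.then((o) => console.log(o.join(' -> ')));
```

The line after the call is not reached until the busy loop is over, so wrapping work in a promise defers only the .then() callback, not the work itself.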
I have a question regarding this topic:
bcrypt.compare() is asynchronous, does that necessarily mean that delays are certain to happen?
Since my membership level doesn't allow me to comment, I had to open a new topic.
My question is: what are the downsides, if any, of using bcrypt.compareSync() instead of the async bcrypt.compare()?
compareSync() definitely gives the correct result. So why not use it rather than compare() wrapped in promises? Is it going to stop Node.js from serving other users?
The reasons to use the async methods instead of the sync ones are explained quite well in the project's readme.
Why is async mode recommended over sync mode?
If you are using bcrypt on a simple script, using the sync mode is perfectly fine. However, if you are using bcrypt on a server, the async mode is recommended. This is because the hashing done by bcrypt is CPU intensive, so the sync version will block the event loop and prevent your application from servicing any other inbound requests or events. The async version uses a thread pool which does not block the main event loop.
https://github.com/kelektiv/node.bcrypt.js#why-is-async-mode-recommended-over-sync-mode
So if you are using this in a web application or another environment where you don't want to block the main thread, you should use the async version.
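The same sync/async split can be seen without installing bcrypt. The sketch below swaps in Node's built-in crypto.pbkdf2 and crypto.pbkdf2Sync as a stand-in; it is not bcrypt's API, and the password, salt and iteration count are placeholder values for illustration only.

```javascript
const crypto = require('node:crypto');

// Assumed parameters for illustration only; not a recommendation for real password storage.
const password = 'secret';
const salt = 'salt';
const iterations = 100000;

// Sync variant: the calling thread is blocked until hashing finishes.
const syncKey = crypto.pbkdf2Sync(password, salt, iterations, 32, 'sha256');

// Async variant: hashing runs on libuv's thread pool; the callback fires later.
function hashAsync() {
  return new Promise((resolve, reject) => {
    crypto.pbkdf2(password, salt, iterations, 32, 'sha256', (err, key) => {
      if (err) reject(err);
      else resolve(key);
    });
  });
}

const events = [];
const asyncKey = hashAsync().then((key) => {
  events.push('hash done');
  return key;
});
events.push('event loop free'); // runs before the hash completes
```

Both variants produce the same key; the difference is only whether the event loop can handle other requests in the meantime.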
Many native Node.js methods have Sync counterparts, such as fs.writeFileSync, crypto.hkdfSync and child_process.execSync. In the browser, native functions that would otherwise block the thread are implemented asynchronously, but the Sync methods in Node.js really do block the thread until the task is complete.
When you use callbacks or promises in Node.js, and only asynchronous logic runs internally, you can manage several asynchronous tasks while getting on with other work, without stopping the main thread (using a counter for callbacks, or Promise.all for promises).
A Sync method runs the next line only after its work is done, so the order of execution is easy to follow and the code is easy to write. However, the main thread is blocked, so you can't do more than one task at a time.
Consider the following example.
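const fs = require('fs');

// Reads the files one after another, blocking the main thread each time.
const syncFunc = () => {
  for (let i = 0; i < 100; i++) fs.readFileSync(`/files/${i}.txt`);
  console.log('sync done');
};
// Starts all 100 reads at once and waits for them together.
const promiseFunc = async () => {
  await Promise.all(Array.from({length: 100}, (_, i) => fs.promises.readFile(`/files/${i}.txt`)));
  console.log('promise done');
};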
The promise function finishes much faster, provided all 100 txt files can be read without problems.
The same applies to libraries written in C or C++. If you look at the following code, you can see how the two implementations differ in C++.
compare
compareSync
In conclusion, I think it's a matter of choice. There is no problem using a Sync method if your code runs on a single thread and it doesn't matter if the main thread is blocked (like a simple macro). However, if you are writing logic where performance matters, such as a server, and the main thread should stop as little as possible, choose promises or callbacks.
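The counter approach for callbacks mentioned in this answer can be sketched roughly as follows. The readAll helper and the temporary files are assumptions made up for the example.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Create a few throwaway files so the sketch is self-contained.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'count-demo-'));
const files = [0, 1, 2].map((i) => {
  const file = path.join(dir, `${i}.txt`);
  fs.writeFileSync(file, `content ${i}`);
  return file;
});

// Start every read at once and count completions; call `done` after the last one.
function readAll(paths, done) {
  const results = new Array(paths.length);
  let pending = paths.length;
  let failed = false;
  if (pending === 0) return done(null, results);
  paths.forEach((p, i) => {
    fs.readFile(p, 'utf8', (err, text) => {
      if (failed) return; // an earlier read already reported an error
      if (err) { failed = true; return done(err); }
      results[i] = text;
      if (--pending === 0) done(null, results);
    });
  });
}

// Wrapped in a promise here only so the result is easy to consume.
const allRead = new Promise((resolve, reject) => {
  readAll(files, (err, texts) => (err ? reject(err) : resolve(texts)));
});
allRead.then((texts) => console.log(texts));
```

Promise.all does this bookkeeping for you, which is one reason the promise version in the earlier example is shorter.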
I am trying to read multiple JSON files simultaneously and create a single array using the data available in the files and do some processing with the created data array in the Node.js server.
I would like to read these files and do the processing tasks simultaneously using web workers.
I read a few interesting tutorials and articles about the subject, but no one clearly explains how to process simultaneous tasks using web workers.
They talk about running a single separated task from the main thread. But I need to do multiple tasks at once.
I also know that creating multiple workers is not recommended according to the documentation of Node.js.
Maybe I have a misunderstanding of how the web worker is functioning or with the implementation in order to perform multiple tasks.
I also tried the great threads.js library (https://threads.js.org/), but its documentation is still unclear about running multiple tasks.
Can anyone please explain the best-practice way of implementing this kind of work, along with the pros and cons?
I would prefer a vanilla JS solution rather than using a library, so the explanation can also serve as a reference for readers.
Also, if possible, could someone explain the usage of the threads.js library as well, for future reference?
Thank you very much.
As I'm sure you have read, Node runs your JavaScript on a single thread, so reading several files does not by itself mean your processing code runs in parallel. Worker threads can run in parallel, but they are not designed for I/O like this.
A worker thread is meant for longer, more process-intensive functions that you want to hand off so they don't block the main event loop. Think of uploading and processing an image: we don't want to hang the entire event loop while the image is processed, so we pass it off to a worker thread, which tells the event loop when it's done and returns the response.
I think what you may be looking to do is just create a promise. Say you have an array of JSON file names like ["file1.JSON", "file2.JSON"]. In your promise you would loop over it, read the contents of each file, parse the JSON, and insert or concat the result into the main array variable.
Once the promise resolves, you would use .then() to continue:
.then(() => { /* do your processing of the full array */ })
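A minimal sketch of that approach, assuming each file holds a JSON array. The temporary files and the loadAll helper are made up for illustration.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical input: two small JSON files, each holding an array.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'json-demo-'));
const fileNames = ['file1.json', 'file2.json'].map((name) => path.join(dir, name));
fs.writeFileSync(fileNames[0], JSON.stringify([1, 2]));
fs.writeFileSync(fileNames[1], JSON.stringify([3]));

// Start all reads at once, then parse and flatten into a single array.
async function loadAll(paths) {
  const texts = await Promise.all(paths.map((p) => fs.promises.readFile(p, 'utf8')));
  return texts.flatMap((text) => JSON.parse(text));
}

const combined = loadAll(fileNames);
combined.then((data) => {
  // Do your processing of the full array here.
  console.log(data);
});
```

The reads overlap because Node hands them to its own thread pool; only the parsing and processing run on the main JavaScript thread.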
Here's an example with a library (node-worker-threads-pool).
Thread/worker management is a complex endeavor, and I would not recommend trying to have some generic solution. Even the library I'm suggesting may not be correct.
// sample.js
const { StaticPool } = require('node-worker-threads-pool');
const start = async function () {
  const staticPool = new StaticPool({
    size: 4,
    task: async function(n) {
      const sleep = async function (ms) {
        return new Promise(resolve => setTimeout(resolve, ms));
      }
      console.log(`thread ${n} started`);
      await sleep(1000 * n);
      return n + 1
    }
  });
  // start 4 workers, each will run asynchronously and take a longer time to finish
  for (let index = 0; index < 4; index++) {
    staticPool.exec(index)
      .then((result) => {
        console.log(`result from thread pool for thread ${index}: ${result}`);
      })
      .catch((err) => console.error(`Error: ${err}`));
  }
}
start();
I ran this with node sample.js (after installing the library with npm).
As discussed in the other answer, it may not be useful (in terms of performance) to do this, but this example shows how it can be done.
The library also has examples where you give the tasks specific work.
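For readers who want the vanilla version asked for in the question, here is a rough sketch using only Node's built-in worker_threads. It starts one worker per task rather than using a pool, and the summing task and the runTask helper are assumptions chosen to keep the example short.

```javascript
const { Worker } = require('node:worker_threads');

// Hypothetical per-task work: sum the numbers in a chunk.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  parentPort.postMessage(workerData.reduce((sum, n) => sum + n, 0));
`;

// Runs one task in its own worker and resolves with the worker's result.
function runTask(chunk) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: chunk });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', (code) => {
      if (code !== 0) reject(new Error(`worker exited with code ${code}`));
    });
  });
}

// Three tasks run in three workers at the same time; results keep the input order.
const chunks = [[1, 2, 3], [4, 5], [6]];
const totals = Promise.all(chunks.map((chunk) => runTask(chunk)));
totals.then((sums) => console.log(sums));
```

Starting a worker has a noticeable cost, so for many small tasks a fixed-size pool like the one in the library example is usually the better fit.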
It's a very general question, but I don't quite understand. When would I prefer one over the other? I don't seem to understand what situations might arise, which would clearly favour one over the other. Are there strong reasons to avoid x / use x?
When would I prefer one over the other?
In a server intended to scale and serve the needs of many users, you would only use synchronous I/O during server initialization. In fact, require() itself uses synchronous I/O. In all other parts of your server that handle incoming requests once the server is already up and running, you would only use asynchronous I/O.
There are other uses for node.js besides creating a server. For example, suppose you want to create a script that will parse through a giant file and look for certain words or phrases. And, this script is designed to run by itself to process one file and it has no persistent server functionality and it has no particular reason to do I/O from multiple sources at once. In that case, it's perfectly OK to use synchronous I/O. For example, I created a node.js script that helps me age backup files (removing backup files that meet some particular age criteria) and my computer automatically runs that script once a day. There was no reason to use asynchronous I/O for that type of use so I used synchronous I/O and it made the code simpler to write.
I don't seem to understand what situations might arise, which would clearly favour one over the other. Are there strong reasons to avoid x / use x?
Avoid ever using synchronous I/O in the request handlers of a server. Because of the single threaded nature of Javascript in node.js, using synchronous I/O blocks the node.js Javascript thread so it can only do one thing at a time (which is death for a multi-user server) whereas asynchronous I/O does not block the node.js Javascript thread (allowing it to potentially serve the needs of many users).
In non-multi-user situations (code that is only doing one thing for one user), synchronous I/O may be favored because writing the code is easier and there may be no advantages to using asynchronous I/O.
I thought of an electron application with nodejs, which is simply reading a file and did not understand what difference that would make really, if my software really just has to wait for that file to load anyways.
If this is a single user application and there's nothing else for your application to be doing while waiting for the file to be read into memory (no sockets to be responding to, no screen updates, no other requests to be working on, no other file operations to be running in parallel), then there is no advantage to using asynchronous I/O so synchronous I/O will be just fine and likely a bit simpler to code.
When would I prefer one over the other?
Use the non-Sync versions (the async ones) unless there's literally nothing else you need your program to do while the I/O is pending, in which case the Sync ones are fine; see below for details...
Are there strong reasons to avoid x / use x?
Yes. NodeJS runs your JavaScript code on a single thread. If you use the Sync version of an I/O function, that thread is blocked waiting on I/O and can't do anything else. If you use the async version, the I/O can continue in the background while the JavaScript thread gets on with other work; the I/O completion will be queued as a job for the JavaScript thread to come back to later.
If you're running a foreground Node app that doesn't need to do anything else while the I/O is pending, you're probably fine using Sync calls. But if you're using Node for processing multiple things at once (like web requests), best to use the async versions.
In a comment you added under the question you've said:
I thought of an electron application with nodejs, which is simply reading a file and did not understand what difference that would make really, if my software really just has to wait for that file to load anyways.
I have virtually no knowledge of Electron, but I note that it uses a "main" process to manage windows and then a "rendering" process per window. That being the case, using Sync functions will block the relevant process, which may affect application or window responsiveness. But I don't have any deep knowledge of Electron (more's the pity).
Until somewhat recently, using async functions meant using lots of callback-heavy code which was hard to compose:
// (Obviously this is just an example, you wouldn't actually read and write a file this way, you'd use streaming...)
fs.readFile("file1.txt", function(err, data) {
    if (err) {
        // Do something about the error...
    } else {
        fs.writeFile("file2.txt", data, function(err) {
            if (err) {
                // Do something about the error...
            } else {
                // All good
            }
        });
    }
});
Then promises came along and if you used a promisified* version of the operation (shown here with pseudonyms like fs.promisifiedXYZ), it still involved callbacks, but they were more composable:
// (See earlier caveat, just an example)
fs.promisifiedReadFile("file1.txt")
    .then(function(data) {
        return fs.promisifiedWriteFile("file2.txt", data);
    })
    .then(function() {
        // All good
    })
    .catch(function(err) {
        // Do something about the error...
    });
Now, in recent versions of Node, you can use the ES2017+ async/await syntax to write synchronous-looking code that is, in fact, asynchronous:
// (See earlier caveat, just an example)
(async () => {
    try {
        const data = await fs.promisifiedReadFile("file1.txt");
        // Without this await, a write error would escape the try/catch
        await fs.promisifiedWriteFile("file2.txt", data);
        // All good
    } catch (err) {
        // Do something about the error...
    }
})();
* Node's API predates promises and has its own conventions. There are various libraries out there to help you "promisify" a Node-style callback API so that it uses promises instead. One is promisify but there are others.
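As one concrete way to get the promisified functions used in the pseudocode above, here is a sketch based on Node's built-in util.promisify (a different thing from the promisify library just mentioned). The copyFile helper and the temporary file names are made up for the example.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const { promisify } = require('node:util');

// Turn Node-style callback functions into functions that return promises.
const readFile = promisify(fs.readFile);
const writeFile = promisify(fs.writeFile);

// (Same caveat as above: for large files you would stream instead.)
async function copyFile(from, to) {
  const data = await readFile(from);
  await writeFile(to, data);
}

// Throwaway files so the sketch runs on its own.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'promisify-demo-'));
const source = path.join(dir, 'file1.txt');
const target = path.join(dir, 'file2.txt');
fs.writeFileSync(source, 'hello');

const copied = copyFile(source, target).then(() => fs.readFileSync(target, 'utf8'));
copied.then((text) => console.log(text));
```

Recent Node versions also ship ready-made promise versions under fs.promises, so wrapping fs by hand is mainly useful for other callback-style APIs.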
So I started a little project in Node.js to learn a bit about it. It's a simple caching proxy for arch linux's package system as node provides most of the heavy lifting.
This has two "main" phases, server setup and serving.
Then serving has two main phases, response setup and response.
The "main" setup involves checking some files, loading some config from files, and loading some JSON from a web address, then launching the HTTP server and proxy instance with this info.
setup logger/options - read config - read mirrors - read webmirror
start serving
Serving involves checking the request to see if the file exists, creating directories if needed, then providing a response.
check request - check dir - check file
proxy request or serve file
I keep referring to them as synchronisation points but searches don't lead to many results. Points where a set of async tasks have to be finished before the process can complete a next step. Perl's AnyEvent has conditional variables which I guess is what I'm trying to do, without the blocking.
To start with, I found I was "cheating" and using the synchronous versions of functions wherever they were provided, but that had to stop with the web requests, so I started restructuring things. Most searches immediately led to using async or step to control the flow. At first I tried lots of series/parallel setups, but I ran into issues: if there were any async calls underneath, the functions would "complete" straight away and the series would finish.
After much wailing and gnashing of teeth, I ended up with a "waiter" function using async.until that tests for some program state to be set by all the tasks finishing before launching the next function.
// wait for "test" to be true, execute "run",
// bail after "count" tries, waiting "sleep" ms between tries;
function waiter( test, run, count, sleep, message ) {
  var i=0;
  async.until(
    function () {
      if ( i > count ) { return true; }
      logger.debug('waiting for',message, test() );
      return test();
    },
    function (callback) {
      i++;
      setTimeout(callback, sleep );
    },
    function (err) {
      if ( i > count ) {
        logger.error('timeout for', message, count*sleep );
        return;
      }
      run();
    }
  );
}
It struck me as being rather large and ugly and requiring a module to implement for something that I thought was standard, so I am wondering what's a better way. Am I still thinking in a non-async way? Is there something simple in Node I have overlooked? Is there a standard way of doing this?
I imagine that with this setup, if the program gets complex, there are going to be a lot of nested functions describing the flow of the program, and I'm struggling to see a good way to lay it all out.
Any tips would be appreciated.
You can't really make everything synchronous. Node.js is designed to work asynchronously (which may of course torment you at times). But there are a few techniques to make it behave in a synchronous-looking way (provided the pseudo-code is well thought out and the code is designed carefully):
Using callbacks
Using events
Using promises
Callbacks and events are easy to use and understand. But with these, sometimes the code can get real messy and hard to debug.
But with promises, you can avoid all that. You can build dependency chains out of promises (for instance, perform Promise B only when Promise A is complete).
Earlier versions of node.js had an implementation of promises. They promised to do some work and then had separate callbacks that would be executed for success and failure, as well as for handling timeouts.
But in later versions, that was removed. Removing them from the node.js core made it possible to build modules with different implementations of promises that sit on top of the core. Some of these are node-promise, futures, and promises.
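To show how promises give the "synchronisation point" the question is after, here is a sketch that assumes a runtime with the standard Promise object and async/await rather than any of the modules named above. The setup tasks readConfig, readMirrors and readWebMirror are hypothetical stand-ins for the real I/O.

```javascript
// Hypothetical setup tasks; each resolves after a short delay to stand in for real I/O.
const delay = (ms, value) => new Promise((resolve) => setTimeout(() => resolve(value), ms));

const readConfig = () => delay(20, { port: 8080 });
const readMirrors = () => delay(10, ['mirror-a', 'mirror-b']);
const readWebMirror = () => delay(30, ['mirror-c']);

async function setup() {
  // Synchronisation point: continue only once all three tasks have finished.
  const [config, mirrors, webMirrors] = await Promise.all([
    readConfig(),
    readMirrors(),
    readWebMirror(),
  ]);
  return { config, mirrors: mirrors.concat(webMirrors) };
}

const ready = setup();
ready.then((state) => {
  // Start serving here.
  console.log('ready', state);
});
```

The three tasks run concurrently and nothing polls or sleeps; the code after Promise.all simply does not run until every task has reported back, which replaces the waiter function from the question.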
See these links for more info:
Framework
Promises and Futures
Deferred Promise - jQuery