How do I keep Promises.all from hanging my process? - javascript

When I await Promises.all, my unit-test process (npm with Mocha) never exits, but when I await individual promises, it does.
async function f(){...}
async function getVals1() {
const vals = await Promise.all([f(), f()]);
return vals;
}
async function getVals2() {
const vals = []
vals.push(await f());
vals.push(await f());
return vals;
}
Important points:
The values are all sccessfully returned from await Promises.all. They are the expected values (i.e., not themselves Promises), the same as with individual awaits. Everything works fine, but the npm does not exit.
The problem does not always occur, but I don't know what, if anything, conditions the problem.
The application uses MongoDB and non-closure of MongoDB connections can cause applications to hang. But I did check that MongoDB connections are closed (at the end of the test run, long after this await call; and in any case the difference between await Promise.all and individual await is hard to explain.

This was caused by multiple opening of MongoDB clients, so that afterwards, while some were closed, some were not. A lock was needed around mongoDbClient = await MongoClient.connect(...).

It seems the problem is that one or more of your functions is not asynchronous. But without code it hard to understand

Related

Javascript: Need help understanding async/await with mongooseODM

const deleteTaskCount = async(id) => {
const task = Task.findByIdAndDelete(id);
const count = await Task.countDocument({completed:false})
return count;
}
deleteTaskCount("1234").then((count)=>{
console.log(count);
})
I understood that with await in const task = Task.findByIdAndDelete(id); it deletes the user with the id.But without the await keyword(like in the above function) the function still happens, that is it deletes the user but in a non-blocking way after some time.When i ran the above code the count is showing properly but the const task = Task.findByIdAndDelete(id); is not deleting the user.
You should use await for the Task.findByIdAndDelete(id) as well. When you use await, NodeJS will go do some other things until that resolves, and it will only then continue to execute the function (it will not block your code at all).
const deleteTaskCount = async(id) => {
const task = await Task.findByIdAndDelete(id);
const count = await Task.countDocument({completed:false})
return count;
}
deleteTaskCount("1234").then((count)=>{
console.log(count);
})
I don't know much about the mongoose API but I suspect mongoose.query behaves similar to mongo cursors and many other asynchronous database drivers/ORMs. Queries often don't execute unless awaited or chained upon.
In any case, without awaiting Task.findByIdAndDelete, there's no guarantee in concurrency between the two queries. If you want the result of countDocument to reflect the changes in findByIdAndDelete, you must wait on the client-side for that query to complete and then run countDocument.
You would achieve this by adding the await keyword before Task.findByIdAndDelete.
I'd presonally recommend using the promises/thenables as opposed to await/async.
Task.findByIdAndDelete("1234")
.then(task => Task.countDocument({completed:false}))
.then(count => console.log(count));
Imho, it's cleaner, less error prone, more functional, less confusing. YMMV

async resolve() needs to be wrapped?

Why do I need to wrap resolve() with meaningless async function in node 10.16.0, but not in chrome? Is this node.js bug?
let shoot = async () => console.log('there shouldn\'t be race condition');
(async () => {
let c = 3;
while(c--) {
// Works also in node 10.16.0
// console.log(await new Promise(resolve => shoot = async (...args) => resolve(...args)));
// Works is chrome, but not in node 10.16.0?
console.log(await new Promise(resolve => shoot = resolve));
};
})();
(async () => {
await shoot(1);
await shoot(2);
await shoot(3);
})();
resolve() is not async
And calling resolve() (via shoot()) does not immediately triggers the related await (in the loop) - but instead queues up the event. Adding async/await gives chance to the event loop to wake up and consume the queue. In chrome await alone is enough and in node await needs to be coupled with actual async function. This kind of synchronization of tasks in not reliable and there are chances of calling the same resolve() twice.
This is an example of what NOT to do in javascript.
It’s a Node 10 (possible) bug or (probable) outdated behaviour* in the implementation of promises. According to the ECMAScript spec at the time of this writing, in
await shoot(1);
shoot(1) fulfills the promise created with new Promise(), which enqueues a job for each reaction in that promise’s fulfill reactions
await undefined (what shoot(1) returns) enqueues a job to continue after this statement, because undefined is converted to a fulfilled promise
The reaction in the promise’s fulfill reactions corresponding to the await in the first IIFE was added by PerformPromiseThen and it doesn’t involve any other jobs; it just continues inside that IIFE immediately.
In short, the next shoot = resolve should always run before execution continues after await shoot(n). The Node 12/current Chrome result is correct.
Normally, you shouldn’t come across this type of bug anyway: as I mentioned in the comments, relying on operations creating specific numbers of jobs/taking specific numbers of microticks for synchronization is bad design. If you wanted a sort of stream where each shoot() call always produces a loop iteration (even without the misleading await), something like this would be better:
let available;
(async () => {
let queue = new Queue();
while (true) {
await new Promise(resolve => {
available = value => {
resolve();
queue.enqueue(value);
};
});
available = null; // just an assertion, pretty much
while (!queue.isEmpty()) {
let value = queue.dequeue();
// process value
}
}
})();
shoot(1);
shoot(2);
shoot(3);
with an appropriate queue implementation. (Then you could look to async iterators to make consuming the queue neat.)
* not sure of the exact history here. fairly certain the ES spec used to reference microtasks, but they’re jobs now. current stable firefox matches node 10. await may take less time than it used to. this kind of thing is the reason for the following advice.
Your code is relying on a somewhat obscure timing issue involving await of a constant (a non-promise). That timing issue apparently does not behave the same in the two environments you have tested.
Each await shoot(n) is really just doing await resolve(n) which is not awaiting a promise. It's awaiting undefined since resolve() has no return value.
So, you're apparently seeing an implementation difference in event loop and promise implementation when you await a non-promise. You are apparently expecting await resolve() to somehow be asynchronous and allow your while() loop to run before running the next await shoot(n), but I'm not aware of a language requirement to do that and even if there is, it's an implementation detail that you probably should not write code that relies on.
I think it's basically just bad code design that relies on micro-details of scheduling of two jobs that are enqueued at about the same time. It's always safer to write the code in a way that enforces the proper sequencing rather than relying on micro-details of scheduler implementation - even if these details are in the specification and certainly if they are not.
node.js is perhaps more optimized or buggy (I don't know which) to not go back to the event loop when doing await on a constant. Or, if it does go back to the event loop, it goes in a prioritized fashion that keeps the current chain of code executing rather than letting other promises go next. In any case, for this code to work, it has to rely on some await someConstant behavior that isn't the same everywhere.
Wrapping resolve() forces the interpreter to go back to the event loop after each await shoot(n) because it is actually now awaiting a promise which gives the while() loop a chance to run and fill shoot with a new value before the next shoot(n) is called.

Difference between Serial and parallel

I am fetching some persons from an API and then executing them in parallel, how would the code look if i did it in serial?
Not even sure, if below is in parallel, i'm having a hard time figuring out the difference between the two of them.
I guess serial is one by one, and parallel (promise.all) is waiting for all promises to be resolved before it puts the value in finalResult?
Is this understood correct?
Below is a snippet of my code.
Thanks in advance.
const fetch = require('node-fetch')
const URL = "https://swapi.co/api/people/";
async function fetchPerson(url){
const result = await fetch(url);
const data = await result.json().then((data)=>data);
return data ;
}
async function printNames() {
const person1 = await fetchPerson(URL+1);
const person2 = await fetchPerson(URL+2);
let finalResult =await Promise.all([person1.name, person2.name]);
console.log(finalResult);
}
printNames().catch((e)=>{
console.log('There was an error :', e)
});
Let me translate this code into a few different versions. First, the original version, but with extra work removed:
const fetch = require('node-fetch')
const URL = "https://swapi.co/api/people/";
async function fetchPerson(url){
const result = await fetch(url);
return await result.json();
}
async function printNames() {
const person1 = await fetchPerson(URL+1);
const person2 = await fetchPerson(URL+2);
console.log([person1.name, person2.name]);
}
try {
await printNames();
} catch(error) {
console.error(error);
}
The code above is equivalent to the original code you posted. Now to get a better understanding of what's going on here, let's translate this to the exact same code pre-async/await.
const fetch = require('node-fetch')
const URL = "https://swapi.co/api/people/";
function fetchPerson(url){
return fetch(url).then((result) => {
return result.json();
});
}
function printNames() {
let results = [];
return fetchPerson(URL+1).then((person1) => {
results.push(person1.name);
return fetchPerson(URL+2);
}).then((person2) => {
results.push(person2.name);
console.log(results);
});
}
printNames().catch((error) => {
console.error(error);
});
The code above is equivalent to the original code you posted, I just did the extra work the JS translator will do. I feel like this makes it a little more clear. Following the code above, we will do the following:
Call printNames()
printNames() will request person1 and wait for the response.
printNames() will request person2 and wait for the response.
printNames() will print the results
As you can imagine, this can be improved. Let's request both persons at the same time. We can do that with the following code
const fetch = require('node-fetch')
const URL = "https://swapi.co/api/people/";
function fetchPerson(url){
return fetch(url).then((result) => {
return result.json();
});
}
function printNames() {
return Promise.all([fetchPerson(URL+1).then((person1) => person1.name), fetchPerson(URL+2).then((person2) => person2.name)]).then((results) => {
console.log(results);
});
}
printNames().catch((error) => {
console.error(error);
});
This code is NOT equivalent to the original code posted. Instead of performing everything serially, we are now fetching the different users in parallel. Now our code does the following
Call printNames()
printNames() will
Send a request for person1
Send a request for person2
printNames() will wait for a response from both of the requests
printNames() will print the results
The moral of the story is that async/await is not a replacement for Promises in all situations, it is syntactic sugar to make a very specific way of handling Promises easier. If you want/can perform tasks in parallel, don't use async/await on each individual Promise. You can instead do something like the following:
const fetch = require('node-fetch')
const URL = "https://swapi.co/api/people/";
async function fetchPerson(url){
const result = await fetch(url);
return result.json();
}
async function printNames() {
const [ person1, person2 ] = await Promise.all([
fetchPerson(URL+1),
fetchPerson(URL+2),
]);
console.log([person1.name, person2.name]);
}
printNames().catch((error) => {
console.error(error);
});
Disclaimer
printNames() (and everything else) doesn't really wait for anything. It will continue executing any and all code that comes after the I/O that does not appear in a callback for the I/O. await simply creates a callback of the remaining code that is called when the I/O has finished. For example, the following code snippet will not do what you expect.
const results = Promise.all([promise1, promise2]);
console.log(results); // Gets executed immediately, so the Promise returned by Promise.all() is printed, not the results.
Serial vs. Parallel
In regards to my discussion with the OP in the comments, I wanted to also add a description of serial vs. parallel work. I'm not sure how familiar you are with the different concepts, so I'll give a pretty abstraction description of them.
First, I find it prudent to say that JS does not support parallel operations within the same JS environment. Specifically, unlike other languages where I can spin up a thread to perform any work in parallel, JS can only (appear to) perform work in parallel if something else is doing the work (I/O).
That being said, let's start off with a simple description of what serial vs. parallel looks like. Imagine, if you will, doing homework for 4 different classes. The time it takes to do each class's homework can be seen in the table below.
Class | Time
1 | 5
2 | 10
3 | 15
4 | 2
Naturally, the work you do will happen serially and look something like
You_Instance1: doClass1Homework() -> doClass2Homework() -> doClass3Homework() -> doClass4Homework()
Doing the homework serially would take 32 units of time. However, wouldn't be great if you could split yourself into 4 different instances of yourself? If that were the case, you could have an instance of yourself for each of your classes. This might look something like
You_Instance1: doClass1Homework()
You_Instance2: doClass2Homework()
You_Instance3: doClass3Homework()
You_Instance4: doClass4Homework()
Working in parallel, you can now finish your homework in 15 units of time! That's less than half the time.
"But wait," you say, "there has to be some disadvantage to splitting myself into multiple instances to do my homework or everybody would be doing it."
You are correct. There is some overhead to splitting yourself into multiple instances. Let's say that splitting yourself requires deep meditation and an out of body experience, which takes 5 units of time. Now finishing your homework would look something like:
You_Instance1: split() -> split() -> doClass1Homework()
You_Instance2: split() -> doClass2Homework()
You_Instance3: doClass3Homework()
You_Instance4: doClass4Homework()
Now instead of taking 15 units of time, completing your homework takes 25 units of time. This is still cheaper than doing all of your homework by yourself.
Summary (Skip here if you understand serial vs. parallel execution)
This may be a silly example, but this is exactly what serial vs. parallel execution looks like. The main advantage of parallel execution is that you can perform several long running tasks at the same time. Since multiple workers are doing something at the same time, the work gets done faster.
However, there are disadvantages. Two of the big ones are overhead and complexity. Executing code in parallel isn't free, no matter what environment/language you use. In JS, parallel execution can be quite expensive because this is achieved by sending a request to a server. Depending on various factors, the round trip can take 10s to 100s of milliseconds. That is extremely slow for modern computers. This is why parallel execution is usually reserved for long running processes (completing homework) or when it just cannot be avoided (loading data from disk or a server).
The other main disadvantage is the added complexity. Coordinating multiple tasks occurring in parallel can be difficult (Dining Philosophers, Consumer-Producer Problem, Starvation, Race Conditions). But in JS the complexity also comes from understanding the code (Callback Hell, understanding what gets executed when). As mentioned above, a set of instructions occurring after asynchronous code does not wait to execute until the asynchronous code completes (unless it occurs in a callback).
How could I get multiple instances of myself to do my homework in JS?
There are a couple of different ways to accomplish this. One way you could do this is by setting up 4 different servers. Let's call them class1Server, class2Server, class3Server, and class4Server. Now to make these servers do your homework in parallel, you would do something like this:
Promise.all([
startServer1(),
startServer2(),
startServer3(),
startServer4()
]).then(() => {
console.log("Homework done!");
}).catch(() => {
console.error("One of me had trouble completing my homework :(");
});
Promise.all() returns a Promise that either resolves when all of the Promises are resolved or rejects when one of them is rejected.
Both functions fetchPerson and printNames run in serial as you are awaiting the results. The Promise.all use is pointless in your case, since the both persons already have been awaited (resolved).
To fetch two persons in parallel:
const [p1, p2] = await Promise.all([fetchPerson(URL + '1'), fetchPerson(URL + '2')])
Given two async functions:
const foo = async () => {...}
const bar = async () => {...}
This is serial:
const x = await foo()
const y = await bar()
This is parallel:
const [x,y] = await Promise.all([foo(), bar()])

JS basic design when I have naturaly sync code but I need to call a few async functions

Could you please help me to understand javascirpt async hell?
I think I am missing something important ☹ The thing is that js examples and most of the answers on the internet are related to just one part of code – a small snippet. But applications are much more complicated.
I am not going write it directly in JS since I am more interested of the design and how to write it PROPERLY.
Imagine these functions in my application:
InsertTestData();
SelectDataFromDB_1(‘USERS’);
SelectDataFromDB_2(‘USER_CARS’,’USERS’);
FillCollections(‘USER’,’USER_CARS’);
DoTheWork();
DeleteData();
I did not provide any description for the functions but I think it is obvious based on names. They need to go in THIS SPECIFIC ORDER. Imagine that I need to run a select into the db to get USERS and then I need run a select to get USER_CARS for these USERS. So it must be really in this order (consider the same for other functions). The thing is that need to call 6 times Node/Mysql which is async but I need results in specific order. So how can I PROPERLY make that happen?
This could work:
/* not valid code I want to present the idea and keep it short */
InsertTestData(
Mysql.query(select, data, function(err,success)
{
SelectDataFromDB_1(‘USERS’); -- in that asyn function I will call the next procedure
}
));
SelectDataFromDB_1 (
Mysql.query(select, data, function(err,success)
{
SelectDataFromDB_2(‘USERS’); -- in that asyn function I will call the next procedure
}
));
SelectDataFromDB_2 (
Mysql.query(select, data, function(err,success)
{
FillCollections (‘USERS’); -- in that asyn function I will call the next procedure
}
));
etc..
I can “easily” chain it but this looks as a mess. I mean really mess.
I can use some scheduler/timmers to schedule and testing if the previous procedure is done)
Both of them are mess.
So, what is the proper way to do this;
Thank you,
AZOR
If you're using a recent version of Node, you can use ES2017's async/await syntax to have synchronous-looking code that's really asynchronous.
First, you need a version of Mysql.query that returns a promise, which is easily done with util.promisify:
const util = require('util');
// ...
const query = util.promisify(Mysql.query);
Then, wrap your code in an async function and use await:
(async () => {
try {
await InsertTestData();
await SelectDataFromDB_1(‘USERS’);
await SelectDataFromDB_2(‘USER_CARS’,’USERS’);
await FillCollections(‘USER’,’USER_CARS’);
await DoTheWork();
await DeleteData();
} catch (e) {
// Handle the fact an error occurred...
}
})();
...where your functions are async functions, e.g.:
async InsertTestData() {
await query("INSERT INTO ...");
}
Note the try/catch in the async wrapper, it's essential to handle errors, because otherwise if an error occurs you'll get an unhandled rejection notice (and future versions of Node may well terminate the process). (Why "unhandled rejections"? Because async functions are syntactic sugar for promises; an async function returns a promise.) You can either do that with the try/catch shown, or alternate by using .catch on the result of calling it:
(async () => {
await InsertTestData();
// ...
})().catch(e => {
// Handle the fact an error occurred
});

Need help related async functions execution flow (await, promises, node.js)

I will appreciate if you help me with the following case:
Given function:
async function getAllProductsData() {
try {
allProductsInfo = await getDataFromUri(cpCampaignsLink);
allProductsInfo = await getCpCampaignsIdsAndNamesData(allProductsInfo);
await Promise.all(allProductsInfo.map(async (item) => {
item.images = await getProductsOfCampaign(item.id);
}));
allProductsInfo = JSON.stringify(allProductsInfo);
console.log(allProductsInfo);
return allProductsInfo;
} catch(err) {
handleErr(err);
}
}
That function is fired when server is started and it gathers campaigns information from other site: gets data(getDataFromUri()), then extracts from data name and id(getCpCampaignsIdsAndNamesData()), then gets products images for each campaign (getProductsOfCampaign());
I have also express.app with following piece of code:
app.get('/products', async (req, res) => {
if (allProductsInfo.length === undefined) {
console.log('Pending...');
allProductsInfo = await getAllProductsData();
}
res.status(200).send(allProductsInfo);
});
Problem description:
Launch server, wait few seconds until getAllProductsData() gets executed, do '/products' GET request, all works great!
Launch server and IMMEDIATELY fire '/products' GET request' (for that purpose I added IF with console.log('Pending...') expression), I get corrupted result back: it contains all campaigns names, ids, but NO images arrays.
[{"id":"1111","name":"Some name"},{"id":"2222","name":"Some other name"}],
while I was expecting
[{"id":"1111","name":"Some name","images":["URI","URI"...]}...]
I will highly appreciate your help about:
Explaining the flow of what is happening with async execution, and why the result is being sent without waiting for images arrays to be added to object?
If you know some useful articles/specs part covering my topic, I will be thankful for.
Thanks.
There's a huge issue with using a global variable allProductsInfo, and then firing multiple concurrent functions that use it asynchronously. This creates race conditions of all kinds, and you have to consider yourself lucky that you got only not images data.
You can easily solve this by making allProductsInfo a local variable, or at least not use it to store the intermediate results from getDataFromUri and getCpCampaignsIdsAndNamesData - use different (local!) variables for those.
However, even if you do that, you're potentially firing getAllProductsData multiple times, which should not lead to errors but is still inefficient. It's much easier to store a promise in the global variable, initialise this once with a single call to the info gathering procedure, and just to await it every time - which won't be noticeable when it's already fulfilled.
async function getAllProductsData() {
const data = await getDataFromUri(cpCampaignsLink);
const allProductsInfo = await getCpCampaignsIdsAndNamesData(allProductsInfo);
await Promise.all(allProductsInfo.map(async (item) => {
item.images = await getProductsOfCampaign(item.id);
}));
console.log(allProductsInfo);
return JSON.stringify(allProductsInfo);
}
const productDataPromise = getAllProductsData();
productDataPromise.catch(handleErr);
app.get('/products', async (req, res) => {
res.status(200).send(await productDataPromise);
});
Of course you might also want to start your server (or add the /products route to it) only after the data is loaded, and simply serve status 500 until then. Also you should consider what happens when the route is hit after the promise is rejected - not sure what handleErr does.

Categories

Resources