Need help related async functions execution flow (await, promises, node.js)

Need help related async functions execution flow (await, promises, node.js) - javascript

I will appreciate if you help me with the following case:
Given function:
async function getAllProductsData() {
try {
allProductsInfo = await getDataFromUri(cpCampaignsLink);
allProductsInfo = await getCpCampaignsIdsAndNamesData(allProductsInfo);
await Promise.all(allProductsInfo.map(async (item) => {
item.images = await getProductsOfCampaign(item.id);
}));
allProductsInfo = JSON.stringify(allProductsInfo);
console.log(allProductsInfo);
return allProductsInfo;
} catch(err) {
handleErr(err);
}
}
That function is fired when server is started and it gathers campaigns information from other site: gets data(getDataFromUri()), then extracts from data name and id(getCpCampaignsIdsAndNamesData()), then gets products images for each campaign (getProductsOfCampaign());
I have also express.app with following piece of code:
app.get('/products', async (req, res) => {
if (allProductsInfo.length === undefined) {
console.log('Pending...');
allProductsInfo = await getAllProductsData();
}
res.status(200).send(allProductsInfo);
});
Problem description:
Launch server, wait few seconds until getAllProductsData() gets executed, do '/products' GET request, all works great!
Launch server and IMMEDIATELY fire '/products' GET request' (for that purpose I added IF with console.log('Pending...') expression), I get corrupted result back: it contains all campaigns names, ids, but NO images arrays.
[{"id":"1111","name":"Some name"},{"id":"2222","name":"Some other name"}],
while I was expecting
[{"id":"1111","name":"Some name","images":["URI","URI"...]}...]
I will highly appreciate your help about:
Explaining the flow of what is happening with async execution, and why the result is being sent without waiting for images arrays to be added to object?
If you know some useful articles/specs part covering my topic, I will be thankful for.
Thanks.

There's a huge issue with using a global variable allProductsInfo, and then firing multiple concurrent functions that use it asynchronously. This creates race conditions of all kinds, and you have to consider yourself lucky that you got only not images data.
You can easily solve this by making allProductsInfo a local variable, or at least not use it to store the intermediate results from getDataFromUri and getCpCampaignsIdsAndNamesData - use different (local!) variables for those.
However, even if you do that, you're potentially firing getAllProductsData multiple times, which should not lead to errors but is still inefficient. It's much easier to store a promise in the global variable, initialise this once with a single call to the info gathering procedure, and just to await it every time - which won't be noticeable when it's already fulfilled.
async function getAllProductsData() {
const data = await getDataFromUri(cpCampaignsLink);
const allProductsInfo = await getCpCampaignsIdsAndNamesData(allProductsInfo);
await Promise.all(allProductsInfo.map(async (item) => {
item.images = await getProductsOfCampaign(item.id);
}));
console.log(allProductsInfo);
return JSON.stringify(allProductsInfo);
}
const productDataPromise = getAllProductsData();
productDataPromise.catch(handleErr);
app.get('/products', async (req, res) => {
res.status(200).send(await productDataPromise);
});
Of course you might also want to start your server (or add the /products route to it) only after the data is loaded, and simply serve status 500 until then. Also you should consider what happens when the route is hit after the promise is rejected - not sure what handleErr does.

Related

Cannot modify a WriteBatch that has been committed in environment nodejs

I am trying to batch commit but it does not work as expected here is my code
exports.test = functions.https.onRequest(async (req, res) => {
const firestore = admin.firestore();
const batch = firestore.batch();
try {
let snapUser = await admin.firestore().collection("users").get();
const batch = firestore.batch();
snapUser.docs.forEach(async (doc) => {
let snapObject = await doc.ref.collection("objects").get();
if (!objectSnapShot.empty) {
objectSnapShot.docs.forEach((doc) => {
batch.set(
doc.ref,
{
category: doc.data().category,
},
{
merge: true,
}
);
});
await batch.commit()
}
});
res.status(200).send("success");
} catch (error) {
functions.logger.log("Error while deleting account", error);
}
});
Am I doing something wrong ?
Any help would be appreciated

The await batch.commit() runs in every iteration of the loop but you can only commit a batch once. You must commit it after the execution of the loop.
snapUser.docs.forEach(async (doc) => {
// add operations to batch
});
await batch.commit()
Also, it might be better to use for loop when using async-await instead of a forEach() loop. Checkout Using async/await with a forEach loop for more information.

You can receive this errors in cases where transactions have been committed and or
you are using the same batch object to perform several commits. You can go ahead and create a new batch object for every commit, if necessary. You may try to start small and break things down to really understand the concepts.
It seems the WriteBatch is being closed before asynchronous operations try to write to it.
A batch process needs to have two parts, a main function and a callback from each batch.
The issue is you are trying to redeclare the batch process once it's been committed for the first batch.
Your main function is an async which will continue to run all methods asynchronously, ignoring any internal functions. You must make sure that all jobs being applied to the batch are added before attempting to commit, at the core this is a race condition.
You can handle these in multiple ways, such as wrapping it with a promise chain or creating a callback.
Check the following documentation for Managing Data Transactions.

Javascript. Order of execution of asynchronous functions

Just now I had a discussion with my team-lead, and I have some doubts about his words, looking for professionals' help.
For example, we have three async functions
const fetchViewers = async () => {
const viewers = await fetch(...);
this.setState({ viewers });
};
const fetchPolls = async () => {
const polls = await fetch(...);
this.setState({ polls });
};
const fetchRegistrants = async () => {
const registrants = await fetch(...);
this.setState({ registrants })
};
And we are invoking them in such order
const init = () => {
fetchViewers();
fetchPolls();
fetchRegistrants();
}
And let's say that fetching viewers takes far more time than two others,
my question, is there any reason to put fetchViewers last?
Since we are not waiting for them to be resolved in the init function, I'm pretty sure that it doesn't matter because it only affects the order it will be put in the stack, and the calls will be made by the DOM.
If it does matter, please explain more detailed why.

Asynchronous functions still run synchronously till the first await. Therefore if you do some long running preparations before the actual asynchronous action, the order does matter. Also if the asynchronous task is accessing a shared ressource (e.g. they are locking the same database, for example) the order could influence how well the tasks can run in parallel (this is outside JS' scope though). In the case given however I can't really see synchronous code / a shared ressource (except for bandwith, but that should hardly matter), so it should not matter. To give an absolute answer, shuffle the calls (6 combinations, so that's not that much work) and measure it.

A specific confusion on writing an async function. Can anyone validate?

I am very new to NodeJS. I was trying to get a function written that can simply return a configuration value from DB. I might need to call it multiple times.
In PHP or other synchronous languages, I would use the following code for it
function getConfigValue($configKeyName)
{
// DB READ OPERATIONS
return $confguration_value_fetched_from_db
}
getConfigValue("key1");
getConfigValue("key2");
etc
But in NodeJS, I found it too difficult to do this operation because of the Asynchronous nature of the code. After asking some questions here, and after spending hours to learn Callbacks, Promises, Async/await keywords, being a beginner the below is best code I could reach.
// Below function defines the 'get' function
var get = async (key) => {
var result = await COLLECTIONNAME.findOne({key}); //MongoDB in backend
return result.value;
}
// Here I am forced to define another async function so that I can await for the get function.
function async anotherfunction()
{
var value_I_am_lookingfor1 = await get("key1");
var value_I_am_lookingfor2 = await get("key2");
}
anotherfunction();
While it might work, I am not fully happy with the result, mainly because I really don't want to do all my further coding based on the fetched value within this function anotherfunction(). All I want is to fetch a single value? Also I might need to easily call it from many places within the application, not just from here (I was planning to place it in a module)
Any better or easier methods? Or should I always get the value I want, and then nest it with a 'then.' to do the subsequent operation? I even doubt the fundamental approach I take on NodeJS coding itself may be wrong.
Can anyone guide me?

mainly because I really don't want to do all my further coding based on the fetched value within this function anotherfunction(). All I want is to fetch a single value?
Because the request is asynchronous, and your code depends on having the fetched values first, there's no option other than to wait for the values to be retrieved before continuing. Somewhere in the script, control flow needs to halt until the values are retrieved before other parts of the script continue.
Also I might need to easily call it from many places within the application, not just from here (I was planning to place it in a module)
You should have that module make the requests and export a Promise that resolves to the values needed. Rather than using await (which forces requests to be processed in serial), you should probably use Promise.all, which will allow multiple requests to be sent out at once. For example:
valuegetter.js
const get = key => COLLECTIONNAME.findOne({ key }).then(res => res.value);
export default Promise.all([
get('key1'),
get('key2')
]);
main.js
import prom from './valuegetter.js';
prom.then(([val1, val2]) => {
// do stuff with val1 and val2
})
.catch((err) => {
// handle errors
});
If other modules need val1 and val2, call them from main.js with the values they need.

Difference between Serial and parallel

I am fetching some persons from an API and then executing them in parallel, how would the code look if i did it in serial?
Not even sure, if below is in parallel, i'm having a hard time figuring out the difference between the two of them.
I guess serial is one by one, and parallel (promise.all) is waiting for all promises to be resolved before it puts the value in finalResult?
Is this understood correct?
Below is a snippet of my code.
Thanks in advance.
const fetch = require('node-fetch')
const URL = "https://swapi.co/api/people/";
async function fetchPerson(url){
const result = await fetch(url);
const data = await result.json().then((data)=>data);
return data ;
}
async function printNames() {
const person1 = await fetchPerson(URL+1);
const person2 = await fetchPerson(URL+2);
let finalResult =await Promise.all([person1.name, person2.name]);
console.log(finalResult);
}
printNames().catch((e)=>{
console.log('There was an error :', e)
});

Let me translate this code into a few different versions. First, the original version, but with extra work removed:
const fetch = require('node-fetch')
const URL = "https://swapi.co/api/people/";
async function fetchPerson(url){
const result = await fetch(url);
return await result.json();
}
async function printNames() {
const person1 = await fetchPerson(URL+1);
const person2 = await fetchPerson(URL+2);
console.log([person1.name, person2.name]);
}
try {
await printNames();
} catch(error) {
console.error(error);
}
The code above is equivalent to the original code you posted. Now to get a better understanding of what's going on here, let's translate this to the exact same code pre-async/await.
const fetch = require('node-fetch')
const URL = "https://swapi.co/api/people/";
function fetchPerson(url){
return fetch(url).then((result) => {
return result.json();
});
}
function printNames() {
let results = [];
return fetchPerson(URL+1).then((person1) => {
results.push(person1.name);
return fetchPerson(URL+2);
}).then((person2) => {
results.push(person2.name);
console.log(results);
});
}
printNames().catch((error) => {
console.error(error);
});
The code above is equivalent to the original code you posted, I just did the extra work the JS translator will do. I feel like this makes it a little more clear. Following the code above, we will do the following:
Call printNames()
printNames() will request person1 and wait for the response.
printNames() will request person2 and wait for the response.
printNames() will print the results
As you can imagine, this can be improved. Let's request both persons at the same time. We can do that with the following code
const fetch = require('node-fetch')
const URL = "https://swapi.co/api/people/";
function fetchPerson(url){
return fetch(url).then((result) => {
return result.json();
});
}
function printNames() {
return Promise.all([fetchPerson(URL+1).then((person1) => person1.name), fetchPerson(URL+2).then((person2) => person2.name)]).then((results) => {
console.log(results);
});
}
printNames().catch((error) => {
console.error(error);
});
This code is NOT equivalent to the original code posted. Instead of performing everything serially, we are now fetching the different users in parallel. Now our code does the following
Call printNames()
printNames() will
Send a request for person1
Send a request for person2
printNames() will wait for a response from both of the requests
printNames() will print the results
The moral of the story is that async/await is not a replacement for Promises in all situations, it is syntactic sugar to make a very specific way of handling Promises easier. If you want/can perform tasks in parallel, don't use async/await on each individual Promise. You can instead do something like the following:
const fetch = require('node-fetch')
const URL = "https://swapi.co/api/people/";
async function fetchPerson(url){
const result = await fetch(url);
return result.json();
}
async function printNames() {
const [ person1, person2 ] = await Promise.all([
fetchPerson(URL+1),
fetchPerson(URL+2),
]);
console.log([person1.name, person2.name]);
}
printNames().catch((error) => {
console.error(error);
});
Disclaimer
printNames() (and everything else) doesn't really wait for anything. It will continue executing any and all code that comes after the I/O that does not appear in a callback for the I/O. await simply creates a callback of the remaining code that is called when the I/O has finished. For example, the following code snippet will not do what you expect.
const results = Promise.all([promise1, promise2]);
console.log(results); // Gets executed immediately, so the Promise returned by Promise.all() is printed, not the results.
Serial vs. Parallel
In regards to my discussion with the OP in the comments, I wanted to also add a description of serial vs. parallel work. I'm not sure how familiar you are with the different concepts, so I'll give a pretty abstraction description of them.
First, I find it prudent to say that JS does not support parallel operations within the same JS environment. Specifically, unlike other languages where I can spin up a thread to perform any work in parallel, JS can only (appear to) perform work in parallel if something else is doing the work (I/O).
That being said, let's start off with a simple description of what serial vs. parallel looks like. Imagine, if you will, doing homework for 4 different classes. The time it takes to do each class's homework can be seen in the table below.
Class | Time
1 | 5
2 | 10
3 | 15
4 | 2
Naturally, the work you do will happen serially and look something like
You_Instance1: doClass1Homework() -> doClass2Homework() -> doClass3Homework() -> doClass4Homework()
Doing the homework serially would take 32 units of time. However, wouldn't be great if you could split yourself into 4 different instances of yourself? If that were the case, you could have an instance of yourself for each of your classes. This might look something like
You_Instance1: doClass1Homework()
You_Instance2: doClass2Homework()
You_Instance3: doClass3Homework()
You_Instance4: doClass4Homework()
Working in parallel, you can now finish your homework in 15 units of time! That's less than half the time.
"But wait," you say, "there has to be some disadvantage to splitting myself into multiple instances to do my homework or everybody would be doing it."
You are correct. There is some overhead to splitting yourself into multiple instances. Let's say that splitting yourself requires deep meditation and an out of body experience, which takes 5 units of time. Now finishing your homework would look something like:
You_Instance1: split() -> split() -> doClass1Homework()
You_Instance2: split() -> doClass2Homework()
You_Instance3: doClass3Homework()
You_Instance4: doClass4Homework()
Now instead of taking 15 units of time, completing your homework takes 25 units of time. This is still cheaper than doing all of your homework by yourself.
Summary (Skip here if you understand serial vs. parallel execution)
This may be a silly example, but this is exactly what serial vs. parallel execution looks like. The main advantage of parallel execution is that you can perform several long running tasks at the same time. Since multiple workers are doing something at the same time, the work gets done faster.
However, there are disadvantages. Two of the big ones are overhead and complexity. Executing code in parallel isn't free, no matter what environment/language you use. In JS, parallel execution can be quite expensive because this is achieved by sending a request to a server. Depending on various factors, the round trip can take 10s to 100s of milliseconds. That is extremely slow for modern computers. This is why parallel execution is usually reserved for long running processes (completing homework) or when it just cannot be avoided (loading data from disk or a server).
The other main disadvantage is the added complexity. Coordinating multiple tasks occurring in parallel can be difficult (Dining Philosophers, Consumer-Producer Problem, Starvation, Race Conditions). But in JS the complexity also comes from understanding the code (Callback Hell, understanding what gets executed when). As mentioned above, a set of instructions occurring after asynchronous code does not wait to execute until the asynchronous code completes (unless it occurs in a callback).
How could I get multiple instances of myself to do my homework in JS?
There are a couple of different ways to accomplish this. One way you could do this is by setting up 4 different servers. Let's call them class1Server, class2Server, class3Server, and class4Server. Now to make these servers do your homework in parallel, you would do something like this:
Promise.all([
startServer1(),
startServer2(),
startServer3(),
startServer4()
]).then(() => {
console.log("Homework done!");
}).catch(() => {
console.error("One of me had trouble completing my homework :(");
});
Promise.all() returns a Promise that either resolves when all of the Promises are resolved or rejects when one of them is rejected.

Both functions fetchPerson and printNames run in serial as you are awaiting the results. The Promise.all use is pointless in your case, since the both persons already have been awaited (resolved).
To fetch two persons in parallel:
const [p1, p2] = await Promise.all([fetchPerson(URL + '1'), fetchPerson(URL + '2')])
Given two async functions:
const foo = async () => {...}
const bar = async () => {...}
This is serial:
const x = await foo()
const y = await bar()
This is parallel:
const [x,y] = await Promise.all([foo(), bar()])

Move object/variable away from async function

I understand that this is a basic question, but I can't figure it out myself, how to export my variable "X" (which is actually a JSON object) out of "for" cycle. I have tried a various ways, but in my case function return not the JSON.object itself, but a "promise.pending".
I guess that someone more expirienced with this will help me out. My code:
for (let i = 0; i < server.length; i++) {
const fetch = require("node-fetch");
const url = ''+(server[i].name)+'';
const getData = async url => {
try {
const response = await fetch(url);
return await response.json();
} catch (error) {
console.log(error);
}
};
getData(url).then(function(result) { //promise.pending w/o .then
let x = result; //here is real JSON that I want to export
});
}
console.log(x); // -element is not exported :(

Here's some cleaner ES6 code you may wish to try:
const fetch = require("node-fetch");
Promise.all(
server.map((srv) => {
const url = String(srv.name);
return fetch(url)
.then((response) => response.json())
.catch((err) => console.log(err));
})
)
.then((results) => {
console.log(results);
})
.catch((err) => {
console.log('total failure!');
console.log(err);
});
How does it work?
Using Array.map, it transforms the list of servers into a list of promises which are executed in parallel. Each promise does two things:
fetch the URL
extract JSON response
If either step fails, that one promise rejects, which will then cause the whole series to reject immediately.
Why do I think this is better than the accepted answer? In a word, it's cleaner. It doesn't mix explicit promises with async/await, which can make asynchronous logic muddier than necessary. It doesn't import the fetch library on every loop iteration. It converts the server URL to a string explicitly, rather than relying on implicit coercion. It doesn't create unnecessary variables, and it avoids the needless for loop.
Whether you accept it or not, I offer it up as another view on the same problem, solved in what I think is a maximally elegant and clear way.
Why is this so hard? Why is async work so counterintuitive?
Doing async work requires being comfortable with something known as "continuation passing style." An asynchronous task is, by definition, non-blocking -- program execution does not wait for the task to complete before moving to the next statement. But we often do async work because subsequent statements require data that is not yet available. Thus, we have the callback function, then the Promise, and now async/await. The first two solve the problem with a mechanism that allows you to provide "packages" of work to do once an asynchronous task is complete -- "continuations," where execution will resume once some condition obtains. There is absolutely no difference between a boring node-style callback function and the .then of a Promise: both accept functions, and both will execute those functions at specific times and with specific data. The key job of the callback function is to act as a receptacle for data about the asynchronous task.
This pattern complicates not only basic variable scoping, which was your main concern, but also the issue of how best to express complicated workflows, which are often a mix of blocking and non-blocking statements. If doing async work requires providing lots of "continuations" in the form of functions, then we know that doing this work will be a constant battle against the proliferation of a million little functions, a million things needing names that must be unique and clear. This is a problem that cannot be solved with a library. It requires adapting one's style to the changed terrain.
The less your feet touch the ground, the better. :)

Javascript builds on the concept of promises. When you ask getData to to do its work, what is says is that, "OK, this is going to take some time, but I promise that I'll let you know after the work is done. So please have faith on my promise, I'll let you know once the work is complete", and it immediately gives you a promise to you.
That's what you see as promise.pending. It's pending because it is not completed yet. Now you should register a certain task (or function) with that promise for getData to call when he completes the work.
function doSomething(){
var promiseArray = [];
for (let i = 0; i < server.length; i++) {
const fetch = require("node-fetch");
const url = ''+(server[i].name)+'';
const getData = async url => {
try {
const response = await fetch(url);
return await response.json();
} catch (error) {
console.log(error);
}
};
promiseArray.push(getData(url)); // keeping track of all promises
}
return Promise.all(promiseArray); //see, I'm not registering anything to promise, I'm passing it to the consumer
}
function successCallback(result) {
console.log("It succeeded with " + result);
}
function failureCallback(error) {
console.log("It failed with " + error);
}
let promise = doSomething(); // do something is the function that does all the logic in that for loop and getData
promise.then(successCallback, failureCallback);

Develop Reference

JavaScript is the programming language of the Web.

Need help related async functions execution flow (await, promises, node.js) - javascript

Related

Cannot modify a WriteBatch that has been committed in environment nodejs

Javascript. Order of execution of asynchronous functions

A specific confusion on writing an async function. Can anyone validate?

Difference between Serial and parallel

Move object/variable away from async function

Categories

Resources