I am fetching some persons from an API and then executing them in parallel, how would the code look if i did it in serial?
Not even sure, if below is in parallel, i'm having a hard time figuring out the difference between the two of them.
I guess serial is one by one, and parallel (promise.all) is waiting for all promises to be resolved before it puts the value in finalResult?
Is this understood correct?
Below is a snippet of my code.
Thanks in advance.
const fetch = require('node-fetch')
const URL = "https://swapi.co/api/people/";
async function fetchPerson(url){
const result = await fetch(url);
const data = await result.json().then((data)=>data);
return data ;
}
async function printNames() {
const person1 = await fetchPerson(URL+1);
const person2 = await fetchPerson(URL+2);
let finalResult =await Promise.all([person1.name, person2.name]);
console.log(finalResult);
}
printNames().catch((e)=>{
console.log('There was an error :', e)
});
Let me translate this code into a few different versions. First, the original version, but with extra work removed:
const fetch = require('node-fetch')
const URL = "https://swapi.co/api/people/";
async function fetchPerson(url){
const result = await fetch(url);
return await result.json();
}
async function printNames() {
const person1 = await fetchPerson(URL+1);
const person2 = await fetchPerson(URL+2);
console.log([person1.name, person2.name]);
}
try {
await printNames();
} catch(error) {
console.error(error);
}
The code above is equivalent to the original code you posted. Now to get a better understanding of what's going on here, let's translate this to the exact same code pre-async/await.
const fetch = require('node-fetch')
const URL = "https://swapi.co/api/people/";
function fetchPerson(url){
return fetch(url).then((result) => {
return result.json();
});
}
function printNames() {
let results = [];
return fetchPerson(URL+1).then((person1) => {
results.push(person1.name);
return fetchPerson(URL+2);
}).then((person2) => {
results.push(person2.name);
console.log(results);
});
}
printNames().catch((error) => {
console.error(error);
});
The code above is equivalent to the original code you posted, I just did the extra work the JS translator will do. I feel like this makes it a little more clear. Following the code above, we will do the following:
Call printNames()
printNames() will request person1 and wait for the response.
printNames() will request person2 and wait for the response.
printNames() will print the results
As you can imagine, this can be improved. Let's request both persons at the same time. We can do that with the following code
const fetch = require('node-fetch')
const URL = "https://swapi.co/api/people/";
function fetchPerson(url){
return fetch(url).then((result) => {
return result.json();
});
}
function printNames() {
return Promise.all([fetchPerson(URL+1).then((person1) => person1.name), fetchPerson(URL+2).then((person2) => person2.name)]).then((results) => {
console.log(results);
});
}
printNames().catch((error) => {
console.error(error);
});
This code is NOT equivalent to the original code posted. Instead of performing everything serially, we are now fetching the different users in parallel. Now our code does the following
Call printNames()
printNames() will
Send a request for person1
Send a request for person2
printNames() will wait for a response from both of the requests
printNames() will print the results
The moral of the story is that async/await is not a replacement for Promises in all situations, it is syntactic sugar to make a very specific way of handling Promises easier. If you want/can perform tasks in parallel, don't use async/await on each individual Promise. You can instead do something like the following:
const fetch = require('node-fetch')
const URL = "https://swapi.co/api/people/";
async function fetchPerson(url){
const result = await fetch(url);
return result.json();
}
async function printNames() {
const [ person1, person2 ] = await Promise.all([
fetchPerson(URL+1),
fetchPerson(URL+2),
]);
console.log([person1.name, person2.name]);
}
printNames().catch((error) => {
console.error(error);
});
Disclaimer
printNames() (and everything else) doesn't really wait for anything. It will continue executing any and all code that comes after the I/O that does not appear in a callback for the I/O. await simply creates a callback of the remaining code that is called when the I/O has finished. For example, the following code snippet will not do what you expect.
const results = Promise.all([promise1, promise2]);
console.log(results); // Gets executed immediately, so the Promise returned by Promise.all() is printed, not the results.
Serial vs. Parallel
In regards to my discussion with the OP in the comments, I wanted to also add a description of serial vs. parallel work. I'm not sure how familiar you are with the different concepts, so I'll give a pretty abstraction description of them.
First, I find it prudent to say that JS does not support parallel operations within the same JS environment. Specifically, unlike other languages where I can spin up a thread to perform any work in parallel, JS can only (appear to) perform work in parallel if something else is doing the work (I/O).
That being said, let's start off with a simple description of what serial vs. parallel looks like. Imagine, if you will, doing homework for 4 different classes. The time it takes to do each class's homework can be seen in the table below.
Class | Time
1 | 5
2 | 10
3 | 15
4 | 2
Naturally, the work you do will happen serially and look something like
You_Instance1: doClass1Homework() -> doClass2Homework() -> doClass3Homework() -> doClass4Homework()
Doing the homework serially would take 32 units of time. However, wouldn't be great if you could split yourself into 4 different instances of yourself? If that were the case, you could have an instance of yourself for each of your classes. This might look something like
You_Instance1: doClass1Homework()
You_Instance2: doClass2Homework()
You_Instance3: doClass3Homework()
You_Instance4: doClass4Homework()
Working in parallel, you can now finish your homework in 15 units of time! That's less than half the time.
"But wait," you say, "there has to be some disadvantage to splitting myself into multiple instances to do my homework or everybody would be doing it."
You are correct. There is some overhead to splitting yourself into multiple instances. Let's say that splitting yourself requires deep meditation and an out of body experience, which takes 5 units of time. Now finishing your homework would look something like:
You_Instance1: split() -> split() -> doClass1Homework()
You_Instance2: split() -> doClass2Homework()
You_Instance3: doClass3Homework()
You_Instance4: doClass4Homework()
Now instead of taking 15 units of time, completing your homework takes 25 units of time. This is still cheaper than doing all of your homework by yourself.
Summary (Skip here if you understand serial vs. parallel execution)
This may be a silly example, but this is exactly what serial vs. parallel execution looks like. The main advantage of parallel execution is that you can perform several long running tasks at the same time. Since multiple workers are doing something at the same time, the work gets done faster.
However, there are disadvantages. Two of the big ones are overhead and complexity. Executing code in parallel isn't free, no matter what environment/language you use. In JS, parallel execution can be quite expensive because this is achieved by sending a request to a server. Depending on various factors, the round trip can take 10s to 100s of milliseconds. That is extremely slow for modern computers. This is why parallel execution is usually reserved for long running processes (completing homework) or when it just cannot be avoided (loading data from disk or a server).
The other main disadvantage is the added complexity. Coordinating multiple tasks occurring in parallel can be difficult (Dining Philosophers, Consumer-Producer Problem, Starvation, Race Conditions). But in JS the complexity also comes from understanding the code (Callback Hell, understanding what gets executed when). As mentioned above, a set of instructions occurring after asynchronous code does not wait to execute until the asynchronous code completes (unless it occurs in a callback).
How could I get multiple instances of myself to do my homework in JS?
There are a couple of different ways to accomplish this. One way you could do this is by setting up 4 different servers. Let's call them class1Server, class2Server, class3Server, and class4Server. Now to make these servers do your homework in parallel, you would do something like this:
Promise.all([
startServer1(),
startServer2(),
startServer3(),
startServer4()
]).then(() => {
console.log("Homework done!");
}).catch(() => {
console.error("One of me had trouble completing my homework :(");
});
Promise.all() returns a Promise that either resolves when all of the Promises are resolved or rejects when one of them is rejected.
Both functions fetchPerson and printNames run in serial as you are awaiting the results. The Promise.all use is pointless in your case, since the both persons already have been awaited (resolved).
To fetch two persons in parallel:
const [p1, p2] = await Promise.all([fetchPerson(URL + '1'), fetchPerson(URL + '2')])
Given two async functions:
const foo = async () => {...}
const bar = async () => {...}
This is serial:
const x = await foo()
const y = await bar()
This is parallel:
const [x,y] = await Promise.all([foo(), bar()])
Related
Just now I had a discussion with my team-lead, and I have some doubts about his words, looking for professionals' help.
For example, we have three async functions
const fetchViewers = async () => {
const viewers = await fetch(...);
this.setState({ viewers });
};
const fetchPolls = async () => {
const polls = await fetch(...);
this.setState({ polls });
};
const fetchRegistrants = async () => {
const registrants = await fetch(...);
this.setState({ registrants })
};
And we are invoking them in such order
const init = () => {
fetchViewers();
fetchPolls();
fetchRegistrants();
}
And let's say that fetching viewers takes far more time than two others,
my question, is there any reason to put fetchViewers last?
Since we are not waiting for them to be resolved in the init function, I'm pretty sure that it doesn't matter because it only affects the order it will be put in the stack, and the calls will be made by the DOM.
If it does matter, please explain more detailed why.
Asynchronous functions still run synchronously till the first await. Therefore if you do some long running preparations before the actual asynchronous action, the order does matter. Also if the asynchronous task is accessing a shared ressource (e.g. they are locking the same database, for example) the order could influence how well the tasks can run in parallel (this is outside JS' scope though). In the case given however I can't really see synchronous code / a shared ressource (except for bandwith, but that should hardly matter), so it should not matter. To give an absolute answer, shuffle the calls (6 combinations, so that's not that much work) and measure it.
I want to construct an object from multiple promise results. The only way to do this is something like this:
const allPromises = await Promise.all(asyncResult1, asyncResult2);
allPromises.then([result1, result2] => {
return {result1, result2};
})
A nicer way (especially when more data is being constructed) would be this, with the major drawback of the async requests happening one after the other.
const result = {
result1: await asyncResult1(),
result2: await asyncResult2(),
}
Is there a better way to construct my object with async data I am waiting for?
See my comment, it's a little unclear what asyncResult and asyncResult2 are, because your first code block uses them as though they were variables containing promises (presumably for things already in progress), but your second code block calls them as functions and uses the result (presumably a promise).
If they're variables containing promises
...for processes already underway, then there's no problem with awaiting them individually, since they're already underway:
const result = {
result1: await asyncResult1, // Note these aren't function calls
result2: await asyncResult2,
};
That won't make the second wait on the first, because they're both presumably already underway before this part of the code is reached.
If they're functions
...that you need to call, then yes, to call them both and use the results Promise.all is a good choice. Here's a tweak:
const result = await Promise.all([asyncResult1(), asyncResult2()])
.then(([result1, result2]) => ({result1, result2}));
One of the rare places where using then syntax in an async function isn't necessarily the wrong choice. :-) Alternately, just save the promises to variables and then use the first option above:
const ar1 = asyncResult1();
const ar2 = asyncResult2();
const result = {
result1: await ar1,
result2: await ar2,
};
I'm interested to understand why using await immediately blocks, instead of lazily blocking only once the awaited value is referenced.
I hope this is an illuminating/important question, but I'm worried it may be considered out of scope for this site - if it is I'll apologize, have a little cry, and take the question elsewhere.
E.g. consider the following:
let delayedValue = (value, ms=1000) => {
return new Promise(resolve => {
setTimeout(() => resolve(val), ms);
});
};
(async () => {
let value = await delayedValue('val');
console.log('After await');
})();
In the anonymous async function which immediately runs, we'll only see the console say After await after the delay. Why is this necessary? Considering that we don't need value to be resolved, why haven't the language designers decided to execute the console.log statement immediately in a case like this?
For example, it's unlike the following example where delaying console.log is obviously unavoidable (because the awaited value is referenced):
(async () => {
let value = await delayedValue('val');
console.log('After await: ' + value);
});
I see tons of advantages to lazy await blocking - it could lead to automatic parallelization of unrelated operations. E.g. if I want to read two files and then work with both of them, and I'm not being dilligent, I'll write the following:
(async() => {
let file1 = await readFile('file1dir');
let file2 = await readFile('file2dir');
// ... do things with `file1` and `file2` ...
});
This will wait to have read the 1st file before beginning to read the 2nd. But really they could be read in parallel, and javascript ought to be able to detect this because file1 isn't referenced until later. When I was first learning async/await my initial expectation was that code like the above would result in parallel operations, and I was a bit disappointed when that turned out to be false.
Getting the two files to read in parallel is STILL a bit messy, even in this beautiful world of ES7 because await blocks immediately instead of lazily. You need to do something like the following (which is certainly messier than the above):
(async() => {
let [ read1, read2] = [ readFile('file1dir'), readFile('file2dir') ];
let [ file1, file2 ] = [ await read1, await read2];
// ... do things with `file1` and `file2` ...
});
Why have the language designers chosen to have await block immediately instead of lazily?
E.g. Could it lead to debugging difficulties? Would it be too difficult to integrate lazy awaits into javascript? Are there situations I haven't thought of where lazy awaits lead to messier code, poorer performance, or other negative consequences?
Reason 1: JavaScript is not lazily evaluated. There is no infrastructure to detect "when a value is actually needed".
Reason 2: Implicit parallelism would be very hard to control. What if I wanted my files to be read sequentially, how would I write that? Changing little things in the syntax should not lead to very different evaluation.
Getting the two files to read in parallel is STILL a bit messy
Not at all:
const [file1, file2] = await Promise.all([readFile('file1dir'), readFile('file2dir')]);
You need to do something like the following
No, you absolutely shouldn't write it like that.
I understand that this is a basic question, but I can't figure it out myself, how to export my variable "X" (which is actually a JSON object) out of "for" cycle. I have tried a various ways, but in my case function return not the JSON.object itself, but a "promise.pending".
I guess that someone more expirienced with this will help me out. My code:
for (let i = 0; i < server.length; i++) {
const fetch = require("node-fetch");
const url = ''+(server[i].name)+'';
const getData = async url => {
try {
const response = await fetch(url);
return await response.json();
} catch (error) {
console.log(error);
}
};
getData(url).then(function(result) { //promise.pending w/o .then
let x = result; //here is real JSON that I want to export
});
}
console.log(x); // -element is not exported :(
Here's some cleaner ES6 code you may wish to try:
const fetch = require("node-fetch");
Promise.all(
server.map((srv) => {
const url = String(srv.name);
return fetch(url)
.then((response) => response.json())
.catch((err) => console.log(err));
})
)
.then((results) => {
console.log(results);
})
.catch((err) => {
console.log('total failure!');
console.log(err);
});
How does it work?
Using Array.map, it transforms the list of servers into a list of promises which are executed in parallel. Each promise does two things:
fetch the URL
extract JSON response
If either step fails, that one promise rejects, which will then cause the whole series to reject immediately.
Why do I think this is better than the accepted answer? In a word, it's cleaner. It doesn't mix explicit promises with async/await, which can make asynchronous logic muddier than necessary. It doesn't import the fetch library on every loop iteration. It converts the server URL to a string explicitly, rather than relying on implicit coercion. It doesn't create unnecessary variables, and it avoids the needless for loop.
Whether you accept it or not, I offer it up as another view on the same problem, solved in what I think is a maximally elegant and clear way.
Why is this so hard? Why is async work so counterintuitive?
Doing async work requires being comfortable with something known as "continuation passing style." An asynchronous task is, by definition, non-blocking -- program execution does not wait for the task to complete before moving to the next statement. But we often do async work because subsequent statements require data that is not yet available. Thus, we have the callback function, then the Promise, and now async/await. The first two solve the problem with a mechanism that allows you to provide "packages" of work to do once an asynchronous task is complete -- "continuations," where execution will resume once some condition obtains. There is absolutely no difference between a boring node-style callback function and the .then of a Promise: both accept functions, and both will execute those functions at specific times and with specific data. The key job of the callback function is to act as a receptacle for data about the asynchronous task.
This pattern complicates not only basic variable scoping, which was your main concern, but also the issue of how best to express complicated workflows, which are often a mix of blocking and non-blocking statements. If doing async work requires providing lots of "continuations" in the form of functions, then we know that doing this work will be a constant battle against the proliferation of a million little functions, a million things needing names that must be unique and clear. This is a problem that cannot be solved with a library. It requires adapting one's style to the changed terrain.
The less your feet touch the ground, the better. :)
Javascript builds on the concept of promises. When you ask getData to to do its work, what is says is that, "OK, this is going to take some time, but I promise that I'll let you know after the work is done. So please have faith on my promise, I'll let you know once the work is complete", and it immediately gives you a promise to you.
That's what you see as promise.pending. It's pending because it is not completed yet. Now you should register a certain task (or function) with that promise for getData to call when he completes the work.
function doSomething(){
var promiseArray = [];
for (let i = 0; i < server.length; i++) {
const fetch = require("node-fetch");
const url = ''+(server[i].name)+'';
const getData = async url => {
try {
const response = await fetch(url);
return await response.json();
} catch (error) {
console.log(error);
}
};
promiseArray.push(getData(url)); // keeping track of all promises
}
return Promise.all(promiseArray); //see, I'm not registering anything to promise, I'm passing it to the consumer
}
function successCallback(result) {
console.log("It succeeded with " + result);
}
function failureCallback(error) {
console.log("It failed with " + error);
}
let promise = doSomething(); // do something is the function that does all the logic in that for loop and getData
promise.then(successCallback, failureCallback);
I will appreciate if you help me with the following case:
Given function:
async function getAllProductsData() {
try {
allProductsInfo = await getDataFromUri(cpCampaignsLink);
allProductsInfo = await getCpCampaignsIdsAndNamesData(allProductsInfo);
await Promise.all(allProductsInfo.map(async (item) => {
item.images = await getProductsOfCampaign(item.id);
}));
allProductsInfo = JSON.stringify(allProductsInfo);
console.log(allProductsInfo);
return allProductsInfo;
} catch(err) {
handleErr(err);
}
}
That function is fired when server is started and it gathers campaigns information from other site: gets data(getDataFromUri()), then extracts from data name and id(getCpCampaignsIdsAndNamesData()), then gets products images for each campaign (getProductsOfCampaign());
I have also express.app with following piece of code:
app.get('/products', async (req, res) => {
if (allProductsInfo.length === undefined) {
console.log('Pending...');
allProductsInfo = await getAllProductsData();
}
res.status(200).send(allProductsInfo);
});
Problem description:
Launch server, wait few seconds until getAllProductsData() gets executed, do '/products' GET request, all works great!
Launch server and IMMEDIATELY fire '/products' GET request' (for that purpose I added IF with console.log('Pending...') expression), I get corrupted result back: it contains all campaigns names, ids, but NO images arrays.
[{"id":"1111","name":"Some name"},{"id":"2222","name":"Some other name"}],
while I was expecting
[{"id":"1111","name":"Some name","images":["URI","URI"...]}...]
I will highly appreciate your help about:
Explaining the flow of what is happening with async execution, and why the result is being sent without waiting for images arrays to be added to object?
If you know some useful articles/specs part covering my topic, I will be thankful for.
Thanks.
There's a huge issue with using a global variable allProductsInfo, and then firing multiple concurrent functions that use it asynchronously. This creates race conditions of all kinds, and you have to consider yourself lucky that you got only not images data.
You can easily solve this by making allProductsInfo a local variable, or at least not use it to store the intermediate results from getDataFromUri and getCpCampaignsIdsAndNamesData - use different (local!) variables for those.
However, even if you do that, you're potentially firing getAllProductsData multiple times, which should not lead to errors but is still inefficient. It's much easier to store a promise in the global variable, initialise this once with a single call to the info gathering procedure, and just to await it every time - which won't be noticeable when it's already fulfilled.
async function getAllProductsData() {
const data = await getDataFromUri(cpCampaignsLink);
const allProductsInfo = await getCpCampaignsIdsAndNamesData(allProductsInfo);
await Promise.all(allProductsInfo.map(async (item) => {
item.images = await getProductsOfCampaign(item.id);
}));
console.log(allProductsInfo);
return JSON.stringify(allProductsInfo);
}
const productDataPromise = getAllProductsData();
productDataPromise.catch(handleErr);
app.get('/products', async (req, res) => {
res.status(200).send(await productDataPromise);
});
Of course you might also want to start your server (or add the /products route to it) only after the data is loaded, and simply serve status 500 until then. Also you should consider what happens when the route is hit after the promise is rejected - not sure what handleErr does.