I have a piece of code that executes a long list of HTTP requests, written in such a way that there are always 4 requests running in parallel. It just so happens that the server handles 4 parallel requests the fastest: with fewer the code would be slower, and with more the individual requests would take longer to finish. Anyway, here is the code:
const itemsToRemove = items.filter(
  // ...
)

const removeItem = (item: Item) => item && // first it checks if item isn't undefined
  // Then it creates a DELETE request
  itemsApi.remove(item).then(
    // And then whenever a request finishes,
    // it adds the next request to the queue.
    // This ensures that there will always
    // be 4 requests running in parallel.
    () => removeItem(itemsToRemove.shift())
  )

// I start with a chunk of the first 4 items.
const firstChunk = itemsToRemove.splice(0, 4)
await Promise.allSettled(
  firstChunk.map(removeItem)
)
Now the problem with this code is that if the list is very long (as in thousands of items), at some point the browser tab just crashes. That's rather unhelpful, because I never get a specific error message telling me what went wrong.
But my guess is that this part of the code:
itemsApi.remove(item).then(
() => removeItem(itemsToRemove.shift())
)
may be creating a Maximum call stack size exceeded issue? Because in a way I'm constantly adding to the call stack, aren't I?
Do you think my guess is correct? And regardless of if your answer is yes or no, do you have an idea how I could achieve the same goal without crashing the browser tab? Can I refactor this code in a way that doesn't add to the call stack? (If I'm indeed doing that?)
The issue with your code is in:
await Promise.allSettled(firstChunk.map(removeItem))
The argument passed to Promise.allSettled needs to be an array of Promises as per the documentation:
The Promise.allSettled() method returns a promise that fulfills after all of the given promises have either fulfilled or rejected, with an array of objects that each describes the outcome of each promise.
Your recursive function then runs all of the requests one after the other, eventually throwing a Maximum call stack size exceeded error and crashing your browser.
The solution I came up with (it could probably be shortened) is like so:
let items = []
while (items.length < 20) {
  items = [...items, `item-${items.length + 1}`]
}

// A mockup of the API function that executes an asynchronous task and resolves with the item
async function itemsApi(item) {
  return new Promise((resolve) => setTimeout(() => resolve(item), 1000))
}

async function f(items) {
  const itemsToRemove = items
  // call the itemsApi; the returned promise settles after the itemsApi function finishes
  const removeItem = (item) => item && itemsApi(item)

  // Recursive function that removes a chunk of 4 items after the previous chunk has been removed
  function removeChunk(chunk) {
    // exit the function once there are no more items to process
    if (chunk.length === 0) return
    console.log(itemsToRemove)
    // after the first 4 requests finish, keep making a new chunk of 4 requests until the itemsToRemove array is empty
    Promise.allSettled(chunk.map(removeItem))
      .then(() => removeChunk(itemsToRemove.splice(0, 4)))
  }

  const firstChunk = itemsToRemove.splice(0, 4)
  // initiate the recursive function
  removeChunk(firstChunk)
}

f(items)
I hope this answers your question.
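For what it's worth, the same goal can also be reached without any recursion at all by using a small worker pool: start 4 workers that each keep pulling the next item from the shared array until it is empty. A minimal sketch under that idea (removeAll and removeOne are made-up names; removeOne stands in for a promise-returning call like itemsApi.remove):

```javascript
// A minimal worker-pool sketch: `concurrency` workers each pull the next
// item from the shared array until it is drained. No recursion, so the
// call stack stays flat no matter how long the list is.
async function removeAll(itemsToRemove, removeOne, concurrency = 4) {
  const worker = async () => {
    // each worker loops until the shared array is empty
    while (itemsToRemove.length > 0) {
      const item = itemsToRemove.shift();
      try {
        await removeOne(item);
      } catch (e) {
        // swallow individual failures so one bad request
        // doesn't stop the whole pool; adjust as needed
      }
    }
  };
  // start `concurrency` workers and wait for all of them to finish
  await Promise.all(Array.from({ length: concurrency }, worker));
}
```

Because each worker is an async loop, every await hands control back to the event loop, so nothing accumulates on the call stack regardless of list length.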
Related
I have an array that I am mapping over; inside the map callback I call an asynchronous function that performs a request and returns a promise, using request-promise.
I expect the first item of the array to be mapped, the request to be performed, and then the second item to repeat the same process. But that's not what is happening.
This is my function:
const fn = async () => {
  const array = [0, 1, 2];
  console.log('begin');
  await Promise.all(array.map(item => anAsyncFunction(item)));
  console.log('finished');
  return;
}
anAsyncFunction is as follows:
const anAsyncFunction = async item => {
  console.log(`looping ${item}`);
  const awaitingRequest = await functionWithPromise(item);
  console.log(`finished looping ${item}`);
  return awaitingRequest;
}
And functionWithPromise, where the request is made:
const functionWithPromise = async (item) => {
  console.log(`performing request for ${item}`);
  return Promise.resolve(await request(`https://www.google.com/`).then(() => {
    console.log(`finished performing request for ${item}`);
    return item;
  }));
}
From the console logs I get:
begin
looping 0
performing request for 0
looping 1
performing request for 1
looping 2
performing request for 2
finished performing request for 0
finished looping 0
finished performing request for 1
finished looping 1
finished performing request for 2
finished looping 2
finished
However, what I want is
begin
looping 0
performing request for 0
finished performing request for 0
finished looping 0
looping 1
performing request for 1
finished performing request for 1
finished looping 1
looping 2
performing request for 2
finished performing request for 2
finished looping 2
finished
I'd normally be fine with this pattern, but I seem to be getting an invalid body from the request call, as I may be making too many requests at once.
Is there a better method for what I am trying to achieve?
.map() is not async or promise aware. It just dutifully takes the value you return from its callback and stuffs it in the result array. Even though it's a promise in your case, it still just keeps on going, not waiting for that promise. And, there is nothing you can do in that regard to change the .map() behavior. That's just the way it works.
Instead, use a for loop and then await your async function inside the loop and that will suspend the loop.
Your structure:
await Promise.all(array.map(item => anAsyncFunction(item)));
is running all the anAsyncFunction() calls in parallel and then waiting for all of them to finish.
To run them sequentially, use a for loop and await the individual function call:
const fn = async () => {
  const array = [0, 1, 2];
  console.log('begin');
  for (let item of array) {
    await anAsyncFunction(item);
  }
  console.log('finished');
  return;
}
It's important to know that none of the array iteration methods are async-aware. That includes .map(), .filter(), .forEach(), etc. So, if you want to await something inside the loop in order to sequence your async operations, use a regular for loop, which is async-aware and will pause at each await.
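If you also need the mapped results, which a bare for loop discards unless you collect them, a sequential, async-aware map is only a few lines. A sketch; mapSeries is a made-up name:

```javascript
// A minimal async-aware sequential map: awaits each callback before
// starting the next one, and collects the resolved values in order.
async function mapSeries(array, asyncFn) {
  const results = [];
  for (const item of array) {
    // suspends here until the previous call has resolved
    results.push(await asyncFn(item));
  }
  return results;
}
```

With the question's code, `const data = await mapSeries(array, anAsyncFunction);` would both sequence the requests and give back the responses in order.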
You can try replacing this line:
await Promise.all(array.map(item => anAsyncFunction(item)));
with:
await Promise.all(array.map(async(item) => await anAsyncFunction(item)));
This should work in a more Node.js-style way than the for-loop alternative; only forEach is off-limits in this case.
I'm making requests to an API, but their server only allows a certain number of active connections, so I would like to limit the number of ongoing fetches. For my purposes, a fetch is only completed (not ongoing) when the HTTP response body arrives at the client.
I would like to create an abstraction like this:
const fetchLimiter = new FetchLimiter(maxConnections);
fetchLimiter.fetch(url, options); // Returns the same thing as fetch()
This would make things a lot simpler, but there seems to be no way of knowing when a stream being used by other code ends, because streams are locked while they are being read. It is possible to use ReadableStream.tee() to split the stream into two, use one and return the other to the caller (possibly also constructing a Response with it), but this would degrade performance, right?
Since fetch uses promises, you can take advantage of that to make a simple queue system.
This is a method I've used before for queuing promise-based stuff. It enqueues items by creating a Promise and adding its resolver function to an array. Until that Promise resolves, the await keeps the fetch that follows it from starting.
And all we have to do to start the next fetch when one finishes is just grab the next resolver and invoke it. The promise resolves, and then the fetch starts!
Best part, since we don't actually consume the fetch result, there's no worries about having to clone or anything...we just pass it on intact, so that you can consume it in a later then or something.
*Edit: since the body is still streaming after the fetch promise resolves, I added a third option so that you can pass in the body type, and have FetchLimiter retrieve and parse the body for you.
These all return a promise that is eventually resolved with the actual content.
That way you can just have FetchLimiter parse the body for you. I made it so it would return an array of [response, data], that way you can still check things like the response code, headers, etc.
For that matter, you could even pass in a callback or something to use in that await if you needed to do something more complex, like different methods of parsing the body depending on response code.
Example
I added comments to indicate where the FetchLimiter code begins and ends...the rest is just demo code.
It's using a fake fetch built on a setTimeout, which will resolve after between 0.5 and 1.5 seconds. It will start the first three requests immediately, then the actives will be full and it will wait for one to resolve.
When that happens, you'll see the comment that the promise has resolved, then the next promise in the queue will start, and then you'll see the then from in the for loop resolve. I added that then just so you could see the order of events.
(function() {
  const fetch = (resource, init) => new Promise((resolve, reject) => {
    console.log('starting ' + resource);
    setTimeout(() => {
      console.log(' - resolving ' + resource);
      resolve(resource);
    }, 500 + 1000 * Math.random());
  });

  function FetchLimiter() {
    this.queue = [];
    this.active = 0;
    this.maxActive = 3;
    this.fetchFn = fetch;
  }

  FetchLimiter.prototype.fetch = async function(resource, init, respType) {
    // if at max active, enqueue the next request by adding a promise
    // ahead of it, and putting the resolver in the "queue" array.
    if (this.active >= this.maxActive) {
      await new Promise(resolve => {
        this.queue.push(resolve); // push adds to the end of the array
      });
    }
    this.active++; // increment active once we're about to start the fetch
    const resp = await this.fetchFn(resource, init);
    let data;
    if (['arrayBuffer', 'blob', 'json', 'text', 'formData'].indexOf(respType) >= 0)
      data = await resp[respType]();
    this.active--; // decrement active once fetch is done
    this.checkQueue(); // time to start the next fetch from the queue
    return [resp, data]; // return value from fetch
  };

  FetchLimiter.prototype.checkQueue = function() {
    if (this.active < this.maxActive && this.queue.length) {
      // shift pulls from the start of the array, giving first in, first out.
      const next = this.queue.shift();
      next('resolved'); // resolve the promise; the value doesn't matter
    }
  }

  const limiter = new FetchLimiter();
  for (let i = 0; i < 9; i++) {
    limiter.fetch('/mypage/' + i)
      .then(x => console.log(' - .then ' + x));
  }
})();
Caveats:
I'm not 100% sure whether the body is still streaming when the promise resolves; that seems to be a concern for you. However, if that's a problem you could use one of the Body mixin methods like blob, text or json, which don't resolve until the body content is completely parsed (see here).
I intentionally kept it very short (about 15 lines of actual code) as a simple proof of concept. You'd want to add some error handling in production code, so that if a fetch rejects because of a connection error or something, you still decrement the active counter and start the next fetch.
Of course it's also using async/await syntax, because it's so much easier to read. If you need to support older browsers, you'd want to rewrite with Promises or transpile with babel or equivalent.
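For that second caveat, a try/finally around the body of fetch is enough to keep the active counter honest when a request rejects. A sketch along the lines of the answer's code (the fields and method names match the snippet above, but the constructor is trimmed to take the fetch function and limit as parameters):

```javascript
// Sketch: the same FetchLimiter, with try/finally in fetch() so the
// active counter is decremented and the queue restarted even when the
// request (or body parsing) rejects.
function FetchLimiter(fetchFn, maxActive = 3) {
  this.queue = [];
  this.active = 0;
  this.maxActive = maxActive;
  this.fetchFn = fetchFn;
}

FetchLimiter.prototype.checkQueue = function() {
  if (this.active < this.maxActive && this.queue.length) {
    this.queue.shift()(); // resolve the oldest queued promise
  }
};

FetchLimiter.prototype.fetch = async function(resource, init, respType) {
  if (this.active >= this.maxActive) {
    await new Promise(resolve => this.queue.push(resolve));
  }
  this.active++;
  try {
    const resp = await this.fetchFn(resource, init);
    let data;
    if (['arrayBuffer', 'blob', 'json', 'text', 'formData'].includes(respType))
      data = await resp[respType]();
    return [resp, data];
  } finally {
    // runs on success and failure alike, so a rejected fetch
    // can never leak an "active" slot or stall the queue
    this.active--;
    this.checkQueue();
  }
};
```

The finally block runs whether the try returns or throws, so the counter and queue stay consistent on every code path.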
This code snippet is taken from google developers:
// map some URLs to json-promises
const jsonPromises = urls.map(async url => {
const response = await fetch(url);
return response.json();
});
I understand that at the end jsonPromises is going to be an array of Promises. However, I'm not completely sure how that happens.
According to my understanding, when execution gets to the line const response = await fetch(url); it moves it from the call stack to the Web API, does the same for return response.json();, and moves on to the next url from urls. Is that correct? I understand how the event loop works with a simple setTimeout, but this example confuses me a lot.
Here's what happens:
1) The .map function runs and executes the callback (the async function) for each element. The callback calls fetch, which starts some under-the-hood magic inside the engine and returns a Promise. The await is reached, which halts the function's execution. Each call made by .map evaluates to a promise; .map collects all those promises, stores them in an array and returns the array.
2) At some point, one of the fetches running under the hood has established a connection and calls back into JS, resolving the fetch promise. That causes the async function call to continue. response.json() gets called, which causes the engine to collect all the packets from the connection and parse them as JSON under the hood. That again returns a Promise, which gets awaited, which causes execution to stop again.
3) At some point, one of the connections ends and the whole response is available as JSON. The promise resolves, the async function call continues execution and returns, which causes the promise (the one in the array) to be resolved.
Since async functions return a promise, think of the example as any other promise-returning function:
// map happens synchronously and doesn't wait for anything
const map = (arr, cb) => {
  const mapped = [];
  for (const element of arr) {
    mapped.push(cb(element))
  }
  return mapped;
}

// just return a promise that resolves in 5 seconds with the uppercased string
const promiseFn = url => {
  return new Promise(resolve => setTimeout(() => resolve(url.toUpperCase()), 5000))
}

const promiseArray = map(['url', 'url', 'url'], promiseFn)
All the promises returned by the async callback are pushed synchronously. Their states are modified asynchronously.
If on the other hand you're looking for the difference between setTimeout and a Promise vis-a-vis the event loop, check the accepted answer here: What is the relationship between event loop and Promise
I'm implementing a query engine that mass fetches and processes requests. I am using async/await.
Right now the flow of execution runs in a hierarchy: there is a list of items containing queries, and each of those queries has a fetch.
What I am trying to do is bundle the items in groups of n, so that even if each of them has m queries with fetches inside, only n*m requests run simultaneously; in particular, only one simultaneous request will be made to the same domain.
The problem is that when I await the execution of the items (at the outer level, in a while loop that groups items and should stop iterating until the promises resolve), those promises resolve as soon as the execution of an inner query is deferred by the inner await of the fetch.
That causes my queuing while loop to stop only momentarily, instead of waiting for the inner promises to resolve.
This is the outer, queuing class:
class AsyncItemQueue {
  constructor(items, concurrency) {
    this.items = items;
    this.concurrency = concurrency;
  }

  run = async () => {
    let itemPromises = [];
    const bundles = Math.ceil(this.items.length / this.concurrency);
    let currentBundle = 0;
    while (currentBundle < bundles) {
      console.log(`<--------- FETCHING ITEM BUNDLE ${currentBundle} OF ${bundles} --------->`);
      const lowerRange = currentBundle * this.concurrency;
      const upperRange = (currentBundle + 1) * this.concurrency;
      itemPromises.push(
        this.items.slice(lowerRange, upperRange).map(item => item.run())
      );
      await Promise.all(itemPromises);
      currentBundle++;
    }
  };
}

export default AsyncItemQueue;
This is the simple item class that queue is running. I'm omitting superfluous code.
class Item {
  // ...
  run = async () => {
    console.log('Item RUN', this, this.name);
    return await Promise.all(this.queries.map(query => {
      const itemPromise = query.run(this.name);
      return itemPromise;
    }));
  }
}
And this is the queries contained inside items. Every item has a list of queries. Again, some code is removed as it's not interesting.
class Query {
  // ...
  run = async (item) => {
    // Step 1: If requisites, await.
    if (this.requires) {
      await this.savedData[this.requires];
    }
    // Step 2: Resolve URL.
    this.resolveUrl(item);
    // Step 3: If provides, create promise in savedData.
    const fetchPromise = this.fetch();
    if (this.saveData) {
      this.saveData.forEach(sd => (this.savedData[sd] = fetchPromise));
    }
    // Step 4: Fetch.
    const document = await fetchPromise;
    // ...
  }
}
The while loop in AsyncItemQueue is stopping correctly, but only until the execution flow reaches step 3 in Query. As soon as it reaches that fetch, which is a wrapper around the standard fetch function, the outer promise resolves, and I end up with all the requests being performed at the same time.
I suspect the problem is somewhere in the Query class, but I am stumped as to how to avoid the resolution of the outer promise.
I tried making the Query class run function return document, just in case, but to no avail.
Any idea or guidance would be greatly appreciated. I'll try to answer any questions about the code or provide more if needed.
Thanks!
PS: Here is a codesandbox with a working example: https://codesandbox.io/s/goofy-tesla-iwzem
As you can see in the console output, the while loop is iterating before the fetches finish, and they are all being performed at the same time.
I've solved it.
The problem was in the AsyncItemQueue class. Specifically:
itemPromises.push(
  this.items.slice(lowerRange, upperRange).map(item => item.run())
);
That was pushing whole arrays of promises into the list, and so, later on:
await Promise.all(itemPromises);
did not find any promises to await in that list (because it contained nested arrays, with the promises inside).
The solution was to change the code to:
await Promise.all(this.items.slice(lowerRange, upperRange).map(item => item.run()));
Now it is working perfectly. Items are being run in batches of n, and a new batch will not run until the previous one has finished.
I'm not sure this will help anyone but me, but I'll leave it here in case somebody finds a similar problem someday. Thanks for the help.
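The underlying gotcha is that Promise.all does not recurse into nested arrays: an element that is itself an array of promises counts as an already settled value. A small sketch demonstrating this, and showing that .flat() would have been an alternative fix for the accumulating version (slow and demo are made-up names):

```javascript
// Demonstration that Promise.all does not recurse into nested arrays:
// an element that is itself an array of promises counts as an already
// settled value, so the inner promises are not awaited.
const slow = (ms, v) => new Promise(r => setTimeout(() => r(v), ms));

async function demo() {
  const nested = [];
  nested.push([slow(50, 'a'), slow(50, 'b')]); // an array of promises, as in the bug
  await Promise.all(nested);  // resolves immediately, inner promises still pending
  // .flat() turns [[p1, p2]] into [p1, p2], which Promise.all does await
  return Promise.all(nested.flat());
}
```

So `await Promise.all(itemPromises.flat())` in the original loop would also have waited for every inner promise.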
Take the following loop:
for (var i = 0; i < 100; ++i) {
  let result = await some_slow_async_function();
  do_something_with_result();
}
Does await block the loop? Or does the i continue to be incremented while awaiting?
Is the order of do_something_with_result() guaranteed sequential with regard to i? Or does it depend on how fast the awaited function is for each i?
Does await block the loop? Or does the i continue to be incremented while awaiting?
"Block" is not the right word, but yes: i does not continue to be incremented while awaiting. Instead, execution jumps back to where the async function was called, providing a promise as the return value, and continues with the code that follows the call until the call stack is empty. Then, when the awaiting is over, the state of the function is restored and execution continues inside it. Whenever that function returns (completes), the corresponding promise that was returned earlier is resolved.
Is the order of do_something_with_result() guaranteed sequential with regard to i? Or does it depend on how fast the awaited function is for each i?
The order is guaranteed. The code following the await is also guaranteed to execute only after the call stack has been emptied, i.e. at the earliest in the next microtask.
See how the output is in this snippet. Note especially where it says "after calling test":
async function test() {
  for (let i = 0; i < 2; i++) {
    console.log('Before await for ', i);
    let result = await Promise.resolve(i);
    console.log('After await. Value is ', result);
  }
}

test().then(_ => console.log('After test() resolved'));
console.log('After calling test');
As #realbart says, it does block the loop, which makes the calls sequential.
If you want to trigger a ton of awaitable operations and then handle them all together, you could do something like this:
const promisesToAwait = [];
for (let i = 0; i < 100; i++) {
  promisesToAwait.push(fetchDataForId(i));
}
const responses = await Promise.all(promisesToAwait);
You can test async/await inside a "FOR LOOP" like this:
(async () => {
  for (let i = 0; i < 100; i++) {
    await delay();
    console.log(i);
  }
})();

function delay() {
  return new Promise((resolve, reject) => {
    setTimeout(resolve, 100);
  });
}
async functions return a Promise, which is an object that will eventually "resolve" to a value, or "reject" with an error. The await keyword means to wait until this value (or error) has been finalized.
So from the perspective of the running function, it blocks waiting for the result of the slow async function. The JavaScript engine, on the other hand, sees that this function is blocked waiting for the result, so it will go check the event loop (i.e. new mouse clicks, or connection requests, etc.) to see if there is anything else it can work on until the result is returned.
Note, however, that if the slow async function is slow because it is computing lots of stuff in your JavaScript code, the engine won't have many resources left to do other stuff (and doing other stuff would likely make the slow async function even slower). Where the benefit of async functions really shines is for I/O-intensive operations like querying a database or transmitting a large file, where the engine is well and truly waiting on something else (i.e. database, filesystem, etc.).
The following two bits of code are functionally equivalent:
let result = await some_slow_async_function();
and
let promise = some_slow_async_function(); // start the slow async function
// you could do other stuff here while the slow async function is running
let result = await promise; // wait for the final value from the slow async function
In the second example above the slow async function is called without the await keyword, so it starts executing the function and returns a promise. Then you can do other things (if you have other things to do). Then the await keyword is used to block until the promise actually "resolves". So from the perspective of the for loop it will run synchronously.
So:
yes, the await keyword has the effect of blocking the running function until the async function either "resolves" with a value or "rejects" with an error, but it does not block the javascript engine, which can still do other things if it has other things to do while awaiting
yes, the execution of the loop will be sequential
There is an awesome tutorial about all this at http://javascript.info/async.
No, the event loop isn't blocked; see the example below:
function sayHelloAfterSomeTime(ms) {
  return new Promise((resolve, reject) => {
    if (typeof ms !== 'number') return reject('ms must be a number')
    setTimeout(() => {
      console.log('Hello after ' + ms / 1000 + ' second(s)')
      resolve()
    }, ms)
  })
}

async function awaitGo(ms) {
  await sayHelloAfterSomeTime(ms).catch(e => console.log(e))
  console.log('after awaiting for saying Hello, I can do other things ...')
}

function notAwaitGo(ms) {
  sayHelloAfterSomeTime(ms).catch(e => console.log(e))
  console.log("I don't wait for saying Hello ...")
}

awaitGo(1000)
notAwaitGo(1000)
console.log('hey, I am the event loop and I am not blocked ...')
Here is my test solution about this interesting question:
import crypto from "crypto";

function diyCrypto() {
  return new Promise((resolve, reject) => {
    crypto.pbkdf2('secret', 'salt', 2000000, 64, 'sha512', (err, res) => {
      if (err) {
        reject(err)
        return
      }
      resolve(res.toString("base64"))
    })
  })
}

setTimeout(async () => {
  console.log("before await...")
  const a = await diyCrypto();
  console.log("after await...", a)
}, 0);

setInterval(() => {
  console.log("test....")
}, 200);
Inside the setTimeout's callback the await blocks execution. But the setInterval keeps running, so the event loop is running as usual.
Let me clarify a bit because some answers here have some wrong information about how Promise execution works, specifically when related to the event loop.
In the case of the example, await will block the loop. do_something_with_result() will not be called until await finishes its scheduled job.
https://developer.mozilla.org/en-US/docs/Learn/JavaScript/Asynchronous/Async_await#handling_asyncawait_slowdown
As for the other points: Promise "jobs" run before the next event loop cycle, as microtasks. When you call Promise.then() or the resolve() function inside new Promise((resolve) => {}), you are creating a Job. Both await and async are wrappers, of sorts, for Promise, and both create Jobs. Microtasks are meant to run before the next event loop cycle, so adding a Promise Job means more work before the engine can move on to the next event loop cycle.
Here's an example how you can lock up your event loop because your promises (Jobs) take too long.
let tick = 0;
let time = performance.now();
setTimeout(() => console.log('Hi from timeout'), 0);
const tock = () => console.log(tick++);
const longTask = async () => {
  console.log('begin task');
  for (let i = 0; i < 1_000_000_000; i++) {
    Math.sqrt(i);
  }
  console.log('done task');
}
requestAnimationFrame(() => console.log('next frame after', performance.now() - time, 'ms'));

async function run() {
  await tock();
  await tock();
  await longTask(); // Will stall your UI
  await tock();     // Will execute even though it's already dropped frames
  await tock();     // This will execute too
}
run();
// Promise.resolve().then(tock).then(tock).then(longTask).then(tock).then(tock);
In this sample, 5 total promises are created: 2 calls for tock, 1 for longTask, and then 2 more calls for tock. All 5 will run before the next event loop cycle.
The execution would be:
Start JS execution
Execute normal script
Run 5 scheduled Promise jobs
End JS execution
Event Loop Cycle Start
Request Animation Frame fire
Timeout fire
The last commented line schedules the same work without async/await and produces the same result.
Basically, you will stall the next event loop cycle unless you tell your JS execution where it can suspend. Your Promise jobs will continue to run in the current event loop turn until the call stack is empty. When you call something external (like fetch), it likely lets the call stack end and uses a callback that resolves the pending Promise. Like this:
function waitForClick() {
  return new Promise((resolve) => {
    // Use an event as a callback;
    button.onclick = () => resolve();
    // Let the call stack finish by implicitly not returning anything,
    // or explicitly returning `undefined` (same thing).
    // return undefined;
  })
}
If you have a long job that you want to complete, either use a Web Worker to run it without pausing, or insert some pauses with something like setTimeout() or setImmediate().
Reshaping the longTask function, you can do something like this:
const longTask = async () => {
  console.log('begin task');
  for (let i = 0; i < 1_000_000_000; i++) {
    if (i && i % 10_000_000 === 0) {
      await new Promise((r) => setTimeout(r, 0));
    }
    Math.sqrt(i);
  }
  console.log('done task');
}
Basically, instead of doing 1 billion iterations in one shot, you only do 10 million and then wait for the next event (setTimeout) before running the next batch. The downside is that it's slower, because of how often you hand control back to the event loop. Instead, you can use requestIdleCallback(), which is better, but still not as good as multi-threading via Web Workers.
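The yielding trick can be factored into a reusable helper. A sketch; yieldToEventLoop and chunkedTask are made-up names, and setTimeout(r, 0) works in both browsers and Node:

```javascript
// Cooperative-yield helper: resolves on the next timer tick, giving the
// event loop a chance to run timers, I/O callbacks and rendering.
const yieldToEventLoop = () => new Promise((r) => setTimeout(r, 0));

// The same chunked loop as above, with the pause factored out.
async function chunkedTask(total = 1_000_000_000, chunk = 10_000_000) {
  let acc = 0;
  for (let i = 0; i < total; i++) {
    // yield once per chunk so pending events can run in between
    if (i && i % chunk === 0) await yieldToEventLoop();
    acc += Math.sqrt(i);
  }
  return acc;
}
```

Tuning the chunk size trades responsiveness against total runtime: smaller chunks yield more often but pay the timer overhead more often.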
But be aware that just slapping await or Promise.resolve().then() around a function won't help the event loop. Both wait until the function returns with either a Promise or a value before yielding to the event loop. You can mostly test this by checking whether the function you're calling returns an unresolved Promise immediately.
Does await block the loop? Or does the i continue to be incremented while awaiting?
No, await won't block the looping. Yes, i continues to be incremented while looping.
Is the order of do_something_with_result() guaranteed sequential with regard to i? Or does it depend on how fast the awaited function is for each i?
The order of do_something_with_result() is guaranteed sequentially, but not with regard to i. It depends on how fast the awaited function runs.
All calls to some_slow_async_function() are batched: if do_something_with_result() were a console log, we would see it printed the number of times the loop runs, and only after that would the await calls produce their output.
To better understand you can run below code snippet:
async function someFunction() {
  for (let i = 0; i < 5; i++) {
    await callAPI();
    console.log('After', i, 'th API call');
  }
  console.log("All API calls got executed");
}

function callAPI() {
  setTimeout(() => {
    console.log("I was called at: " + new Date().getTime())
  }, 1000);
}

someFunction();
One can clearly see how the line console.log('After', i, 'th API call'); gets printed for the entire stretch of the for loop first, and only then, when all other code has executed, do we get the results from callAPI().
So if the lines after await depended on the result obtained from the await calls, they would not work as expected.
To conclude, await in a for loop does not ensure successful operation on results obtained from await calls that take some time to finish.
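Note that the callAPI in the snippet above never returns a promise, so the await has nothing to wait on. If it returned one, the loop would run strictly in sequence. A sketch:

```javascript
// callAPI rewritten to return a promise that resolves when the timer
// fires; now `await callAPI()` actually suspends the loop.
function callAPI() {
  return new Promise((resolve) => {
    setTimeout(() => resolve(new Date().getTime()), 100);
  });
}

async function someFunction() {
  const times = [];
  for (let i = 0; i < 3; i++) {
    // each call now finishes before the next one starts
    times.push(await callAPI());
  }
  return times;
}
```

With this version, the timestamps come out roughly 100 ms apart, confirming the sequential behavior described in the accepted answer.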
In Node, one can also achieve this by using the neo-async library with waterfall.