How to yield value multiple times from function? - javascript

So what I am doing is: I have 2 files. One contains a script which generates a token, and the second file handles that token.
The problem is that the second script, which logs the token, only ever logs the first token received.
This is how I am handling the token:
const first_file = require("./first_file.js");
first_file.first_file().then((res) => {
console.log(res);
});
And clearly that doesn't work, because it's not getting updated with the newer value.
first_file = async () => {
  return new Promise(async (resolve, reject) => {
    //Generating the token
    (async () => {
      while (true) {
        console.log("Resolving...");
        resolve(token);
        await sleep(5000);
        resolved_token = token;
      }
    })();
  });
};
module.exports = { first_file };
What I am doing here is trying a while loop so that I keep resolving the token. But it didn't work. Is there any way I can export the variable directly so the task would be easier?

If I understand your question correctly, you want to resolve a promise multiple times, and it has nothing to do with modules...
But you have misunderstood something about promises in JavaScript...
You can't resolve a promise twice.
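A quick illustration of what that means in practice (a minimal sketch of my own, separate from your code):
const p = new Promise((resolve) => {
  resolve("first");
  resolve("second"); // ignored: the promise is already settled
});
p.then((value) => console.log(value)); // logs "first"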
Generator
But you can generate new values from a function; this type of function is known as a generator, where a function can re-enter its context (something like async/await) and yield a result using the yield keyword.
Usually a generator is consumed in a for..of loop. It has a next() method for yielding the next value from the generator...
Let's look at an example:
const delay = ms => new Promise(res => setTimeout(res.bind(null, ms), ms));
async function* generator() {
  yield 'yield result from generator!'
  for (let ms = 100; ms <= 300; ms += 100) {
    yield 'delay: ' + await delay(ms) + ' ms';
  }
  yield delay(1000).then(() => 'you can also yield promise!');
}
async function main() {
  const gen = generator();
  console.log('1st', (await gen.next()).value);
  for await (const ms of gen) {
    console.log(ms)
  }
}
main()
Note the * after function; that is how we know this function is a generator, and with the async keyword it is an async generator.
Generators are very useful: they can generate values on demand, pass data along like a pipe, return an endless stream of values from a function, etc...
Callback
This old-school method is heavily used in Node, where you pass a callback function as an argument.
Example:
const delay = ms => new Promise(res => setTimeout(res.bind(null, ms), ms));
async function callback(fn) {
  fn('yield result from callback!');
  for (let ms = 100; ms <= 300; ms += 100) {
    fn('delay: ' + await delay(ms) + ' ms');
  }
  await delay(1000);
  fn('yield asynchronously!');
}
callback(value => console.log(value));
This approach creates all sorts of nasty problems, like: an extra function scope, messy control flow, no break keyword, etc...
I don't recommend this method.

Related

How can I add setTimeout() to promise of mapped async API fetch calls? [duplicate]

I have some code that is iterating over a list that was queried out of a database and making an HTTP request for each element in that list. That list can sometimes be a reasonably large number (in the thousands), and I would like to make sure I am not hitting a web server with thousands of concurrent HTTP requests.
An abbreviated version of this code currently looks something like this...
function getCounts() {
  return users.map(user => {
    return new Promise(resolve => {
      remoteServer.getCount(user) // makes an HTTP request
        .then(() => {
          /* snip */
          resolve();
        });
    });
  });
}
Promise.all(getCounts()).then(() => { /* snip */});
This code is running on Node 4.3.2. To reiterate, can Promise.all be managed so that only a certain number of Promises are in progress at any given time?
P-Limit
I have compared promise concurrency limitation with a custom script, bluebird, es6-promise-pool, and p-limit. I believe p-limit has the simplest, most stripped-down implementation for this need. See their documentation.
Requirements
To be compatible with async in example
ECMAScript 2017 (version 8)
Node version > 8.2.1
My Example
In this example, we need to run a function for every URL in the array (like, maybe an API request). Here this is called fetchData(). If we had an array of thousands of items to process, concurrency would definitely be useful to save on CPU and memory resources.
const pLimit = require('p-limit');
// Example concurrency of 3 promises at once
const limit = pLimit(3);
let urls = [
  "http://www.exampleone.com/",
  "http://www.exampletwo.com/",
  "http://www.examplethree.com/",
  "http://www.examplefour.com/",
]
// Create an array of our promises using map (fetchData() returns a promise)
let promises = urls.map(url => {
  // wrap the function we are calling in the limit function we defined above
  return limit(() => fetchData(url));
});
(async () => {
  // Only three promises are run at once (as defined above)
  const result = await Promise.all(promises);
  console.log(result);
})();
The console log result is an array of your resolved promises response data.
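Note that fetchData() is not defined above; any promise-returning function will do. A hypothetical sketch, assuming Node 18+ where a global fetch is available (or a library such as node-fetch):
// Hypothetical fetchData: whatever you pass to limit() just has to return a promise
const fetchData = async (url) => {
  const response = await fetch(url); // assumes global fetch (Node 18+) or node-fetch
  return response.status;
};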
Using Array.prototype.splice
while (funcs.length) {
// 100 at a time
await Promise.all( funcs.splice(0, 100).map(f => f()) )
}
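Here funcs is assumed to be an array of functions (not promises), so nothing starts until its batch is spliced off. A hedged sketch for the question's case:
// Hypothetical: wrap each request in a function so it only starts when its batch runs
const funcs = users.map(user => () => remoteServer.getCount(user))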
Note that Promise.all() doesn't trigger the promises to start their work, creating the promise itself does.
With that in mind, one solution would be to check whenever a promise is resolved whether a new promise should be started or whether you're already at the limit.
However, there is really no need to reinvent the wheel here. One library that you could use for this purpose is es6-promise-pool. From their examples:
var PromisePool = require('es6-promise-pool')
var promiseProducer = function () {
// Your code goes here.
// If there is work left to be done, return the next work item as a promise.
// Otherwise, return null to indicate that all promises have been created.
// Scroll down for an example.
}
// The number of promises to process simultaneously.
var concurrency = 3
// Create a pool.
var pool = new PromisePool(promiseProducer, concurrency)
// Start the pool.
var poolPromise = pool.start()
// Wait for the pool to settle.
poolPromise.then(function () {
console.log('All promises fulfilled')
}, function (error) {
console.log('Some promise rejected: ' + error.message)
})
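As a hedged sketch (my own filling-in of the stub above, not taken from the library docs), a producer for the question's users/remoteServer could look like this:
// Hypothetical producer: return the next getCount() promise, or null when all work is created
var userIndex = 0
var promiseProducer = function () {
  if (userIndex >= users.length) {
    return null
  }
  return remoteServer.getCount(users[userIndex++])
}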
If you know how iterators work and how they are consumed, you wouldn't need any extra library, since it becomes very easy to build your own concurrency yourself. Let me demonstrate:
/* [Symbol.iterator]() is equivalent to .values()
const iterator = [1,2,3][Symbol.iterator]() */
const iterator = [1,2,3].values()
// loop over all items with for..of
for (const x of iterator) {
  console.log('x:', x)
  // notice how this inner loop continues the same iterator
  // and consumes the rest of it, so the
  // outer loop never logs any more x's
  for (const y of iterator) {
    console.log('y:', y)
  }
}
We can use the same iterator and share it across workers.
If you had used .entries() instead of .values() you would have gotten an iterator that yields [index, value], which I will demonstrate below with a concurrency of 2.
const sleep = t => new Promise(rs => setTimeout(rs, t))
const iterator = Array.from('abcdefghij').entries()
// const results = [] || Array(someLength)
async function doWork (iterator, i) {
  for (let [index, item] of iterator) {
    await sleep(1000)
    console.log(`Worker#${i}: ${index},${item}`)
    // in case you need to store the results in order
    // results[index] = item + item
    // or if the order does not matter
    // results.push(item + item)
  }
}
const workers = Array(2).fill(iterator).map(doWork)
// ^--- starts two workers sharing the same iterator
Promise.allSettled(workers).then(console.log.bind(null, 'done'))
The benefit of this is that you can have a generator function instead of having everything ready at once.
What's even more awesome is that you can do stream.Readable.from(iterator) in Node (and eventually in WHATWG streams as well), and with transferable ReadableStreams this could become very useful in the future if you are working with web workers for performance.
Note: the difference between this and the example async-pool is that this spawns two workers, so if one worker throws an error at, say, index 5, it won't stop the other worker from doing the rest; you just go from 2 concurrent workers down to 1 (it won't stop everything). So my advice is to catch all errors inside the doWork function.
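A minimal sketch of that advice, wrapping the body of doWork from above in a try/catch (the error handling is my own addition) so one failing item doesn't kill its worker:
async function doWork (iterator, i) {
  for (let [index, item] of iterator) {
    try {
      await sleep(1000)
      console.log(`Worker#${i}: ${index},${item}`)
    } catch (err) {
      // record the failure and keep pulling from the shared iterator
      console.error(`Worker#${i} failed on index ${index}`, err)
    }
  }
}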
Instead of using promises for limiting http requests, use node's built-in http.Agent.maxSockets. This removes the requirement of using a library or writing your own pooling code, and has the added advantage of more control over what you're limiting.
agent.maxSockets
By default set to Infinity. Determines how many concurrent sockets the agent can have open per origin. Origin is either a 'host:port' or 'host:port:localAddress' combination.
For example:
var http = require('http');
var agent = new http.Agent({maxSockets: 5}); // 5 concurrent connections per origin
var request = http.request({..., agent: agent}, ...);
If making multiple requests to the same origin, it might also benefit you to set keepAlive to true (see docs above for more info).
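A minimal sketch of that combination (the agent options shown are the real ones; the rest mirrors the snippet above):
// Reuse sockets between requests while still capping concurrency per origin at 5
var keepAliveAgent = new http.Agent({ keepAlive: true, maxSockets: 5 });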
bluebird's Promise.map can take a concurrency option to control how many promises should be running in parallel. Sometimes it is easier than .all because you don't need to create the promise array.
const Promise = require('bluebird')
function getCounts() {
return Promise.map(users, user => {
return new Promise(resolve => {
remoteServer.getCount(user) // makes an HTTP request
.then(() => {
/* snip */
resolve();
});
});
}, {concurrency: 10}); // <---- at most 10 http requests at a time
}
As all others in this answer thread have pointed out, Promise.all() won't do the right thing if you need to limit concurrency. But ideally you shouldn't even want to wait until all of the Promises are done before processing them.
Instead, you want to process each result ASAP as soon as it becomes available, so you don't have to wait for the very last promise to finish before you start iterating over them.
So, here's a code sample that does just that, based partly on the answer by Endless and also on this answer by T.J. Crowder.
// example tasks that sleep and return a number
// in real life, you'd probably fetch URLs or something
const tasks = [];
for (let i = 0; i < 20; i++) {
tasks.push(async () => {
console.log(`start ${i}`);
await sleep(Math.random() * 1000);
console.log(`end ${i}`);
return i;
});
}
function sleep(ms) { return new Promise(r => setTimeout(r, ms)); }
(async () => {
for await (let value of runTasks(3, tasks.values())) {
console.log(`output ${value}`);
}
})();
async function* runTasks(maxConcurrency, taskIterator) {
// Each async iterator is a worker, polling for tasks from the shared taskIterator
// Sharing the iterator ensures that each worker gets unique tasks.
const asyncIterators = new Array(maxConcurrency);
for (let i = 0; i < maxConcurrency; i++) {
asyncIterators[i] = (async function* () {
for (const task of taskIterator) yield await task();
})();
}
yield* raceAsyncIterators(asyncIterators);
}
async function* raceAsyncIterators(asyncIterators) {
async function nextResultWithItsIterator(iterator) {
return { result: await iterator.next(), iterator: iterator };
}
/** @type Map<AsyncIterator<T>,
Promise<{result: IteratorResult<T>, iterator: AsyncIterator<T>}>> */
const promises = new Map(asyncIterators.map(iterator =>
[iterator, nextResultWithItsIterator(iterator)]));
while (promises.size) {
const { result, iterator } = await Promise.race(promises.values());
if (result.done) {
promises.delete(iterator);
} else {
promises.set(iterator, nextResultWithItsIterator(iterator));
yield result.value;
}
}
}
There's a lot of magic in here; let me explain.
This solution is built around async generator functions, which many JS developers may not be familiar with.
A generator function (aka function* function) returns a "generator," an iterator of results. Generator functions are allowed to use the yield keyword where you might have normally used a return keyword. The first time a caller calls next() on the generator (or uses a for...of loop), the function* function runs until it yields a value; that becomes the next() value of the iterator. But the subsequent time next() is called, the generator function resumes from the yield statement, right where it left off, even if it's in the middle of a loop. (You can also yield*, to yield all of the results of another generator function.)
An "async generator function" (async function*) is a generator function that returns an "async iterator," which is an iterator of promises. You can call for await...of on an async iterator. Async generator functions can use the await keyword, as you might do in any async function.
In the example, we call runTasks() with an array of task functions. runTasks() is an async generator function, so we can call it with a for await...of loop. Each time the loop runs, we'll process the result of the latest completed task.
runTasks() creates N async iterators, the workers. (Note that the workers are initially defined as async generator functions, but we immediately invoke each function, and store each resulting async iterator in the asyncIterators array.) The example calls runTasks with 3 concurrent workers, so no more than 3 tasks are launched at the same time. When any task completes, we immediately queue up the next task. (This is superior to "batching", where you do 3 tasks at once, await all three of them, and don't start the next batch of three until the entire previous batch has finished.)
runTasks() concludes by "racing" its async iterators with yield* raceAsyncIterators(). raceAsyncIterators() is like Promise.race() but it races N iterators of Promises instead of just N Promises; it returns an async iterator that yields the results of resolved Promises.
raceAsyncIterators() starts by defining a promises Map from each of the iterators to promises. Each promise is a promise for an iteration result along with the iterator that generated it.
With the promises map, we can Promise.race() the values of the map, giving us the winning iteration result and its iterator. If the iterator is completely done, we remove it from the map; otherwise we replace its Promise in the promises map with the iterator's next() Promise and yield result.value.
In conclusion, runTasks() is an async generator function that yields the results of racing N concurrent async iterators of tasks, so the end user can just for await (let value of runTasks(3, tasks.values())) to process each result as soon as it becomes available.
I suggest the library async-pool: https://github.com/rxaviers/async-pool
npm install tiny-async-pool
Description:
Run multiple promise-returning & async functions with limited concurrency using native ES6/ES7
asyncPool runs multiple promise-returning & async functions in a limited concurrency pool. It rejects immediately as soon as one of the promises rejects. It resolves when all the promises complete. It calls the iterator function as soon as possible (under the concurrency limit).
Usage:
const timeout = i => new Promise(resolve => setTimeout(() => resolve(i), i));
await asyncPool(2, [1000, 5000, 3000, 2000], timeout);
// Call iterator (i = 1000)
// Call iterator (i = 5000)
// Pool limit of 2 reached, wait for the quicker one to complete...
// 1000 finishes
// Call iterator (i = 3000)
// Pool limit of 2 reached, wait for the quicker one to complete...
// 3000 finishes
// Call iterator (i = 2000)
// Iteration is complete, wait until running ones complete...
// 5000 finishes
// 2000 finishes
// Resolves, results are passed in given array order `[1000, 5000, 3000, 2000]`.
Here is my ES7 solution: a copy-paste friendly and feature-complete Promise.all()/map() alternative with a concurrency limit.
Similar to Promise.all(), it maintains return order as well as a fallback for non-promise return values.
I also included a comparison of the different implementations, as it illustrates some aspects a few of the other solutions have missed.
Usage
const asyncFn = delay => new Promise(resolve => setTimeout(() => resolve(), delay));
const args = [30, 20, 15, 10];
await asyncPool(args, arg => asyncFn(arg), 4); // concurrency limit of 4
Implementation
async function asyncBatch(args, fn, limit = 8) {
// Copy arguments to avoid side effects
args = [...args];
const outs = [];
while (args.length) {
const batch = args.splice(0, limit);
const out = await Promise.all(batch.map(fn));
outs.push(...out);
}
return outs;
}
async function asyncPool(args, fn, limit = 8) {
return new Promise((resolve) => {
// Copy arguments to avoid side effect, reverse queue as
// pop is faster than shift
const argQueue = [...args].reverse();
let count = 0;
const outs = [];
const pollNext = () => {
if (argQueue.length === 0 && count === 0) {
resolve(outs);
} else {
while (count < limit && argQueue.length) {
const index = args.length - argQueue.length;
const arg = argQueue.pop();
count += 1;
const out = fn(arg);
const processOut = (out, index) => {
outs[index] = out;
count -= 1;
pollNext();
};
if (typeof out === 'object' && out.then) {
out.then(out => processOut(out, index));
} else {
processOut(out, index);
}
}
}
};
pollNext();
});
}
Comparison
// A simple async function that returns after the given delay
// and prints its value to allow us to determine the response order
const asyncFn = delay => new Promise(resolve => setTimeout(() => {
console.log(delay);
resolve(delay);
}, delay));
// List of arguments to the asyncFn function
const args = [30, 20, 15, 10];
// As a comparison of the different implementations, a low concurrency
// limit of 2 is used in order to highlight the performance differences.
// If a limit greater than or equal to args.length is used the results
// would be identical.
// Vanilla Promise.all/map combo
const out1 = await Promise.all(args.map(arg => asyncFn(arg)));
// prints: 10, 15, 20, 30
// total time: 30ms
// Pooled implementation
const out2 = await asyncPool(args, arg => asyncFn(arg), 2);
// prints: 20, 30, 15, 10
// total time: 40ms
// Batched implementation
const out3 = await asyncBatch(args, arg => asyncFn(arg), 2);
// prints: 20, 30, 10, 15
// total time: 45ms
console.log(out1, out2, out3); // prints: [30, 20, 15, 10] x 3
// Conclusion: Execution order and performance is different,
// but return order is still identical
Conclusion
asyncPool() should be the best solution as it allows new requests to start as soon as any previous one finishes.
asyncBatch() is included as a comparison since its implementation is simpler to understand, but it should be slower, as all requests in the same batch are required to finish before the next batch starts.
In this contrived example, the non-limited vanilla Promise.all() is of course the fastest, while the others could perform more desirably in a real-world congestion scenario.
Update
The async-pool library that others have already suggested is probably a better alternative to my implementation as it works almost identically and has a more concise implementation with a clever usage of Promise.race(): https://github.com/rxaviers/async-pool/blob/master/lib/es7.js
Hopefully my answer can still serve an educational value.
A semaphore is a well-known concurrency primitive that was designed to solve exactly this kind of problem. It's a very universal construct; implementations of semaphores exist in many languages. This is how one would use a Semaphore to solve this issue:
const { Semaphore } = require('async-mutex'); // Semaphore implementation used below

async function main() {
  const s = new Semaphore(100);
  const res = await Promise.all(
    users.map((user) =>
      s.runExclusive(() => remoteServer.getCount(user))
    )
  );
  return res;
}
I'm using the Semaphore implementation from async-mutex; it has decent documentation and TypeScript support.
If you want to dig deep into topics like this you can take a look at the book "The Little Book of Semaphores" which is freely available as PDF here
Unfortunately there is no way to do it with native Promise.all, so you have to be creative.
This is the quickest most concise way I could find without using any outside libraries.
It makes use of a newer javascript feature called an iterator. The iterator basically keeps track of what items have been processed and what haven't.
In order to use it in code, you create an array of async functions. Each async function asks the same iterator for the next item that needs to be processed. Each function processes its own item asynchronously, and when done asks the iterator for a new one. Once the iterator runs out of items, all the functions complete.
Thanks to @Endless for inspiration.
const items = [
'https://httpbin.org/bytes/2',
'https://httpbin.org/bytes/2',
'https://httpbin.org/bytes/2',
'https://httpbin.org/bytes/2',
'https://httpbin.org/bytes/2',
'https://httpbin.org/bytes/2',
'https://httpbin.org/bytes/2',
'https://httpbin.org/bytes/2'
]
// get a cursor that keeps track of what items have already been processed.
let cursor = items.entries();
// create 5 for loops that each run off the same cursor which keeps track of location
Array(5).fill().forEach(async () => {
for (let [index, url] of cursor){
console.log('getting url is ', index, url)
// run your async task instead of this next line
var text = await fetch(url).then(res => res.text())
console.log('text is', text.slice(0, 20))
}
})
Here is a basic example combining streaming and p-limit. It streams an HTTP read stream to MongoDB.
const stream = require('stream');
const util = require('util');
const pLimit = require('p-limit');
const es = require('event-stream');
const streamToMongoDB = require('stream-to-mongo-db').streamToMongoDB;
const JSONStream = require('JSONStream'); // used below in the pipeline
const pipeline = util.promisify(stream.pipeline)
const outputDBConfig = {
dbURL: 'yr-db-url',
collection: 'some-collection'
};
const limit = pLimit(3);
const yrAsyncStreamingFunction = async (readStream) => {
const mongoWriteStream = streamToMongoDB(outputDBConfig);
const mapperStream = es.map((data, done) => {
let someDataPromise = limit(() => yr_async_call_to_somewhere())
someDataPromise.then(
function handleResolve(someData) {
data.someData = someData;
done(null, data);
},
function handleError(error) {
done(error)
}
);
})
await pipeline(
readStream,
JSONStream.parse('*'),
mapperStream,
mongoWriteStream
);
}
So many good solutions. I started out with the elegant solution posted by @Endless and ended up with this little extension method that does not use any external libraries, nor does it run in batches (although it assumes you have features like async, etc.):
Promise.allWithLimit = async (taskList, limit = 5) => {
    const iterator = taskList.entries();
    let results = new Array(taskList.length);
    let workerThreads = new Array(limit).fill(0).map(() =>
        new Promise(async (resolve, reject) => {
            try {
                let entry = iterator.next();
                while (!entry.done) {
                    let [index, promise] = entry.value;
                    try {
                        results[index] = await promise;
                    }
                    catch (err) {
                        results[index] = err;
                    }
                    // advance whether this task resolved or rejected,
                    // otherwise a rejection would loop on the same entry forever
                    entry = iterator.next();
                }
                // No more work to do
                resolve(true);
            }
            catch (err) {
                // This worker is dead
                reject(err);
            }
        }));
    await Promise.all(workerThreads);
    return results;
};
const demoTasks = new Array(10).fill(0).map((v,i) => new Promise(resolve => {
let n = (i + 1) * 5;
setTimeout(() => {
console.log(`Did nothing for ${n} seconds`);
resolve(n);
}, n * 1000);
}));
var results = Promise.allWithLimit(demoTasks);
@tcooc's answer was quite cool. Didn't know about it and will leverage it in the future.
I also enjoyed @MatthewRideout's answer, but it uses an external library!!
Whenever possible, I give a shot at developing this kind of thing on my own, rather than going for a library. You end up learning a lot of concepts which seemed daunting before.
class Pool {
    constructor(maxAsync) {
        this.maxAsync = maxAsync;
        this.asyncOperationsQueue = [];
        this.currentAsyncOperations = 0
    }
    runAnother() {
        if (this.asyncOperationsQueue.length > 0 && this.currentAsyncOperations < this.maxAsync) {
            this.currentAsyncOperations += 1;
            // note: pop() runs the most recently queued task first; use shift() for FIFO order
            this.asyncOperationsQueue.pop()()
                .then(() => { this.currentAsyncOperations -= 1; this.runAnother() }, () => { this.currentAsyncOperations -= 1; this.runAnother() })
        }
    }
    add(f) { // the argument f is a function of signature () => Promise
        const result = new Promise((resolve, reject) => {
            this.asyncOperationsQueue.push(
                () => f().then(resolve).catch(reject)
            )
        })
        // enqueue first, then try to run, so that a single add() also starts work
        this.runAnother();
        return result;
    }
}
//#######################################################
// TESTS
//#######################################################
function dbCall(id, timeout, fail) {
    return new Promise((resolve, reject) => {
        setTimeout(() => {
            if (fail) {
                reject(`Error for id ${id}`);
            } else {
                resolve(id);
            }
        }, timeout)
    })
}
const dbQuery1 = () => dbCall(1, 5000, false);
const dbQuery2 = () => dbCall(2, 5000, false);
const dbQuery3 = () => dbCall(3, 5000, false);
const dbQuery4 = () => dbCall(4, 5000, true);
const dbQuery5 = () => dbCall(5, 5000, false);
const cappedPool = new Pool(2);
const dbQuery1Res = cappedPool.add(dbQuery1).catch(i => i).then(i => console.log(`Resolved: ${i}`))
const dbQuery2Res = cappedPool.add(dbQuery2).catch(i => i).then(i => console.log(`Resolved: ${i}`))
const dbQuery3Res = cappedPool.add(dbQuery3).catch(i => i).then(i => console.log(`Resolved: ${i}`))
const dbQuery4Res = cappedPool.add(dbQuery4).catch(i => i).then(i => console.log(`Resolved: ${i}`))
const dbQuery5Res = cappedPool.add(dbQuery5).catch(i => i).then(i => console.log(`Resolved: ${i}`))
This approach provides a nice API, similar to thread pools in Scala/Java.
After creating one instance of the pool with const cappedPool = new Pool(2), you provide promises to it simply with cappedPool.add(() => myPromise).
Obviously we must ensure that the promise does not start immediately, and that is why we must "provide it lazily" with the help of the function.
Most importantly, notice that the result of the method add is a Promise which will be completed/resolved with the value of your original promise! This makes for very intuitive use.
const resultPromise = cappedPool.add( () => dbCall(...))
resultPromise
.then( actualResult => {
// Do something with the result from the DB
}
)
This solution uses an async generator to manage concurrent promises with vanilla JavaScript. The throttle generator takes 3 arguments:
An array of values to be supplied as arguments to a promise-generating function (e.g. an array of URLs).
A function that returns a promise (e.g. returns a promise for an HTTP request).
An integer that represents the maximum number of concurrent promises allowed.
Promises are only instantiated as required in order to reduce memory consumption. Results can be iterated over using a for await...of statement.
The example below provides a function to check promise state, the throttle async generator, and a simple function that returns a promise based on setTimeout. The async IIFE at the end defines the reservoir of timeout values, sets up the async iterable returned by throttle, then iterates over the results as they resolve.
If you would like a more complete example for HTTP requests, let me know in the comments.
Please note that Node.js 16+ is required, since this relies on async generators and Promise.any.
const promiseState = function( promise ) {
const control = Symbol();
return Promise
.race([ promise, control ])
.then( value => ( value === control ) ? 'pending' : 'fulfilled' )
.catch( () => 'rejected' );
}
const throttle = async function* ( reservoir, promiseClass, highWaterMark ) {
let iterable = reservoir.splice( 0, highWaterMark ).map( item => promiseClass( item ) );
while ( iterable.length > 0 ) {
await Promise.any( iterable );
const pending = [];
const resolved = [];
for ( const currentValue of iterable ) {
if ( await promiseState( currentValue ) === 'pending' ) {
pending.push( currentValue );
} else {
resolved.push( currentValue );
}
}
console.log({ pending, resolved, reservoir });
iterable = [
...pending,
...reservoir.splice( 0, highWaterMark - pending.length ).map( value => promiseClass( value ) )
];
yield Promise.allSettled( resolved );
}
}
const getTimeout = delay => new Promise( ( resolve, reject ) => {
setTimeout(resolve, delay, delay);
} );
( async () => {
const test = [ 1100, 1200, 1300, 10000, 11000, 9000, 5000, 6000, 3000, 4000, 1000, 2000, 3500 ];
const throttledRequests = throttle( test, getTimeout, 4 );
for await ( const timeout of throttledRequests ) {
console.log( timeout );
}
} )();
The concurrent function below will return a Promise which resolves to an array of resolved promise values, while implementing a concurrency limit. No 3rd party library.
// waits 50 ms then resolves to the passed-in arg
const sleepAndResolve = s => new Promise(rs => setTimeout(()=>rs(s), 50))
// queue 100 promises
const funcs = []
for(let i=0; i<100; i++) funcs.push(()=>sleepAndResolve(i))
//run the promises with a max concurrency of 10
concurrent(10,funcs)
.then(console.log) // prints [0,1,2...,99]
.catch(()=>console.log("there was an error"))
/**
* Run concurrent promises with a maximum concurrency level
* @param concurrency The number of concurrently running promises
* @param funcs An array of functions that return promises
* @returns a promise that resolves to an array of the resolved values from the promises returned by funcs
*/
function concurrent(concurrency, funcs) {
return new Promise((resolve, reject) => {
let index = -1;
const p = [];
for (let i = 0; i < Math.max(1, Math.min(concurrency, funcs.length)); i++)
runPromise();
function runPromise() {
if (++index < funcs.length)
(p[p.length] = funcs[index]()).then(runPromise).catch(reject);
else if (index === funcs.length)
Promise.all(p).then(resolve).catch(reject);
}
});
}
Here's the Typescript version if you are interested
/**
* Run concurrent promises with a maximum concurrency level
* @param concurrency The number of concurrently running promises
* @param funcs An array of functions that return promises
* @returns a promise that resolves to an array of the resolved values from the promises returned by funcs
*/
function concurrent<V>(concurrency:number, funcs:(()=>Promise<V>)[]):Promise<V[]> {
return new Promise((resolve,reject)=>{
let index = -1;
const p:Promise<V>[] = []
for(let i=0; i<Math.max(1,Math.min(concurrency, funcs.length)); i++) runPromise()
function runPromise() {
if (++index < funcs.length) (p[p.length] = funcs[index]()).then(runPromise).catch(reject)
else if (index === funcs.length) Promise.all(p).then(resolve).catch(reject)
}
})
}
No external libraries. Just plain JS.
It can be resolved using recursion.
The idea is that initially we immediately execute the maximum allowed number of queries and each of these queries should recursively initiate a new query on its completion.
In this example I collect successful responses together with errors, and I execute all queries, but it's possible to slightly modify the algorithm if you want to terminate batch execution on the first failure.
async function batchQuery(queries, limit) {
limit = Math.min(queries.length, limit);
return new Promise((resolve, reject) => {
const responsesOrErrors = new Array(queries.length);
let startedCount = 0;
let finishedCount = 0;
let hasErrors = false;
function recursiveQuery() {
let index = startedCount++;
doQuery(queries[index])
.then(res => {
responsesOrErrors[index] = res;
})
.catch(error => {
responsesOrErrors[index] = error;
hasErrors = true;
})
.finally(() => {
finishedCount++;
if (finishedCount === queries.length) {
hasErrors ? reject(responsesOrErrors) : resolve(responsesOrErrors);
} else if (startedCount < queries.length) {
recursiveQuery();
}
});
}
for (let i = 0; i < limit; i++) {
recursiveQuery();
}
});
}
async function doQuery(query) {
console.log(`${query} started`);
const delay = Math.floor(Math.random() * 1500);
return new Promise((resolve, reject) => {
setTimeout(() => {
if (delay <= 1000) {
console.log(`${query} finished successfully`);
resolve(`${query} success`);
} else {
console.log(`${query} finished with error`);
reject(`${query} error`);
}
}, delay);
});
}
const queries = new Array(10).fill('query').map((query, index) => `${query}_${index + 1}`);
batchQuery(queries, 3)
.then(responses => console.log('All successful', responses))
.catch(responsesWithErrors => console.log('All with several failed', responsesWithErrors));
So I tried to make some of the examples shown here work for my code, but since this was only for an import script and not production code, using the npm package batch-promises was surely the easiest path for me.
batch-promises
Easily batch promises
NOTE: Requires runtime to support Promise or to be polyfilled.
Api
batchPromises(int: batchSize, array: Collection, i => Promise: Iteratee)
The Promise: Iteratee will be called after each batch.
Use:
import batchPromises from 'batch-promises';
batchPromises(2, [1,2,3,4,5], i => new Promise((resolve, reject) => {
// The iteratee will fire after each batch resulting in the following behaviour:
// # 100ms resolve items 1 and 2 (first batch of 2)
// # 200ms resolve items 3 and 4 (second batch of 2)
// # 300ms resolve remaining item 5 (last remaining batch)
setTimeout(() => {
resolve(i);
}, 100);
}))
.then(results => {
console.log(results); // [1,2,3,4,5]
});
Recursion is the answer if you don't want to use external libraries
downloadAll(someArrayWithData){ // assumed to be a method of an object that also defines someExpensiveRequest
  var self = this;
  var tracker = function(next){
    return self.someExpensiveRequest(someArrayWithData[next])
      .then(function(){
        next++;//This updates the next in the tracker function parameter
        if(next < someArrayWithData.length){//Did I finish processing all my data?
          return tracker(next);//Go to the next promise
        }
      });
  }
  return tracker(0);
}
Expanding on the answer posted by @deceleratedcaviar, I created a 'batch' utility function that takes as arguments an array of values, a concurrency limit, and a processing function. Yes, I realize that using Promise.all this way is more akin to batch processing than true concurrency, but if the goal is to limit an excessive number of HTTP calls at one time, I go with this approach due to its simplicity and no need for an external library.
async function batch(o) {
let arr = o.arr
let resp = []
while (arr.length) {
let subset = arr.splice(0, o.limit)
let results = await Promise.all(subset.map(o.process))
resp.push(results)
}
return [].concat.apply([], resp)
}
let arr = []
for (let i = 0; i < 250; i++) { arr.push(i) }
async function calc(val) { return val * 100 }
(async () => {
let resp = await batch({
arr: arr,
limit: 100,
process: calc
})
console.log(resp)
})();
One more solution with a custom promise library (CPromise):
using generators Live codesandbox demo
import { CPromise } from "c-promise2";
import cpFetch from "cp-fetch";
const promise = CPromise.all(
function* () {
const urls = [
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=1",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=2",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=3",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=4",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=5",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=6",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=7"
];
for (const url of urls) {
yield cpFetch(url); // add a promise to the pool
console.log(`Request [${url}] completed`);
}
},
{ concurrency: 2 }
).then(
(v) => console.log(`Done: `, v),
(e) => console.warn(`Failed: ${e}`)
);
// yeah, we are able to cancel the task and abort pending network requests
// setTimeout(() => promise.cancel(), 4500);
using mapper Live codesandbox demo
import { CPromise } from "c-promise2";
import cpFetch from "cp-fetch";
const promise = CPromise.all(
[
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=1",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=2",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=3",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=4",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=5",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=6",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=7"
],
{
mapper: (url) => {
console.log(`Request [${url}]`);
return cpFetch(url);
},
concurrency: 2
}
).then(
(v) => console.log(`Done: `, v),
(e) => console.warn(`Failed: ${e}`)
);
// yeah, we are able to cancel the task and abort pending network requests
//setTimeout(() => promise.cancel(), 4500);
Warning: this has not been benchmarked for efficiency and does a lot of array copying/creation.
If you want a more functional approach you could do something like:
import chunk from 'lodash.chunk';
const maxConcurrency = (max) => (dataArr, promiseFn) =>
chunk(dataArr, max).reduce(
async (agg, batch) => [
...(await agg),
...(await Promise.all(batch.map(promiseFn)))
],
[]
);
and then you could use it like:
const randomFn = (data) =>
new Promise((res) => setTimeout(
() => res(data + 1),
Math.random() * 1000
));
const result = await maxConcurrency(5)(
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
randomFn
);
console.log('result+++', result);
I had been using the bottleneck library, which I actually really liked, but in my case it wasn't releasing memory and kept tanking long-running jobs... which isn't great for running the massive jobs that you likely want a throttling/concurrency library for in the first place.
I needed a simple, low-overhead, easy-to-maintain solution. I also wanted something that kept the pool topped up, rather than simply batching predefined chunks... In the case of a downloader, this will stop that n-GB file from holding up your queue for minutes/hours at a time, even though the rest of the batch finished ages ago.
This is the Node.js v16+, no-dependency, async generator solution I've been using instead:
const promiseState = function( promise ) {
// A promise could never resolve to a unique symbol unless it was in this scope
const control = Symbol();
// This helps us determine the state of the promise... A little heavy, but it beats a third-party promise library. The control is the second element passed to Promise.race() since it will only resolve first if the promise being tested is pending.
return Promise
.race([ promise, control ])
.then( value => ( value === control ) ? 'pending' : 'fulfilled' )
.catch( () => 'rejected' );
}
const throttle = async function* ( reservoir, promiseFunction, highWaterMark ) {
let iterable = reservoir.splice( 0, highWaterMark ).map( item => promiseFunction( item ) );
while ( iterable.length > 0 ) {
// When a promise has resolved we have space to top it up to the high water mark...
await Promise.any( iterable );
const pending = [];
const resolved = [];
// This identifies the promise(s) that have resolved so that we can yield them
for ( const currentValue of iterable ) {
if ( await promiseState( currentValue ) === 'pending' ) {
pending.push( currentValue );
} else {
resolved.push( currentValue );
}
}
// Put the remaining promises back into iterable, and top it to the high water mark
iterable = [
...pending,
...reservoir.splice( 0, highWaterMark - pending.length ).map( value => promiseFunction( value ) )
];
yield Promise.allSettled( resolved );
}
}
// This is just an example of what would get passed as "promiseFunction"... This can be the function that returns your HTTP request promises
const getTimeout = delay => new Promise( (resolve, reject) => setTimeout(resolve, delay, delay) );
// This is just the async IIFE that bootstraps this example
( async () => {
const test = [ 1000, 2000, 3000, 4000, 5000, 6000, 1500, 2500, 3500, 4500, 5500, 6500 ];
for await ( const timeout of throttle( test, getTimeout, 4 ) ) {
console.log( timeout );
}
} )();
I have a solution that creates chunks and uses the .reduce function to wait for each chunk's Promise.all to finish. I also add some delay if the promises have call rate limits.
export function delay(ms: number) {
return new Promise<void>((resolve) => setTimeout(resolve, ms));
}
export const chunk = <T>(arr: T[], size: number): T[][] => [
...Array(Math.ceil(arr.length / size)),
].map((_, i) => arr.slice(size * i, size + size * i));
const myIdList = []; // all items
const groupedIdList = chunk(myIdList, 20); // grouped by 20 items
await groupedIdList.reduce(async (prev, subIdList) => {
await prev;
// Make sure we wait for 500 ms after processing every page to prevent overloading the calls.
const data = await Promise.all(subIdList.map(myPromise));
await delay(500);
}, Promise.resolve());
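If the resolved values are needed, a hedged variation of the same reduce collects them in an outer array (a sketch only; myPromise is assumed to resolve to something useful):
// Same page-by-page flow, but keeping the resolved values
const all = [];
await groupedIdList.reduce(async (prev, subIdList) => {
  await prev;
  all.push(...(await Promise.all(subIdList.map(myPromise))));
  await delay(500);
}, Promise.resolve());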
Using tiny-async-pool ES9 for await...of API, you can do the following:
const asyncPool = require("tiny-async-pool");
const getCount = async (user) => ([user, await remoteServer.getCount(user)]);
const concurrency = 2;
for await (const [user, count] of asyncPool(concurrency, users, getCount)) {
console.log(user, count);
}
The above asyncPool function returns an async iterator that yields as soon as a promise completes (under the concurrency limit), and it rejects immediately as soon as one of the promises rejects.
It is possible to limit requests to the server by using https://www.npmjs.com/package/job-pipe.
Basically you create a pipe and tell it how many concurrent requests you want:
const pipe = createPipe({ throughput: 6, maxQueueSize: Infinity })
Then you take your function which performs call and force it through the pipe to create a limited amount of calls at the same time:
const makeCall = async () => {...}
const limitedMakeCall = pipe(makeCall)
Finally, you call this method as many times as you need, as if it was unchanged, and it will limit itself to how many parallel executions it can handle:
await limitedMakeCall()
await limitedMakeCall()
await limitedMakeCall()
await limitedMakeCall()
await limitedMakeCall()
....
await limitedMakeCall()
Profit.
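If you need all of the results together at the end, a hedged sketch is to queue the limited calls at once and await them as a group (the call count here is made up):
// 20 calls queued at once; the pipe keeps only 6 of them running at any time
const results = await Promise.all(
  Array.from({ length: 20 }, () => limitedMakeCall())
)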
I suggest not downloading packages and not writing hundreds of lines of code:
async function async_arr<T1, T2>(
arr: T1[],
func: (x: T1) => Promise<T2> | T2, //can be sync or async
limit = 5
) {
let results: T2[] = [];
let workers = [];
let current = Math.min(arr.length, limit);
async function process(i) {
if (i < arr.length) {
results[i] = await Promise.resolve(func(arr[i]));
await process(current++);
}
}
for (let i = 0; i < current; i++) {
workers.push(process(i));
}
await Promise.all(workers);
return results;
}
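A hypothetical usage sketch for the helper above (urls and the worker callback are made-up names, and a global fetch is assumed):
// Process urls five at a time; results come back in input order
const statuses = await async_arr(urls, (url) => fetch(url).then((r) => r.status), 5);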
Here's my recipe, based on killdash9's answer.
It lets you choose the behaviour on exceptions (Promise.all vs Promise.allSettled).
// Given an array of async functions, runs them in parallel,
// with at most maxConcurrency simultaneous executions
// Except for that, behaves the same as Promise.all,
// unless allSettled is true, where it behaves as Promise.allSettled
function concurrentRun(maxConcurrency = 10, funcs = [], allSettled = false) {
if (funcs.length <= maxConcurrency) {
const ps = funcs.map(f => f());
return allSettled ? Promise.allSettled(ps) : Promise.all(ps);
}
return new Promise((resolve, reject) => {
let idx = -1;
const ps = new Array(funcs.length);
function nextPromise() {
idx += 1;
if (idx < funcs.length) {
(ps[idx] = funcs[idx]()).then(nextPromise).catch(allSettled ? nextPromise : reject);
} else if (idx === funcs.length) {
(allSettled ? Promise.allSettled(ps) : Promise.all(ps)).then(resolve).catch(reject);
}
}
for (let i = 0; i < maxConcurrency; i += 1) nextPromise();
});
}
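A hypothetical usage sketch (urls is a made-up array; any promise-returning functions work, and a global fetch is assumed):
// At most 4 requests in flight; rejects on the first failure unless allSettled is true
const results = await concurrentRun(4, urls.map((url) => () => fetch(url)));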
I know there are a lot of answers already, but I ended up using a very simple solution that requires no library or sleep, just a few commands. Promise.all() simply lets you know when all the promises passed to it are finalized. So, you can check on the queue intermittently to see if it is ready for more work; if so, add more processes.
For example:
// init vars
const batchSize = 5
const calls = []
// loop through data and run processes
for (let [index, data] of [1,2,3].entries()) {
// pile on async processes
calls.push(doSomethingAsyncWithData(data))
// every 5th concurrent call, wait for them to finish before adding more
if (index % batchSize === 0) await Promise.all(calls)
}
// clean up for any data to process left over if smaller than batch size
const allFinishedProcs = await Promise.all(calls)
A good solution for controlling the maximum number of promises/requests is to split your list of requests into pages, and produce only requests for one page at a time.
The example below makes use of the iter-ops library:
import {pipeAsync, map, page, wait} from 'iter-ops';
const i = pipeAsync(
    users, // make it asynchronous
    page(10), // split into pages of 10 items each
    map(p => Promise.all(p.map(u => remoteServer.getCount(u)))), // map into requests
    wait() // resolve each page in the pipeline
);
// below triggers processing page-by-page:
for await(const p of i) {
//=> p = resolved page of data
}
This way it won't try to create more requests/promises than the size of one page.

getting same response from a promise multiple times

I have a function that returns a promise, and in the case of an error, I have to call the same function again. The problem is that whenever I call it again, I get the same response, as if it was never called again.
This is how I am resolving:
first_file = async () => {
return new Promise(async (resolve, reject) => {
//Generating the token
(async () => {
while (true) {
console.log("Resolving...");
resolve(token);
await sleep(5000);
resolved_token = token;
}
})();
});
};
I'm generating a token here, which I use in the second script:
function sleep(ms) {
return new Promise((resolve) => {
setTimeout(resolve, ms);
});
}
(async() =>{
while(true){
test = require("./test")
test.first_file ().then(res=>{
console.log(res)
})
await sleep(15000)
}
})()
The expected behaviour here is that every 15000 ms (15 sec) I get a new response, but instead I'm getting the same response over and over again.
Sorry if the title is inaccurate; I didn't know how to explain the problem.
Promises represent a value + time; a promise's settled value doesn't change, just like the number 5 doesn't change. Calling resolve multiple times is a no-op*.
What you want, instead of the language's abstraction for value + time, is the language's abstraction for action + time: an async function (or just a function returning a promise).
const tokenFactory = () => {
  let current = null;
  (async () => {
    while (true) {
      console.log("Resolving...");
      current = token; // get token somewhere
      await sleep(5000);
    }
  })().catch((e) => {/* handle error */});
  return () => current; // we return a function so it's captured
};
Which will let you do:
const getToken = tokenFactory(); // starts refreshing in the background
getToken(); // first token (or null)
// 5 seconds later
getToken(); // second token
*We have a flag we added in Node.js called multipleResolves that will let you observe that for logging/error handling
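That flag surfaces as the 'multipleResolves' process event in Node.js (added in v10.12, deprecated in more recent versions); a minimal sketch of hooking it for logging:
// Logs whenever some code resolves or rejects an already-settled promise
process.on('multipleResolves', (type, promise, value) => {
  console.warn('multipleResolves:', type, value);
});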

Calculate total elapsed time of Promises till reject?

I want to test how many requests I can do and get their total elapsed time. My Promise function:
async execQuery(response, query) {
let request = new SQL.Request();
return new Promise((resolve, reject) => {
request.query(query, (error, result) => {
if (error) {
reject(error);
} else {
resolve(result);
}
});
});
}
And my API:
app.get('/api/bookings/:uid', (req, res) => {
let st = new stopwatch();
let id = req.params.uid;
let query = `SELECT * FROM booking.TransactionDetails WHERE UID='${id}'`;
for (let i = 0; i < 10000; i++) {
st.start();
db.execQuery(res, query);
}
});
I can't stop the for loop since it's async, but I also don't know how I can stop executing the other calls after the first one rejects, so that I can get the counter and the elapsed time of all successful promises. How can I achieve that?
You can easily create a composable wrapper for this, or a subclass:
Inheritance:
class TimedPromise extends Promise {
  constructor(executor) {
    const startTime = performance.now(); // or Date.now; captured before super() runs the executor
    super(executor);
    this.startTime = startTime;
    let end = () => this.endTime = performance.now();
    this.then(end, end); // replace with finally when available
  }
  get time() {
    return this.endTime - this.startTime; // time in milliseconds it took
  }
}
Then you can use methods like:
TimedPromise.all(promises);
TimedPromise.race(promises);
var foo = new TimedPromise(resolve => setTimeout(resolve, 100));
let res = await foo;
console.log(foo.time); // how long foo took
Plus then chaining would work, async functions won't (since they always return native promises).
Composition:
function time(promise) {
var startTime = performance.now(), endTime;
let end = () => endTime = performance.now();
promise.then(end, end); // replace with finally when appropriate.
return () => endTime - startTime;
}
Then usage is:
var foo = new Promise(resolve => setTimeout(resolve, 100));
var timed = time(foo);
await foo;
console.log(timed()); // how long foo took
This has the advantage of working everywhere, but the disadvantage of manually having to time every promise. I prefer this approach for its explicitness and arguably nicer design.
As a caveat, since a rejection handler is attached, you have to be 100% sure you're adding your own .catch or .then handler, since otherwise the error will not be logged to the console.
Wouldn't this work in your promise ?
var time = Date.now(); // declared outside the executor so the .catch handler can see it
new Promise((resolve, reject) => {
  request.query(query, (error, result) => {
    if (error) {
      reject(error);
    } else {
      resolve(result);
    }
  });
}).then(function(r){
  //code
}).catch(function(e){
  console.log('it took : ', Date.now() - time);
});
Or put the .then and .catch after your db.execQuery() call
You made 2 comments indicating you want to stop all ongoing queries when a promise fails, but you don't mention what SQL is or whether request.query is something you can cancel.
In your for loop you already ran all the request.query statements; if you want to run only one query and then the next, you have to do request.query(query).then(() => request.query(query)).then..., but it'll take longer because you don't start them all at once.
Here is code that would tell you how long all the queries took, but I think you should tell us what SQL is so we could figure out how to set up connection pooling and caching (probably the biggest performance gainer).
//removed the async, this function does not await anything
// so there is no need for async
//removed initializing request, you can re use the one created in
// the run function, that may shave some time off total runtime
// but not sure if request can share connections (in that case)
// it's better to create a couple and pass them along as their
// connection becomes available (connection pooling)
const execQuery = (response, query, request) =>
new Promise(
(resolve, reject) =>
request.query(
query
,(error, result) =>
(error)
? reject(error)
: resolve(result)
)
);
// save failed queries and resolve them with Fail object
const Fail = function(detail){this.detail=detail;};
// let request = new SQL.Request();
const run = (numberOfTimes) => {
const start = new Date().getTime();
const request = new SQL.Request();
Promise.all(
  (() => {
    const promises = [];
    for (let i = 0; i < numberOfTimes; i++) {
      const query = `SELECT * FROM booking.TransactionDetails WHERE UID='${i}'`;
      promises.push(
        execQuery(res, query, request)
          .then(
            result => [result, query],
            // resolve failed queries with a Fail object so Promise.all doesn't short-circuit
            err => new Fail([err, query])
          )
      );
    }
    return promises;
  })()//IIFE creating the array of promises
)
.then(
results => {
const totalRuntime = new Date().getTime()-start;
const failed = results.filter(r=>(r&&r.constructor)===Fail);
console.log(`Total runtime in ms:${totalRuntime}
Failed:${failed.length}
Succeeded:${results.length-failed.length}`);
}
)
};
//start the whole thing with:
run(10000);

If I await 2 functions can I guarantee that the return object will have both values

I have the following code
module.exports = async function (req, res) {
const station1 = await getStation('one')
const station2 = await getStation('two')
return { stations: [station1, station2] }
}
Can I be guaranteed that when the final return value is sent it will definitely have both station1 and station2 data in them, or do I need to wrap the function call in a Promise.all()
As you have it, it is guaranteed that the return statement will only be executed when the two getStation() promises have resolved.
However, the second call to getStation will only happen when the first promise has resolved, making them run in serial. As there is no dependency between them, you could gain performance, if you would run them in parallel.
Although this can be achieved with Promise.all, you can achieve the same by first retrieving the two promises and only then performing the await on them:
module.exports = async function (req, res) {
const promise1 = getStation('one');
const promise2 = getStation('two');
return { stations: [await promise1, await promise2] }
}
Now both calls will be performed at the "same" time, and it will be just the return statement that will be pending for both promises to resolve. This is also illustrated in MDN's "simple example".
The await keyword actually makes you "wait" on that line of code while an async action runs.
That means you don't proceed to the next line of code until the async action is resolved. This is good if your code depends on the result.
Example:
const res1 = await doSomething();
if(res1.isValid)
{
console.log('do something with res1 result');
}
The following code example will await a promise that gets resolved after three seconds. Check the date prints to the console to understand what await does:
async function f1() {
console.log(new Date());
// Resolve after three seconds
var p = new Promise(resolve => {
setTimeout(() => resolve({}),3000);
});
await p;
console.log(new Date());
}
f1();
ES6Console
By the way, in your case, since you don't use the result of station1 before requesting station2, it's better to use Promise.all so they run in parallel.
Check this example (it will run for 3 seconds instead of 4 seconds the way you coded it above):
async function f1() {
console.log(new Date());
// Resolve after three seconds
var p1 = new Promise(resolve => {
setTimeout(() => resolve({a:1}),3000);
});
// Resolve after one second
var p2 = new Promise(resolve => {
setTimeout(() => resolve({a:2}),1000);
});
// Run them parallel - Max(three seconds, one second) -> three seconds.
var res = await Promise.all([p1,p2]);
console.log(new Date());
console.log('result:' + res);
}
f1();
ES6Console.
If either await getStation('one') or await getStation('two') fails, an exception will be thrown from the async function. Otherwise, you will always get the resolved value from both promises.
You can rewrite your function as follows to use Promise.all
module.exports = async function (req, res) {
try{
const [station1, station2] = await Promise.all([
getStation('one'),
getStation('two')
]);
return { stations: [station1, station2] };
} catch (e) {
throw e;
}
}

What is the best way to limit concurrency when using ES6's Promise.all()?

I have some code that is iterating over a list that was queried out of a database and making an HTTP request for each element in that list. That list can sometimes be a reasonably large number (in the thousands), and I would like to make sure I am not hitting a web server with thousands of concurrent HTTP requests.
An abbreviated version of this code currently looks something like this...
function getCounts() {
return users.map(user => {
return new Promise(resolve => {
remoteServer.getCount(user) // makes an HTTP request
.then(() => {
/* snip */
resolve();
});
});
});
}
Promise.all(getCounts()).then(() => { /* snip */});
This code is running on Node 4.3.2. To reiterate, can Promise.all be managed so that only a certain number of Promises are in progress at any given time?
P-Limit
I have compared promise concurrency limitation with a custom script, bluebird, es6-promise-pool, and p-limit. I believe that p-limit has the most simple, stripped down implementation for this need. See their documentation.
Requirements
To be compatible with async in example
ECMAScript 2017 (version 8)
Node version > 8.2.1
My Example
In this example, we need to run a function for every URL in the array (like, maybe an API request). Here this is called fetchData(). If we had an array of thousands of items to process, concurrency would definitely be useful to save on CPU and memory resources.
const pLimit = require('p-limit');
// Example Concurrency of 3 promise at once
const limit = pLimit(3);
let urls = [
"http://www.exampleone.com/",
"http://www.exampletwo.com/",
"http://www.examplethree.com/",
"http://www.examplefour.com/",
]
// Create an array of our promises using map (fetchData() returns a promise)
let promises = urls.map(url => {
// wrap the function we are calling in the limit function we defined above
return limit(() => fetchData(url));
});
(async () => {
// Only three promises are run at once (as defined above)
const result = await Promise.all(promises);
console.log(result);
})();
The console log result is an array of your resolved promises response data.
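For completeness: fetchData() above is just a placeholder for your own request function, not something p-limit provides. A minimal sketch of such a function, assuming Node 18+ where fetch is available globally, might look like this:
// Hypothetical fetchData() used in the p-limit example above.
// Assumes Node 18+ (global fetch); on older runtimes you could swap in node-fetch or axios.
const fetchData = async (url) => {
  const response = await fetch(url);
  if (!response.ok) throw new Error(`Request to ${url} failed with status ${response.status}`);
  return response.text(); // or response.json(), depending on the API
};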
Using Array.prototype.splice
while (funcs.length) {
// 100 at a time
await Promise.all( funcs.splice(0, 100).map(f => f()) )
}
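For context, this assumes funcs is an array of functions that each return a promise (not an array of already-started promises). A minimal sketch using the names from the question:
// Hypothetical setup: wrap each call in a function so it starts only when invoked
const funcs = users.map(user => () => remoteServer.getCount(user));
(async () => {
  const results = [];
  while (funcs.length) {
    // 100 at a time; splice removes the processed batch from the array
    results.push(...await Promise.all(funcs.splice(0, 100).map(f => f())));
  }
  console.log(`fetched ${results.length} counts`);
})();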
Note that Promise.all() doesn't trigger the promises to start their work, creating the promise itself does.
With that in mind, one solution would be to check whenever a promise is resolved whether a new promise should be started or whether you're already at the limit.
However, there is really no need to reinvent the wheel here. One library that you could use for this purpose is es6-promise-pool. From their examples:
var PromisePool = require('es6-promise-pool')
var promiseProducer = function () {
// Your code goes here.
// If there is work left to be done, return the next work item as a promise.
// Otherwise, return null to indicate that all promises have been created.
// Scroll down for an example.
}
// The number of promises to process simultaneously.
var concurrency = 3
// Create a pool.
var pool = new PromisePool(promiseProducer, concurrency)
// Start the pool.
var poolPromise = pool.start()
// Wait for the pool to settle.
poolPromise.then(function () {
console.log('All promises fulfilled')
}, function (error) {
console.log('Some promise rejected: ' + error.message)
})
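The producer body is left blank above; a hedged sketch of what it could look like for the question's scenario (users and remoteServer are the names from the question):
// Hypothetical producer: hands out one promise per user, then returns null when done
var userIndex = 0
var promiseProducer = function () {
  if (userIndex >= users.length) {
    return null // tells the pool there is no more work
  }
  var user = users[userIndex]
  userIndex += 1
  return remoteServer.getCount(user) // must return a promise
}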
If you know how iterators work and how they are consumed, you wouldn't need any extra library, since it becomes very easy to build your own concurrency limiter yourself. Let me demonstrate:
/* [Symbol.iterator]() is equivalent to .values()
const iterator = [1,2,3][Symbol.iterator]() */
const iterator = [1,2,3].values()
// loop over all items with for..of
for (const x of iterator) {
console.log('x:', x)
// notice how this loop continues the same iterator
// and consumes the rest of the iterator, making the
// outer loop not logging any more x's
for (const y of iterator) {
console.log('y:', y)
}
}
We can use the same iterator and share it across workers.
If you had used .entries() instead of .values() you would have gotten an iterator that yields [index, value], which I will demonstrate below with a concurrency of 2
const sleep = t => new Promise(rs => setTimeout(rs, t))
const iterator = Array.from('abcdefghij').entries()
// const results = [] || Array(someLength)
async function doWork (iterator, i) {
for (let [index, item] of iterator) {
await sleep(1000)
console.log(`Worker#${i}: ${index},${item}`)
// in case you need to store the results in order
// results[index] = item + item
// or if the order does not matter
// results.push(item + item)
}
}
const workers = Array(2).fill(iterator).map(doWork)
// ^--- starts two workers sharing the same iterator
Promise.allSettled(workers).then(console.log.bind(null, 'done'))
The benefit of this is that you can have a generator function instead of having everything ready at once.
What's even more awesome is that you can do stream.Readable.from(iterator) in Node (and eventually in WHATWG streams as well). With transferable ReadableStreams, this has a lot of potential in the future if you are working with web workers for performance.
Note: the difference between this and the async-pool example is that it spawns two workers, so if one worker throws an error at, say, index 5, it won't stop the other worker from doing the rest; you just go from 2 concurrent workers down to 1 (it won't stop there). So my advice is to catch all errors inside the doWork function.
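To make the "generator instead of having everything ready at once" point concrete, here is a hedged sketch that feeds the same two doWork workers from a lazy source; the [index, item] pairs just mimic what .entries() yields:
// Lazily produce [index, item] pairs; nothing exists until a worker asks for it
function* taskSource () {
  for (let i = 0; i < 10; i++) yield [i, `item-${i}`]
}

const lazyIterator = taskSource()
// reuse doWork() from the snippet above; both workers share the lazy iterator
const lazyWorkers = Array(2).fill(lazyIterator).map(doWork)
Promise.allSettled(lazyWorkers).then(() => console.log('lazy run done'))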
Instead of using promises for limiting http requests, use node's built-in http.Agent.maxSockets. This removes the requirement of using a library or writing your own pooling code, and has the added advantage of more control over what you're limiting.
agent.maxSockets
By default set to Infinity. Determines how many concurrent sockets the agent can have open per origin. Origin is either a 'host:port' or 'host:port:localAddress' combination.
For example:
var http = require('http');
var agent = new http.Agent({maxSockets: 5}); // 5 concurrent connections per origin
var request = http.request({..., agent: agent}, ...);
If making multiple requests to the same origin, it might also benefit you to set keepAlive to true (see docs above for more info).
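A hedged sketch combining both settings, using the users array from the question; the host and path are made up for illustration:
var http = require('http');

// Reuse sockets and allow at most 5 concurrent connections per origin (example values)
var agent = new http.Agent({ keepAlive: true, maxSockets: 5 });

users.forEach(function (user) {
  http.get({ host: 'api.example.com', path: '/counts/' + user, agent: agent }, function (res) {
    res.resume(); // drain the response; real code would collect and parse the body
  });
});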
bluebird's Promise.map can take a concurrency option to control how many promises should be running in parallel. Sometimes it is easier than .all because you don't need to create the promise array.
const Promise = require('bluebird')
function getCounts() {
return Promise.map(users, user => {
return new Promise(resolve => {
remoteServer.getCount(user) // makes an HTTP request
.then(() => {
/* snip */
resolve();
});
});
}, {concurrency: 10}); // <---- at most 10 http requests at a time
}
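Since remoteServer.getCount(user) already returns a promise, the explicit new Promise wrapper (carried over from the question) isn't strictly needed; a slightly leaner sketch of the same idea:
const Promise = require('bluebird')

function getCounts() {
  // the mapper can return the promise directly; bluebird handles the rest
  return Promise.map(users, user => remoteServer.getCount(user), { concurrency: 10 })
}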
As all others in this answer thread have pointed out, Promise.all() won't do the right thing if you need to limit concurrency. But ideally you shouldn't even want to wait until all of the Promises are done before processing them.
Instead, you want to process each result ASAP as soon as it becomes available, so you don't have to wait for the very last promise to finish before you start iterating over them.
So, here's a code sample that does just that, based partly on the answer by Endless and also on this answer by T.J. Crowder.
// example tasks that sleep and return a number
// in real life, you'd probably fetch URLs or something
const tasks = [];
for (let i = 0; i < 20; i++) {
tasks.push(async () => {
console.log(`start ${i}`);
await sleep(Math.random() * 1000);
console.log(`end ${i}`);
return i;
});
}
function sleep(ms) { return new Promise(r => setTimeout(r, ms)); }
(async () => {
for await (let value of runTasks(3, tasks.values())) {
console.log(`output ${value}`);
}
})();
async function* runTasks(maxConcurrency, taskIterator) {
// Each async iterator is a worker, polling for tasks from the shared taskIterator
// Sharing the iterator ensures that each worker gets unique tasks.
const asyncIterators = new Array(maxConcurrency);
for (let i = 0; i < maxConcurrency; i++) {
asyncIterators[i] = (async function* () {
for (const task of taskIterator) yield await task();
})();
}
yield* raceAsyncIterators(asyncIterators);
}
async function* raceAsyncIterators(asyncIterators) {
async function nextResultWithItsIterator(iterator) {
return { result: await iterator.next(), iterator: iterator };
}
/** @type Map<AsyncIterator<T>,
Promise<{result: IteratorResult<T>, iterator: AsyncIterator<T>}>> */
const promises = new Map(asyncIterators.map(iterator =>
[iterator, nextResultWithItsIterator(iterator)]));
while (promises.size) {
const { result, iterator } = await Promise.race(promises.values());
if (result.done) {
promises.delete(iterator);
} else {
promises.set(iterator, nextResultWithItsIterator(iterator));
yield result.value;
}
}
}
There's a lot of magic in here; let me explain.
This solution is built around async generator functions, which many JS developers may not be familiar with.
A generator function (aka function* function) returns a "generator," an iterator of results. Generator functions are allowed to use the yield keyword where you might have normally used a return keyword. The first time a caller calls next() on the generator (or uses a for...of loop), the function* function runs until it yields a value; that becomes the next() value of the iterator. But the subsequent time next() is called, the generator function resumes from the yield statement, right where it left off, even if it's in the middle of a loop. (You can also yield*, to yield all of the results of another generator function.)
An "async generator function" (async function*) is a generator function that returns an "async iterator," which is an iterator of promises. You can call for await...of on an async iterator. Async generator functions can use the await keyword, as you might do in any async function.
In the example, we call runTasks() with an array of task functions. runTasks() is an async generator function, so we can call it with a for await...of loop. Each time the loop runs, we'll process the result of the latest completed task.
runTasks() creates N async iterators, the workers. (Note that the workers are initially defined as async generator functions, but we immediately invoke each function, and store each resulting async iterator in the asyncIterators array.) The example calls runTasks with 3 concurrent workers, so no more than 3 tasks are launched at the same time. When any task completes, we immediately queue up the next task. (This is superior to "batching", where you do 3 tasks at once, await all three of them, and don't start the next batch of three until the entire previous batch has finished.)
runTasks() concludes by "racing" its async iterators with yield* raceAsyncIterators(). raceAsyncIterators() is like Promise.race() but it races N iterators of Promises instead of just N Promises; it returns an async iterator that yields the results of resolved Promises.
raceAsyncIterators() starts by defining a promises Map from each of the iterators to promises. Each promise is a promise for an iteration result along with the iterator that generated it.
With the promises map, we can Promise.race() the values of the map, giving us the winning iteration result and its iterator. If the iterator is completely done, we remove it from the map; otherwise we replace its Promise in the promises map with the iterator's next() Promise and yield result.value.
In conclusion, runTasks() is an async generator function that yields the results of racing N concurrent async iterators of tasks, so the end user can just for await (let value of runTasks(3, tasks.values())) to process each result as soon as it becomes available.
I suggest the library async-pool: https://github.com/rxaviers/async-pool
npm install tiny-async-pool
Description:
Run multiple promise-returning & async functions with limited concurrency using native ES6/ES7
asyncPool runs multiple promise-returning & async functions in a limited concurrency pool. It rejects immediately as soon as one of the promises rejects. It resolves when all the promises complete. It calls the iterator function as soon as possible (under the concurrency limit).
Usage:
const timeout = i => new Promise(resolve => setTimeout(() => resolve(i), i));
await asyncPool(2, [1000, 5000, 3000, 2000], timeout);
// Call iterator (i = 1000)
// Call iterator (i = 5000)
// Pool limit of 2 reached, wait for the quicker one to complete...
// 1000 finishes
// Call iterator (i = 3000)
// Pool limit of 2 reached, wait for the quicker one to complete...
// 3000 finishes
// Call iterator (i = 2000)
// Iteration is complete, wait until running ones complete...
// 5000 finishes
// 2000 finishes
// Resolves, results are passed in given array order `[1000, 5000, 3000, 2000]`.
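A hedged, self-contained version of that usage; it assumes the original promise-returning API of tiny-async-pool (v1), since later major versions switched to an async-iterator API (covered in another answer further down):
const asyncPool = require("tiny-async-pool"); // v1.x API assumed

const timeout = i => new Promise(resolve => setTimeout(() => resolve(i), i));

(async () => {
  const results = await asyncPool(2, [1000, 5000, 3000, 2000], timeout);
  console.log(results); // [1000, 5000, 3000, 2000] (results keep the input order)
})();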
Here is my ES7 solution: a copy-paste-friendly and feature-complete Promise.all()/map() alternative with a concurrency limit.
Similar to Promise.all(), it maintains return order as well as a fallback for non-promise return values.
I also included a comparison of the different implementations, as it illustrates some aspects a few of the other solutions have missed.
Usage
const asyncFn = delay => new Promise(resolve => setTimeout(() => resolve(), delay));
const args = [30, 20, 15, 10];
await asyncPool(args, arg => asyncFn(arg), 4); // concurrency limit of 4
Implementation
async function asyncBatch(args, fn, limit = 8) {
// Copy arguments to avoid side effects
args = [...args];
const outs = [];
while (args.length) {
const batch = args.splice(0, limit);
const out = await Promise.all(batch.map(fn));
outs.push(...out);
}
return outs;
}
async function asyncPool(args, fn, limit = 8) {
return new Promise((resolve) => {
// Copy arguments to avoid side effect, reverse queue as
// pop is faster than shift
const argQueue = [...args].reverse();
let count = 0;
const outs = [];
const pollNext = () => {
if (argQueue.length === 0 && count === 0) {
resolve(outs);
} else {
while (count < limit && argQueue.length) {
const index = args.length - argQueue.length;
const arg = argQueue.pop();
count += 1;
const out = fn(arg);
const processOut = (out, index) => {
outs[index] = out;
count -= 1;
pollNext();
};
if (typeof out === 'object' && out.then) {
out.then(out => processOut(out, index));
} else {
processOut(out, index);
}
}
}
};
pollNext();
});
}
Comparison
// A simple async function that returns after the given delay
// and prints its value to allow us to determine the response order
const asyncFn = delay => new Promise(resolve => setTimeout(() => {
console.log(delay);
resolve(delay);
}, delay));
// List of arguments to the asyncFn function
const args = [30, 20, 15, 10];
// As a comparison of the different implementations, a low concurrency
// limit of 2 is used in order to highlight the performance differences.
// If a limit greater than or equal to args.length is used the results
// would be identical.
// Vanilla Promise.all/map combo
const out1 = await Promise.all(args.map(arg => asyncFn(arg)));
// prints: 10, 15, 20, 30
// total time: 30ms
// Pooled implementation
const out2 = await asyncPool(args, arg => asyncFn(arg), 2);
// prints: 20, 30, 15, 10
// total time: 40ms
// Batched implementation
const out3 = await asyncBatch(args, arg => asyncFn(arg), 2);
// prints: 20, 30, 10, 15
// total time: 45ms
console.log(out1, out2, out3); // prints: [30, 20, 15, 10] x 3
// Conclusion: Execution order and performance is different,
// but return order is still identical
Conclusion
asyncPool() should be the best solution as it allows new requests to start as soon as any previous one finishes.
asyncBatch() is included as a comparison since its implementation is simpler to understand, but it should be slower in performance, as all requests in the same batch are required to finish before the next batch can start.
In this contrived example, the non-limited vanilla Promise.all() is of course the fastest, while the others could perform more desirably in a real-world congestion scenario.
Update
The async-pool library that others have already suggested is probably a better alternative to my implementation as it works almost identically and has a more concise implementation with a clever usage of Promise.race(): https://github.com/rxaviers/async-pool/blob/master/lib/es7.js
Hopefully my answer can still serve an educational value.
A semaphore is a well-known concurrency primitive that was designed to solve exactly this kind of problem. It's a very universal construct; implementations of Semaphore exist in many languages. This is how one would use a Semaphore to solve this issue:
async function main() {
const s = new Semaphore(100);
const res = await Promise.all(
users.map((user) =>
s.runExclusive(() => remoteServer.getCount(user))
)
);
return res;
}
I'm using implementation of Semaphore from async-mutex, it has decent documentation and TypeScript support.
If you want to dig deep into topics like this you can take a look at the book "The Little Book of Semaphores" which is freely available as PDF here
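To make the primitive less opaque, here is a minimal counting-semaphore sketch; it is an illustration of the concept only, not the async-mutex implementation:
// Minimal counting semaphore: at most `max` runExclusive callbacks run at once
class SimpleSemaphore {
  constructor(max) {
    this.max = max;
    this.active = 0;
    this.waiting = []; // resolvers of callers waiting for a slot
  }
  async acquire() {
    if (this.active < this.max) {
      this.active += 1;
      return;
    }
    // wait until a finishing task hands its slot over
    await new Promise(resolve => this.waiting.push(resolve));
  }
  release() {
    const next = this.waiting.shift();
    if (next) next();        // pass the slot straight to the next waiter
    else this.active -= 1;   // otherwise free the slot
  }
  async runExclusive(fn) {
    await this.acquire();
    try { return await fn(); }
    finally { this.release(); }
  }
}
// usage, mirroring the snippet above:
// const s = new SimpleSemaphore(100);
// await s.runExclusive(() => remoteServer.getCount(user));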
Unfortunately there is no way to do it with native Promise.all, so you have to be creative.
This is the quickest, most concise way I could find without using any outside libraries.
It makes use of a newer JavaScript feature called an iterator. The iterator basically keeps track of what items have been processed and what haven't.
In order to use it in code, you create an array of async functions. Each async function asks the same iterator for the next item that needs to be processed. Each function processes its own item asynchronously, and when done asks the iterator for a new one. Once the iterator runs out of items, all the functions complete.
Thanks to @Endless for inspiration.
const items = [
'https://httpbin.org/bytes/2',
'https://httpbin.org/bytes/2',
'https://httpbin.org/bytes/2',
'https://httpbin.org/bytes/2',
'https://httpbin.org/bytes/2',
'https://httpbin.org/bytes/2',
'https://httpbin.org/bytes/2',
'https://httpbin.org/bytes/2'
]
// get a cursor that keeps track of what items have already been processed.
let cursor = items.entries();
// create 5 for loops that each run off the same cursor which keeps track of location
Array(5).fill().forEach(async () => {
for (let [index, url] of cursor){
console.log('getting url is ', index, url)
// run your async task instead of this next line
var text = await fetch(url).then(res => res.text())
console.log('text is', text.slice(0, 20))
}
})
Here is a basic example for streaming and 'p-limit'. It streams an HTTP read stream to MongoDB.
const stream = require('stream');
const util = require('util');
const pLimit = require('p-limit');
const es = require('event-stream');
const JSONStream = require('JSONStream'); // used in the pipeline below
const streamToMongoDB = require('stream-to-mongo-db').streamToMongoDB;
const pipeline = util.promisify(stream.pipeline)
const outputDBConfig = {
dbURL: 'yr-db-url',
collection: 'some-collection'
};
const limit = pLimit(3);
const yrAsyncStreamingFunction = async (readStream) => {
const mongoWriteStream = streamToMongoDB(outputDBConfig);
const mapperStream = es.map((data, done) => {
let someDataPromise = limit(() => yr_async_call_to_somewhere())
someDataPromise.then(
function handleResolve(someData) {
data.someData = someData;
done(null, data);
},
function handleError(error) {
done(error)
}
);
})
await pipeline(
readStream,
JSONStream.parse('*'),
mapperStream,
mongoWriteStream
);
}
So many good solutions. I started out with the elegant solution posted by @Endless and ended up with this little extension method that does not use any external libraries nor does it run in batches (although it assumes you have features like async/await, etc.):
Promise.allWithLimit = async (taskList, limit = 5) => {
const iterator = taskList.entries();
let results = new Array(taskList.length);
let workerThreads = new Array(limit).fill(0).map(() =>
new Promise(async (resolve, reject) => {
try {
let entry = iterator.next();
while (!entry.done) {
let [index, promise] = entry.value;
                try {
                    results[index] = await promise;
                }
                catch (err) {
                    results[index] = err;
                }
                // advance whether the promise resolved or rejected,
                // otherwise a rejection would retry the same entry forever
                entry = iterator.next();
}
// No more work to do
resolve(true);
}
catch (err) {
// This worker is dead
reject(err);
}
}));
await Promise.all(workerThreads);
return results;
};
const demoTasks = new Array(10).fill(0).map((v,i) => new Promise(resolve => {
let n = (i + 1) * 5;
setTimeout(() => {
console.log(`Did nothing for ${n} seconds`);
resolve(n);
}, n * 1000);
}));
var results = Promise.allWithLimit(demoTasks);
@tcooc's answer was quite cool. I didn't know about it and will leverage it in the future.
I also enjoyed @MatthewRideout's answer, but it uses an external library!!
Whenever possible, I give developing this kind of thing on my own a shot, rather than going for a library. You end up learning a lot of concepts which seemed daunting before.
class Pool{
constructor(maxAsync) {
this.maxAsync = maxAsync;
this.asyncOperationsQueue = [];
this.currentAsyncOperations = 0
}
runAnother() {
if (this.asyncOperationsQueue.length > 0 && this.currentAsyncOperations < this.maxAsync) {
this.currentAsyncOperations += 1;
this.asyncOperationsQueue.pop()()
.then(() => { this.currentAsyncOperations -= 1; this.runAnother() }, () => { this.currentAsyncOperations -= 1; this.runAnother() })
}
}
    add(f){ // the argument f is a function of signature () => Promise
        return new Promise((resolve, reject) => {
            this.asyncOperationsQueue.push(
                () => f().then(resolve).catch(reject)
            )
            // enqueue first, then try to run, so a single add() also starts work
            this.runAnother();
        })
    }
}
//#######################################################
// TESTS
//#######################################################
function dbCall(id, timeout, fail) {
return new Promise((resolve, reject) => {
setTimeout(() => {
if (fail) {
reject(`Error for id ${id}`);
} else {
resolve(id);
}
}, timeout)
}
)
}
const dbQuery1 = () => dbCall(1, 5000, false);
const dbQuery2 = () => dbCall(2, 5000, false);
const dbQuery3 = () => dbCall(3, 5000, false);
const dbQuery4 = () => dbCall(4, 5000, true);
const dbQuery5 = () => dbCall(5, 5000, false);
const cappedPool = new Pool(2);
const dbQuery1Res = cappedPool.add(dbQuery1).catch(i => i).then(i => console.log(`Resolved: ${i}`))
const dbQuery2Res = cappedPool.add(dbQuery2).catch(i => i).then(i => console.log(`Resolved: ${i}`))
const dbQuery3Res = cappedPool.add(dbQuery3).catch(i => i).then(i => console.log(`Resolved: ${i}`))
const dbQuery4Res = cappedPool.add(dbQuery4).catch(i => i).then(i => console.log(`Resolved: ${i}`))
const dbQuery5Res = cappedPool.add(dbQuery5).catch(i => i).then(i => console.log(`Resolved: ${i}`))
This approach provides a nice API, similar to thread pools in Scala/Java.
After creating one instance of the pool with const cappedPool = new Pool(2), you provide promises to it simply by calling cappedPool.add(() => myPromise).
Obviously we must ensure that the promise does not start immediately, which is why we must "provide it lazily" with the help of a function.
Most importantly, notice that the result of the method add is a Promise which will be completed/resolved with the value of your original promise! This makes for a very intuitive use.
const resultPromise = cappedPool.add( () => dbCall(...))
resultPromise
.then( actualResult => {
// Do something with the result form the DB
}
)
This solution uses an async generator to manage concurrent promises with vanilla javascript. The throttle generator takes 3 arguments:
An array of values to be supplied as arguments to a promise-generating function. (e.g. An array of URLs.)
A function that returns a promise. (e.g. Returns a promise for an HTTP request.)
An integer that represents the maximum concurrent promises allowed.
Promises are only instantiated as required in order to reduce memory consumption. Results can be iterated over using a for await...of statement.
The example below provides a function to check promise state, the throttle async generator, and a simple function that return a promise based on setTimeout. The async IIFE at the end defines the reservoir of timeout values, sets the async iterable returned by throttle, then iterates over the results as they resolve.
If you would like a more complete example for HTTP requests, let me know in the comments.
Please note that Node.js 16+ is required in order to run this example.
const promiseState = function( promise ) {
const control = Symbol();
return Promise
.race([ promise, control ])
.then( value => ( value === control ) ? 'pending' : 'fulfilled' )
.catch( () => 'rejected' );
}
const throttle = async function* ( reservoir, promiseClass, highWaterMark ) {
let iterable = reservoir.splice( 0, highWaterMark ).map( item => promiseClass( item ) );
while ( iterable.length > 0 ) {
await Promise.any( iterable );
const pending = [];
const resolved = [];
for ( const currentValue of iterable ) {
if ( await promiseState( currentValue ) === 'pending' ) {
pending.push( currentValue );
} else {
resolved.push( currentValue );
}
}
console.log({ pending, resolved, reservoir });
iterable = [
...pending,
...reservoir.splice( 0, highWaterMark - pending.length ).map( value => promiseClass( value ) )
];
yield Promise.allSettled( resolved );
}
}
const getTimeout = delay => new Promise( ( resolve, reject ) => {
setTimeout(resolve, delay, delay);
} );
( async () => {
const test = [ 1100, 1200, 1300, 10000, 11000, 9000, 5000, 6000, 3000, 4000, 1000, 2000, 3500 ];
const throttledRequests = throttle( test, getTimeout, 4 );
for await ( const timeout of throttledRequests ) {
console.log( timeout );
}
} )();
The concurrent function below will return a Promise which resolves to an array of resolved promise values, while implementing a concurrency limit. No 3rd party library.
// waits 50 ms then resolves to the passed-in arg
const sleepAndResolve = s => new Promise(rs => setTimeout(()=>rs(s), 50))
// queue 100 promises
const funcs = []
for(let i=0; i<100; i++) funcs.push(()=>sleepAndResolve(i))
//run the promises with a max concurrency of 10
concurrent(10,funcs)
.then(console.log) // prints [0,1,2...,99]
.catch(()=>console.log("there was an error"))
/**
* Run concurrent promises with a maximum concurrency level
 * @param concurrency The number of concurrently running promises
 * @param funcs An array of functions that return promises
 * @returns a promise that resolves to an array of the resolved values from the promises returned by funcs
*/
function concurrent(concurrency, funcs) {
return new Promise((resolve, reject) => {
let index = -1;
const p = [];
for (let i = 0; i < Math.max(1, Math.min(concurrency, funcs.length)); i++)
runPromise();
function runPromise() {
if (++index < funcs.length)
(p[p.length] = funcs[index]()).then(runPromise).catch(reject);
else if (index === funcs.length)
Promise.all(p).then(resolve).catch(reject);
}
});
}
Here's the Typescript version if you are interested
/**
* Run concurrent promises with a maximum concurrency level
 * @param concurrency The number of concurrently running promises
 * @param funcs An array of functions that return promises
 * @returns a promise that resolves to an array of the resolved values from the promises returned by funcs
*/
function concurrent<V>(concurrency:number, funcs:(()=>Promise<V>)[]):Promise<V[]> {
return new Promise((resolve,reject)=>{
let index = -1;
const p:Promise<V>[] = []
for(let i=0; i<Math.max(1,Math.min(concurrency, funcs.length)); i++) runPromise()
function runPromise() {
if (++index < funcs.length) (p[p.length] = funcs[index]()).then(runPromise).catch(reject)
else if (index === funcs.length) Promise.all(p).then(resolve).catch(reject)
}
})
}
No external libraries. Just plain JS.
It can be resolved using recursion.
The idea is that initially we immediately execute the maximum allowed number of queries and each of these queries should recursively initiate a new query on its completion.
In this example I populate successful responses together with errors and I execute all queries, but it's possible to slightly modify the algorithm if you want to terminate batch execution on the first failure.
async function batchQuery(queries, limit) {
limit = Math.min(queries.length, limit);
return new Promise((resolve, reject) => {
const responsesOrErrors = new Array(queries.length);
let startedCount = 0;
let finishedCount = 0;
let hasErrors = false;
function recursiveQuery() {
let index = startedCount++;
doQuery(queries[index])
.then(res => {
responsesOrErrors[index] = res;
})
.catch(error => {
responsesOrErrors[index] = error;
hasErrors = true;
})
.finally(() => {
finishedCount++;
if (finishedCount === queries.length) {
hasErrors ? reject(responsesOrErrors) : resolve(responsesOrErrors);
} else if (startedCount < queries.length) {
recursiveQuery();
}
});
}
for (let i = 0; i < limit; i++) {
recursiveQuery();
}
});
}
async function doQuery(query) {
console.log(`${query} started`);
const delay = Math.floor(Math.random() * 1500);
return new Promise((resolve, reject) => {
setTimeout(() => {
if (delay <= 1000) {
console.log(`${query} finished successfully`);
resolve(`${query} success`);
} else {
console.log(`${query} finished with error`);
reject(`${query} error`);
}
}, delay);
});
}
const queries = new Array(10).fill('query').map((query, index) => `${query}_${index + 1}`);
batchQuery(queries, 3)
.then(responses => console.log('All successful', responses))
.catch(responsesWithErrors => console.log('All with several failed', responsesWithErrors));
So I tried to make some of the examples shown here work for my code, but since this was only for an import script and not production code, using the npm package batch-promises was surely the easiest path for me.
batch-promises
Easily batch promises
NOTE: Requires runtime to support Promise or to be polyfilled.
Api
batchPromises(int: batchSize, array: Collection, i => Promise: Iteratee)
The Promise: Iteratee will be called after each batch.
Use:
import batchPromises from 'batch-promises';
batchPromises(2, [1,2,3,4,5], i => new Promise((resolve, reject) => {
// The iteratee will fire after each batch resulting in the following behaviour:
// # 100ms resolve items 1 and 2 (first batch of 2)
// # 200ms resolve items 3 and 4 (second batch of 2)
// # 300ms resolve remaining item 5 (last remaining batch)
setTimeout(() => {
resolve(i);
}, 100);
}))
.then(results => {
console.log(results); // [1,2,3,4,5]
});
Recursion is the answer if you don't want to use external libraries
downloadAll(someArrayWithData){
var self = this;
var tracker = function(next){
return self.someExpensiveRequest(someArrayWithData[next])
.then(function(){
next++;//This updates the next in the tracker function parameter
if(next < someArrayWithData.length){//Did I finish processing all my data?
return tracker(next);//Go to the next promise
}
});
}
return tracker(0);
}
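Note that the version above processes the items strictly one at a time. A hedged extension of the same recursive idea starts a few trackers off a shared index, which gives you an actual concurrency limit:
downloadAllWithLimit(someArrayWithData, limit){
  var self = this;
  var next = 0; // shared cursor across all trackers
  var tracker = function(){
    if(next >= someArrayWithData.length){ return Promise.resolve(); } // nothing left
    var item = someArrayWithData[next++];
    return self.someExpensiveRequest(item)
      .then(tracker); // when this item is done, pick up the next one
  };
  // start `limit` trackers; each recursively pulls new work as it finishes
  return Promise.all(Array.from({ length: limit }, tracker));
}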
Expanding on the answer posted by @deceleratedcaviar, I created a 'batch' utility function that takes as arguments: an array of values, a concurrency limit and a processing function. Yes, I realize that using Promise.all this way is more akin to batch processing than true concurrency, but if the goal is to limit the number of HTTP calls at one time, I go with this approach due to its simplicity and no need for an external library.
async function batch(o) {
let arr = o.arr
let resp = []
while (arr.length) {
let subset = arr.splice(0, o.limit)
let results = await Promise.all(subset.map(o.process))
resp.push(results)
}
return [].concat.apply([], resp)
}
let arr = []
for (let i = 0; i < 250; i++) { arr.push(i) }
async function calc(val) { return val * 100 }
(async () => {
let resp = await batch({
arr: arr,
limit: 100,
process: calc
})
console.log(resp)
})();
One more solution with a custom promise library (CPromise):
using generators Live codesandbox demo
import { CPromise } from "c-promise2";
import cpFetch from "cp-fetch";
const promise = CPromise.all(
function* () {
const urls = [
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=1",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=2",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=3",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=4",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=5",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=6",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=7"
];
for (const url of urls) {
yield cpFetch(url); // add a promise to the pool
console.log(`Request [${url}] completed`);
}
},
{ concurrency: 2 }
).then(
(v) => console.log(`Done: `, v),
(e) => console.warn(`Failed: ${e}`)
);
// yeah, we are able to cancel the task and abort pending network requests
// setTimeout(() => promise.cancel(), 4500);
using mapper Live codesandbox demo
import { CPromise } from "c-promise2";
import cpFetch from "cp-fetch";
const promise = CPromise.all(
[
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=1",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=2",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=3",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=4",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=5",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=6",
"https://run.mocky.io/v3/7b038025-fc5f-4564-90eb-4373f0721822?mocky-delay=2s&x=7"
],
{
mapper: (url) => {
console.log(`Request [${url}]`);
return cpFetch(url);
},
concurrency: 2
}
).then(
(v) => console.log(`Done: `, v),
(e) => console.warn(`Failed: ${e}`)
);
// yeah, we are able to cancel the task and abort pending network requests
//setTimeout(() => promise.cancel(), 4500);
Warning: this has not been benchmarked for efficiency and does a lot of array copying/creation.
If you want a more functional approach you could do something like:
import chunk from 'lodash.chunk';
const maxConcurrency = (max) => (dataArr, promiseFn) =>
chunk(dataArr, max).reduce(
async (agg, batch) => [
...(await agg),
...(await Promise.all(batch.map(promiseFn)))
],
[]
);
and then you could use it like:
const randomFn = (data) =>
new Promise((res) => setTimeout(
() => res(data + 1),
Math.random() * 1000
));
const result = await maxConcurrency(5)(
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
randomFn
);
console.log('result+++', result);
I had been using the bottleneck library, which I actually really liked, but in my case it wasn't releasing memory and kept tanking long-running jobs... which isn't great for running the massive jobs that you likely want a throttling/concurrency library for in the first place.
I needed a simple, low-overhead, easy to maintain solution. I also wanted something that kept the pool topped up, rather than simply batching predefined chunks... In the case of a downloader, this will stop that nGB file from holding up your queue for minutes/hours at a time, even though the rest of the batch finished ages ago.
This is the Node.js v16+, no-dependency, async generator solution I've been using instead:
const promiseState = function( promise ) {
// A promise could never resolve to a unique symbol unless it was in this scope
const control = Symbol();
// This helps us determine the state of the promise... A little heavy, but it beats a third-party promise library. The control is the second element passed to Promise.race() since it will only resolve first if the promise being tested is pending.
return Promise
.race([ promise, control ])
.then( value => ( value === control ) ? 'pending' : 'fulfilled' )
.catch( () => 'rejected' );
}
const throttle = async function* ( reservoir, promiseFunction, highWaterMark ) {
let iterable = reservoir.splice( 0, highWaterMark ).map( item => promiseFunction( item ) );
while ( iterable.length > 0 ) {
// When a promise has resolved we have space to top it up to the high water mark...
await Promise.any( iterable );
const pending = [];
const resolved = [];
// This identifies the promise(s) that have resolved so that we can yield them
for ( const currentValue of iterable ) {
if ( await promiseState( currentValue ) === 'pending' ) {
pending.push( currentValue );
} else {
resolved.push( currentValue );
}
}
// Put the remaining promises back into iterable, and top it to the high water mark
iterable = [
...pending,
...reservoir.splice( 0, highWaterMark - pending.length ).map( value => promiseFunction( value ) )
];
yield Promise.allSettled( resolved );
}
}
// This is just an example of what would get passed as "promiseFunction"... This can be the function that returns your HTTP request promises
const getTimeout = delay => new Promise( (resolve, reject) => setTimeout(resolve, delay, delay) );
// This is just the async IIFE that bootstraps this example
( async () => {
const test = [ 1000, 2000, 3000, 4000, 5000, 6000, 1500, 2500, 3500, 4500, 5500, 6500 ];
for await ( const timeout of throttle( test, getTimeout, 4 ) ) {
console.log( timeout );
}
} )();
I have a solution that creates chunks and uses the .reduce function to wait for each chunk's Promise.all to finish. I also add some delay in case the promises have rate limits on their calls.
export function delay(ms: number) {
return new Promise<void>((resolve) => setTimeout(resolve, ms));
}
export const chunk = <T>(arr: T[], size: number): T[][] => [
...Array(Math.ceil(arr.length / size)),
].map((_, i) => arr.slice(size * i, size + size * i));
const myIdList = []; // all items
const groupedIdList = chunk(myIdList, 20); // grouped by 20 items
await groupedIdList.reduce(async (prev, subIdList) => {
await prev;
// Make sure we wait for 500 ms after processing every page to prevent overloading the calls.
const data = await Promise.all(subIdList.map(myPromise));
await delay(500);
}, Promise.resolve());
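One caveat: the reduce above throttles the calls but discards the data it fetches. A hedged variant of the same pattern that also accumulates the results:
const allData = await groupedIdList.reduce(async (prev, subIdList) => {
  const acc = await prev; // results gathered so far
  const data = await Promise.all(subIdList.map(myPromise));
  await delay(500); // keep the pause between pages
  return [...acc, ...data];
}, Promise.resolve([]));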
Using tiny-async-pool ES9 for await...of API, you can do the following:
const asyncPool = require("tiny-async-pool");
const getCount = async (user) => [user, await remoteServer.getCount(user)];
const concurrency = 2;
for await (const [user, count] of asyncPool(concurrency, users, getCount)) {
console.log(user, count);
}
The above asyncPool function returns an async iterator that yields as soon as a promise completes (under concurrency limit) and it rejects immediately as soon as one of the promises rejects.
It is possible to limit requests to a server by using https://www.npmjs.com/package/job-pipe
Basically you create a pipe and tell it how many concurrent requests you want:
const pipe = createPipe({ throughput: 6, maxQueueSize: Infinity })
Then you take your function which performs the call and force it through the pipe to create a limited amount of simultaneous calls:
const makeCall = async () => {...}
const limitedMakeCall = pipe(makeCall)
Finally, you call this method as many times as you need as if it was unchanged and it will limit itself on how many parallel executions it can handle:
await limitedMakeCall()
await limitedMakeCall()
await limitedMakeCall()
await limitedMakeCall()
await limitedMakeCall()
....
await limitedMakeCall()
Profit.
I suggest not downloading packages and not writing hundreds of lines of code:
async function async_arr<T1, T2>(
arr: T1[],
func: (x: T1) => Promise<T2> | T2, //can be sync or async
limit = 5
) {
let results: T2[] = [];
let workers = [];
let current = Math.min(arr.length, limit);
async function process(i) {
if (i < arr.length) {
results[i] = await Promise.resolve(func(arr[i]));
await process(current++);
}
}
for (let i = 0; i < current; i++) {
workers.push(process(i));
}
await Promise.all(workers);
return results;
}
Here's my recipe, based on killdash9's answer.
It allows you to choose the behaviour on exceptions (Promise.all vs Promise.allSettled).
// Given an array of async functions, runs them in parallel,
// with at most maxConcurrency simultaneous executions
// Except for that, behaves the same as Promise.all,
// unless allSettled is true, where it behaves as Promise.allSettled
function concurrentRun(maxConcurrency = 10, funcs = [], allSettled = false) {
if (funcs.length <= maxConcurrency) {
const ps = funcs.map(f => f());
return allSettled ? Promise.allSettled(ps) : Promise.all(ps);
}
return new Promise((resolve, reject) => {
let idx = -1;
const ps = new Array(funcs.length);
function nextPromise() {
idx += 1;
if (idx < funcs.length) {
(ps[idx] = funcs[idx]()).then(nextPromise).catch(allSettled ? nextPromise : reject);
} else if (idx === funcs.length) {
(allSettled ? Promise.allSettled(ps) : Promise.all(ps)).then(resolve).catch(reject);
}
}
for (let i = 0; i < maxConcurrency; i += 1) nextPromise();
});
}
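A short usage sketch for the function above (the task list is made up for illustration):
const sleep = ms => new Promise(r => setTimeout(r, ms));
// Hypothetical task list: 20 functions that each resolve to their index after 100 ms
const tasks = Array.from({ length: 20 }, (_, i) => async () => { await sleep(100); return i; });

concurrentRun(5, tasks)                 // at most 5 tasks in flight at once
  .then(values => console.log(values))  // [0, 1, ..., 19], in input order
  .catch(err => console.error(err));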
I know there are a lot of answers already, but I ended up using a very simple solution that requires no library or sleep and uses only a few commands. Promise.all() simply lets you know when all the promises passed to it are finalized. So, you can check on the queue intermittently to see if it is ready for more work, and if so, add more processes.
For example:
// init vars
const batchSize = 5
const calls = []
// loop through data and run processes
for (let [index, data] of [1,2,3].entries()) {
// pile on async processes
calls.push(doSomethingAsyncWithData(data))
// every 5th concurrent call, wait for them to finish before adding more
if (index % batchSize === 0) await Promise.all(calls)
}
// clean up for any data to process left over if smaller than batch size
const allFinishedProcs = await Promise.all(calls)
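One caveat with the snippet above: calls keeps growing, so every intermediate Promise.all re-awaits everything started so far. A hedged tweak of the same idea that empties the batch between waits (dataList stands in for your real input):
const batchSize = 5
let batch = []
for (const data of dataList) {          // dataList is assumed, as in the original
  batch.push(doSomethingAsyncWithData(data))
  if (batch.length === batchSize) {
    await Promise.all(batch)            // wait for the full batch to finish
    batch = []                          // then start collecting a fresh one
  }
}
await Promise.all(batch)                // flush whatever is left over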
A good solution for controlling the maximum number of promises/requests is to split your list of requests into pages, and produce only requests for one page at a time.
The example below makes use of iter-ops library:
import {pipeAsync, map, page, wait} from 'iter-ops';
const i = pipeAsync(
users, // make it asynchronous
page(10), // split into pages of 10 items in each
map(p => Promise.all(p.map(u => remoteServer.getCount(u)))), // map into requests
wait() // resolve each page in the pipeline
);
// below triggers processing page-by-page:
for await(const p of i) {
//=> p = resolved page of data
}
This way it won't try to create more requests/promises than the size of one page.
