I am trying to build a wrapper over Notion JS SDK's iteratePaginatedAPI that handles errors as well. I feel particularly lost on how do I catch API errors in such a way that I can actually retry them (aka retry the iteration that failed). Here's my attempt:
async function* queryNotion(listFn, firstPageArgs) {
try {
for await (const result of iteratePaginatedAPI(listFn, firstPageArgs)) {
yield* result
}
} catch (error) {
if (error.code === APIErrorCode.RateLimited) {
console.log('rate_limited');
console.log(error);
sleep(1);
// How would I retry the last iteration?
}
}
}
Coming from the Ruby world, there is a retry in a rescue block. Any help would be appreciated!
Very interesting problem. The issue is that the exception comes from the for await itself, not from its body, so you cannot catch it there. When the exception hits, loops are over.
Note that the iterator might be done after a rejection/exception, in which case there is nothing you can do except starting a new one.
That said, you can always call Iterator.next() yourself and process the result manually. The next() call of an async iterator will return an object like {value: Promise<any>, done: boolean}, and when running it in a loop, you can await the promise in a try..catch and only exit the loop when done becomes true:
async function* queryNotion(listFn, firstPageArgs) {
const asyncGenerator = mockIteratePaginatedAPI(listFn, firstPageArgs)
while (true) {
const current = asyncGenerator.next()
if (current.done) {
break
}
try {
yield* await current.value
} catch (e) {
console.log(`got exception: "${e}" - trying again`)
continue
}
}
}
function* mockIteratePaginatedAPI(a, b) {
for (let i = 0; i < 8; i++) {
yield new Promise((resolve, reject) => setTimeout(() => [3, 5].includes(i) ? reject(`le error at ${i}`) : resolve([i]), 500))
}
}
(async function() {
for await (const n of queryNotion('foo', 'bar')) {
console.log(n)
}
})()
If we keep a reference to the generator, we can also put it back into a for async. This might be easier to read, however a for await ...of will call the iterator's return() when exiting the loop early, likely finishing it, in which case, this will not work:
async function* queryNotion(listFn, firstPageArgs) {
const asyncGenerator = mockIteratePaginatedAPI(listFn, firstPageArgs)
while (true) {
try {
for await (const result of asyncGenerator) {
yield* result
}
break
} catch (e) {
console.log('got exception:', e, 'trying again')
}
}
}
function* mockIteratePaginatedAPI(a, b) {
for (let i = 0; i < 8; i++) {
yield new Promise((resolve, reject) => setTimeout(() => [3, 5].includes(i) ? reject(`le error at ${i}`) : resolve([i]), 500))
}
}
(async function () {
for await (const n of queryNotion('foo', 'bar')) {
console.log(n)
}
})()
Simply add a continue statement inside your if
async function* queryNotion(listFn, firstPageArgs) {
try {
for await (const result of iteratePaginatedAPI(listFn, firstPageArgs)) {
yield* result
}
} catch (error) {
if (error.code === APIErrorCode.RateLimited) {
console.log('rate_limited');
console.log(error);
await sleep(1);
continue; // retry the last iteration
}
}
}
Related
How to convert a dynamic Set<Promise<T>> into AsyncIterable<T> (unordered)?
The resulting iterable must produce values as they get resolved, and it must end just as the source runs empty.
I have a dynamic cache of promises to be resolved, and values reported, disregarding the order.
NOTE: The source is dynamic, which means it can receive new Promise<T> elements while we progress through the resulting iterator.
UPDATE
After going through all the suggestions, I was able to implement my operator. And here're the official docs.
I'm adding a bounty to reward anyone who can improve it further, though at this point a PR is preferable (it is for a public library), or at least something that fits the same protocol.
Judging from your library implementation, you actually want to transform an AsyncIterable<Promise<T>> into an AsyncIterator<T> by racing up to N of the produced promises concurrently. I would implement that as follows:
async function* limitConcurrent<T>(iterable: AsyncIterable<Promise<T>>, n: number): AsyncIterator<T> {
const pool = new Set();
for await (const p of iterable) {
const promise = Promise.resolve(p).finally(() => {
pool.delete(promise); // FIXME see below
});
promise.catch(() => { /* ignore */ }); // mark rejections as handled
pool.add(promise);
if (pool.size >= n) {
yield /* await */ Promise.race(pool);
}
}
while (pool.size) {
yield /* await */ Promise.race(pool);
}
}
Notice that if one of the promises in the pool rejects, the returned iterator will end with the error and the results of the other promises that are currently in the pool will be ignored.
However, above implementation presumes that the iterable is relatively fast, as it will need to produce n promises before the pool is raced for the first time. If it yields the promises slower than the promises take to resolve, the results are held up unnecessarily.
And worse, the above implementation may loose values. If the returned iterator is not consumed fast enough, or the iterable is not yielding fast enough, multiple promise handlers may delete their respective promise from the pool during one iteration of the loop, and the Promise.race will consider only one of them.
So this would work for a synchronous iterable, but if you actually have an asynchronous iterable, you would need a different solution. Essentially you got a consumer and a producer that are more or less independent, and what you need is some queue between them.
Yet with a single queue it still wouldn't handle backpressure, the producer just runs as fast as it can (given the iteration of promises and the concurrency limit) while filling the queue. What you really need then is a channel that allows synchronisation in both directions, e.g. using two queues:
class AsyncQueue<T> {
resolvers: null | ((res: IteratorResult<T> | Promise<never>) => void)[];
promises: Promise<IteratorResult<T>>[];
constructor() {
// invariant: at least one of the arrays is empty.
// when `resolvers` is `null`, the queue has ended.
this.resolvers = [];
this.promises = [];
}
putNext(result: IteratorResult<T> | Promise<never>): void {
if (!this.resolvers) throw new Error('Queue already ended');
if (this.resolvers.length) this.resolvers.shift()(result);
else this.promises.push(Promise.resolve(result));
}
put(value: T): void {
this.putNext({done: false, value});
}
end(): void {
for (const res of this.resolvers) res({done: true, value: undefined});
this.resolvers = null;
}
next(): Promise<IteratorResult<T>> {
if (this.promises.length) return this.promises.shift();
else if (this.resolvers) return new Promise(resolve => { this.resolvers.push(resolve); });
else return Promise.resolve({done: true, value: undefined});
}
[Symbol.asyncIterator](): AsyncIterator<T> {
// Todo: Use AsyncIterator.from()
return this;
}
}
function limitConcurrent<T>(iterable: AsyncIterable<Promise<T>>, n: number): AsyncIterator<T> {
const produced = new AsyncQueue<T>();
const consumed = new AsyncQueue<void>();
(async () => {
try {
let count = 0;
for await (const p of iterable) {
const promise = Promise.resolve(p);
promise.then(value => {
produced.put(value);
}, _err => {
produced.putNext(promise); // with rejection already marked as handled
});
if (++count >= n) {
await consumed.next(); // happens after any produced.put[Next]()
count--;
}
}
while (count) {
await consumed.next(); // happens after any produced.put[Next]()
count--;
}
} catch(e) {
// ignore `iterable` errors?
} finally {
produced.end();
}
})();
return (async function*() {
for await (const value of produced) {
yield value;
consumed.put();
}
}());
}
function createCache() {
const resolve = [];
const sortedPromises = [];
const noop = () => void 0;
return {
get length() {
return sortedPromises.length
},
add(promiseOrValue) {
const q = new Promise(r => {
resolve.push(r);
const _ = () => {
resolve.shift()(promiseOrValue);
}
Promise.resolve(promiseOrValue).then(_, _);
});
q.catch(noop); // prevent q from throwing when rejected.
sortedPromises.push(q);
},
next() {
return sortedPromises.length ?
{ value: sortedPromises.shift() } :
{ done: true };
},
[Symbol.iterator]() {
return this;
}
}
}
(async() => {
const sleep = (ms, value) => new Promise(resolve => setTimeout(resolve, ms, value));
const cache = createCache();
const start = Date.now();
function addItem() {
const t = Math.floor(Math.random() ** 2 * 8000), // when to resolve
val = t + Date.now() - start; // ensure that the resolved value is in ASC order.
console.log("add", val);
cache.add(sleep(t, val));
}
// add a few initial items
Array(5).fill().forEach(addItem);
// check error handling with a rejecting promise.
cache.add(sleep(1500).then(() => Promise.reject("a rejected Promise")));
while (cache.length) {
try {
for await (let v of cache) {
console.log("yield", v);
if (v < 15000 && Math.random() < .5) {
addItem();
}
// slow down iteration, like if you'd await some API-call.
// promises now resolve faster than we pull them.
await sleep(1000);
}
} catch (err) {
console.log("error:", err);
}
}
console.log("done");
})()
.as-console-wrapper{top:0;max-height:100%!important}
works with both for(const promise of cache){ ... } and for await(const value of cache){ ... }
Error-handling:
for(const promise of cache){
try {
const value = await promise;
}catch(error){ ... }
}
// or
while(cache.length){
try {
for await(const value of cache){
...
}
}catch(error){ ... }
}
rejected Promises (in the cache) don't throw until you .then() or await them.
Also handles backpressure (when your loop is iterating slower than the promises resolve)
for await(const value of cache){
await somethingSlow(value);
}
Imagine we have an async generator function:
async f * (connection) {
while (true) {
...
await doStuff()
yield value
}
}
Suppose that this function is virtually endless and gives us results of some async actions. We want to iterate these results:
for await (const result of f(connection)) {
...
}
Now imagine we want to break out of this for-await loop when some timeout ends and clean things up:
async outerFunc() {
setTimeout(() => connection.destroy(), TIMEOUT_MS)
for await (const result of f(connection)) {
...
if (something) {
return 'end naturally'
}
}
}
Assume that connection.destroy() ends the execution of f and ends the for-await loop. Now it would be great to return some value from the outerFunc when we end by timeout. The first thought is wrapping in a Promise:
async outerFunc() {
return await new Promise((resolve, reject) => {
setTimeout(() => {
connection.destroy()
resolve('end by timeout')
}, TIMEOUT_MS)
for await (const result of f(connection)) { // nope
...
if (something) {
resolve('end naturally')
}
}
})
}
But we cannot use awaits inside Promise and we cannot make the function async due to this antipattern
The question is: how do we return by timeout the right way?
It gets much easier, if you use an existing library that can handle asynchronous generators and timeouts automatically. The example below is using library iter-ops for that:
import {pipe, timeout} from 'iter-ops';
// test async endless generator:
async function* gen() {
let count = 0;
while (true) {
yield count++; // endless increment generator
}
}
const i = pipe(
gen(), // your generator
timeout(5, () => {
// 5ms has timed out, do disconnect or whatever
})
); //=> AsyncIterable<number>
// test:
(async function () {
for await(const a of i) {
console.log(a); // display result
}
})();
Assume that connection.destroy() ends the execution of f and ends the for-await loop.
In that case, just place your return statement so that it is executed when the loop ends:
async outerFunc() {
setTimeout(() => {
connection.destroy()
}, TIMEOUT_MS)
for await (const result of f(connection)) {
...
if (something) {
return 'end naturally'
}
}
return 'end by timeout'
}
Well, I am lost in await and async hell. The code below is supposed to loop through a list of files, check if they exist and return back the ones that do exist. But I am getting a zero length list.
Node V8 code: caller:
await this.sourceList()
if (this.paths.length == 0) {
this.abort = true
return
}
Called Functions: (I took out stuff not relevant)
const testPath = util.promisify(fs.access)
class FMEjob {
constructor(root, inFiles, layerType, ticket) {
this.paths = []
this.config = global.app.settings.config
this.sourcePath = this.config.SourcePath
}
async sourceList() {
return await Promise.all(this.files.map(async (f) => {
let source = path.join(this.sourcePath, f.path)
return async () => {
if (await checkFile(source)) {
this.paths.push(source)
}
}
}))
}
async checkFile(path) {
let result = true
try {
await testPath(path, fs.constants.R_OK)
}
catch (err) {
this.errors++
result = false
logger.addLog('info', 'FMEjob.checkFile(): File Missing Error: %s', err.path)
}
return result
}
Your sourceList function is really weird. It returns a promise for an array of asynchronous functions, but it never calls those. Drop the arrow function wrapper.
Also I recommend to never mutate instance properties inside async methods, that'll cause insane bugs when multiple methods are executed concurrently.
this.paths = await this.sourceList()
if (this.abort = (this.paths.length == 0)) {
return
}
async sourceList() {
let paths = []
await Promise.all(this.files.map(async (f) => {
const source = path.join(this.sourcePath, f.path)
// no function here, no return here!
if (await this.checkFile(source)) {
paths.push(source)
}
}))
return paths
}
async checkFile(path) {
try {
await testPath(path, fs.constants.R_OK)
return true
} catch (err) {
logger.addLog('info', 'FMEjob.checkFile(): File Missing Error: %s', err.path)
this.errors++ // questionable as well - better let `sourceList` count these
}
return false
}
I'm looking for a promise function wrapper that can limit / throttle when a given promise is running so that only a set number of that promise is running at a given time.
In the case below delayPromise should never run concurrently, they should all run one at a time in a first-come-first-serve order.
import Promise from 'bluebird'
function _delayPromise (seconds, str) {
console.log(str)
return Promise.delay(seconds)
}
let delayPromise = limitConcurrency(_delayPromise, 1)
async function a() {
await delayPromise(100, "a:a")
await delayPromise(100, "a:b")
await delayPromise(100, "a:c")
}
async function b() {
await delayPromise(100, "b:a")
await delayPromise(100, "b:b")
await delayPromise(100, "b:c")
}
a().then(() => console.log('done'))
b().then(() => console.log('done'))
Any ideas on how to get a queue like this set up?
I have a "debounce" function from the wonderful Benjamin Gruenbaum. I need to modify this to throttle a promise based on it's own execution and not the delay.
export function promiseDebounce (fn, delay, count) {
let working = 0
let queue = []
function work () {
if ((queue.length === 0) || (working === count)) return
working++
Promise.delay(delay).tap(function () { working-- }).then(work)
var next = queue.shift()
next[2](fn.apply(next[0], next[1]))
}
return function debounced () {
var args = arguments
return new Promise(function (resolve) {
queue.push([this, args, resolve])
if (working < count) work()
}.bind(this))
}
}
I don't think there are any libraries to do this, but it's actually quite simple to implement yourself:
function sequential(fn) { // limitConcurrency(fn, 1)
let q = Promise.resolve();
return function(x) {
const p = q.then(() => fn(x));
q = p.reflect();
return p;
};
}
For multiple concurrent requests it gets a little trickier, but can be done as well.
function limitConcurrency(fn, n) {
if (n == 1) return sequential(fn); // optimisation
let q = Promise.resolve();
const active = new Set();
const fst = t => t[0];
const snd = t => t[1];
return function(x) {
function put() {
const p = fn(x);
const a = p.reflect().then(() => {
active.delete(a);
});
active.add(a);
return [Promise.race(active), p];
}
if (active.size < n) {
const r = put()
q = fst(t);
return snd(t);
} else {
const r = q.then(put);
q = r.then(fst);
return r.then(snd)
}
};
}
Btw, you might want to have a look at the actors model and CSP. They can simplify dealing with such things, there are a few JS libraries for them out there as well.
Example
import Promise from 'bluebird'
function sequential(fn) {
var q = Promise.resolve();
return (...args) => {
const p = q.then(() => fn(...args))
q = p.reflect()
return p
}
}
async function _delayPromise (seconds, str) {
console.log(`${str} started`)
await Promise.delay(seconds)
console.log(`${str} ended`)
return str
}
let delayPromise = sequential(_delayPromise)
async function a() {
await delayPromise(100, "a:a")
await delayPromise(200, "a:b")
await delayPromise(300, "a:c")
}
async function b() {
await delayPromise(400, "b:a")
await delayPromise(500, "b:b")
await delayPromise(600, "b:c")
}
a().then(() => console.log('done'))
b().then(() => console.log('done'))
// --> with sequential()
// $ babel-node test/t.js
// a:a started
// a:a ended
// b:a started
// b:a ended
// a:b started
// a:b ended
// b:b started
// b:b ended
// a:c started
// a:c ended
// b:c started
// done
// b:c ended
// done
// --> without calling sequential()
// $ babel-node test/t.js
// a:a started
// b:a started
// a:a ended
// a:b started
// a:b ended
// a:c started
// b:a ended
// b:b started
// a:c ended
// done
// b:b ended
// b:c started
// b:c ended
// done
Use the throttled-promise module:
https://www.npmjs.com/package/throttled-promise
var ThrottledPromise = require('throttled-promise'),
promises = [
new ThrottledPromise(function(resolve, reject) { ... }),
new ThrottledPromise(function(resolve, reject) { ... }),
new ThrottledPromise(function(resolve, reject) { ... })
];
// Run promises, but only 2 parallel
ThrottledPromise.all(promises, 2)
.then( ... )
.catch( ... );
I have the same problem. I wrote a library to implement it. Code is here. I created a queue to save all the promises. When you push some promises to the queue, the first several promises at the head of the queue would be popped and running. Once one promise is done, the next promise in the queue would also be popped and running. Again and again, until the queue has no Task. You can check the code for details. Hope this library would help you.
Advantages
you can define the amount of concurrent promises (near simultaneous requests)
consistent flow: once one promise resolve, another request start no need to guess the server capability
robust against data choke, if the server stop for a moment, it will just wait, and next tasks will not start just because the
clock allowed
do not rely on a 3rd party module it is Vanila node.js
1st thing is to make https a promise, so we can use wait to retrieve data (removed from the example)
2nd create a promise scheduler that submit another request as any promise get resolved.
3rd make the calls
Limiting requests taking by limiting the amount of concurrent promises
const https = require('https')
function httpRequest(method, path, body = null) {
const reqOpt = {
method: method,
path: path,
hostname: 'dbase.ez-mn.net',
headers: {
"Content-Type": "application/json",
"Cache-Control": "no-cache"
}
}
if (method == 'GET') reqOpt.path = path + '&max=20000'
if (body) reqOpt.headers['Content-Length'] = Buffer.byteLength(body);
return new Promise((resolve, reject) => {
const clientRequest = https.request(reqOpt, incomingMessage => {
let response = {
statusCode: incomingMessage.statusCode,
headers: incomingMessage.headers,
body: []
};
let chunks = ""
incomingMessage.on('data', chunk => { chunks += chunk; });
incomingMessage.on('end', () => {
if (chunks) {
try {
response.body = JSON.parse(chunks);
} catch (error) {
reject(error)
}
}
console.log(response)
resolve(response);
});
});
clientRequest.on('error', error => { reject(error); });
if (body) { clientRequest.write(body) }
clientRequest.end();
});
}
const asyncLimit = (fn, n) => {
const pendingPromises = new Set();
return async function(...args) {
while (pendingPromises.size >= n) {
await Promise.race(pendingPromises);
}
const p = fn.apply(this, args);
const r = p.catch(() => {});
pendingPromises.add(r);
await r;
pendingPromises.delete(r);
return p;
};
};
// httpRequest is the function that we want to rate the amount of requests
// in this case, we set 8 requests running while not blocking other tasks (concurrency)
let ratedhttpRequest = asyncLimit(httpRequest, 8);
// this is our datase and caller
let process = async () => {
patchData=[
{path: '/rest/slots/80973975078587', body:{score:3}},
{path: '/rest/slots/809739750DFA95', body:{score:5}},
{path: '/rest/slots/AE0973750DFA96', body:{score:5}}]
for (let i = 0; i < patchData.length; i++) {
ratedhttpRequest('PATCH', patchData[i].path, patchData[i].body)
}
console.log('completed')
}
process()
The classic way of running async processes in series is to use async.js and use async.series(). If you prefer promise based code then there is a promise version of async.js: async-q
With async-q you can once again use series:
async.series([
function(){return delayPromise(100, "a:a")},
function(){return delayPromise(100, "a:b")},
function(){return delayPromise(100, "a:c")}
])
.then(function(){
console.log(done);
});
Running two of them at the same time will run a and b concurrently but within each they will be sequential:
// these two will run concurrently but each will run
// their array of functions sequentially:
async.series(a_array).then(()=>console.log('a done'));
async.series(b_array).then(()=>console.log('b done'));
If you want to run b after a then put it in the .then():
async.series(a_array)
.then(()=>{
console.log('a done');
return async.series(b_array);
})
.then(()=>{
console.log('b done');
});
If instead of running each sequentially you want to limit each to run a set number of processes concurrently then you can use parallelLimit():
// Run two promises at a time:
async.parallelLimit(a_array,2)
.then(()=>console.log('done'));
Read up the async-q docs: https://github.com/dbushong/async-q/blob/master/READJSME.md
I've been trying to get the rejects of my asynchronous functions to bubble back up to their callers, but it's not working for some reason. Here's some tested example code:
"use strict";
class Test {
constructor() {
this.do1();
}
async do1() {
try { this.do2(); } catch(reason) { console.error(reason); }
}
async do2() {
for(let i = 0; i < 10; i++) {
await this.do3();
console.log(`completed ${i}`);
}
console.log("finished do1");
}
async do3() {
return new Promise((resolve, reject) => {
setTimeout(() => {
if(Math.random() < 0.3) reject('###rejected');
else resolve("###success");
}, 1000);
});
}
}
export default Test;
Chrome just gives me this every time: Unhandled promise rejection ###rejected.
Any idea why this is happening? I'd like to be able to handle all thrown errors from a higher level than do2() (the above example works fine if the try/catch is in do2() and wraps await this.do3();). Thanks!
Edit: To be a bit more explicit, if I take the try/catch out of do1() and put it in do2() as follows, everything works fine:
async do2() {
try {
for(let i = 0; i < 10; i++) {
await this.do3();
console.log(`completed ${i}`);
}
console.log("finished do1");
} catch(reason) { console.error(reason); }
}
async do1() {
try {
await this.do2();
}
catch(reason) {
console.error(reason);
}
}
do2 is an asynchronous function. And you call it without await. So, when it completes there's no try-catch clauses around it.
See this question and this article for more details.