Trying to block-sleep in some js/node code using setTimeout

Given the following code, which I run from the command line with node fileName.js, it seems to run all the items in the loop and THEN sleep at the end... sorta like it's all running in parallel or something.
I would like the code to block/pause during the setTimeout instead of just running the function AFTER the setTimeout is complete. Or use a different method, if setTimeout is the incorrect one for this use case.
const removeUsers = async () => {
  const users = await db.getUsers(); // returns 32 users.

  // Split the users up into an array, with 2 users in each 'slot'.
  var arrays = [], size = 2, counter = 0;
  while (users.length > 0) {
    arrays.push(users.splice(0, size));
  }

  // Now, for each slot, delete the 2 users then pause for 1 sec.
  arrays.forEach(a => {
    console.log(++counter);

    // Delete 2x users.
    a.forEach(async u => {
      console.log('Deleting User: ' + u.id);
      await ThirdPartyApi.deleteUser({id: u.id});
    });

    // Now pause for a second.
    // Why? 3rd party api has a 2 hits/sec rate throttling.
    setTimeout(function () { console.log('Sleeping for 1 sec'); }, 1000);
  });
}
and the logs look like this:
1.
Deleting User: 1
Deleting User: 2
2.
Deleting User: 3
Deleting User: 4
3.
...
(sleep for 1 sec)
(sleep for 1 sec)
(sleep for 1 sec)
...
end.
See how the sleep doesn't block anything... it just fires off a sleep command which then gets handled after a second...
This is what I'm really after...
1.
Deleting User: 1
Deleting User: 2
(sleep for 1 sec)
2.
Deleting User: 3
Deleting User: 4
(sleep for 1 sec).
3.
...
end.

This calls a bunch of async functions. They each return a promise (async functions always return promises), and those promises are discarded, because Array#forEach doesn’t do anything with the return value of the function it’s passed.
a.forEach(async u => {
  console.log('Deleting User: ' + u.id);
  await ThirdPartyApi.deleteUser({id: u.id});
});
This starts a timer and doesn’t even attempt to wait for it.
setTimeout(function () { console.log('Sleeping for 1 sec'); }, 1000);
Split off the timer into a function that returns a promise resolving in the appropriate amount of time (available as Promise.delay if you’re using Bluebird, which you should be):
const delay = ms =>
  new Promise(resolve => {
    setTimeout(resolve, ms);
  });
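For reference, if you are using Bluebird, the Promise.delay mentioned above does the same thing; a sketch, assuming the bluebird package is installed:
const Promise = require('bluebird');

async function example() {
  await Promise.delay(1000); // resolves after ~1 second, without blocking the event loop
}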
and keep everything in one async function so you’re not discarding any promises:
function* chunk(array, size) {
  for (let i = 0; i < array.length;) {
    yield array.slice(i, i += size);
  }
}

let counter = 0;

const removeUsers = async () => {
  const users = await db.getUsers(); // returns 32 users.

  for (const a of chunk(users, 2)) {
    console.log(++counter);

    // Delete 2x users.
    for (const u of a) {
      console.log('Deleting User: ' + u.id);
      await ThirdPartyApi.deleteUser({id: u.id});
    }

    console.log('Sleeping for 1 sec');
    await delay(1000);
  }
};
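One caveat: removeUsers is itself an async function, so it returns a promise too; handle that promise at the call site so a failed deletion isn't silently discarded. For example:
removeUsers().catch(err => {
  console.error('Failed to remove users:', err);
});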

Related

Sleep / delay inside promise.all

I am building a backend to handle pulling data from a third party API.
There are three large steps to this, which are:
Delete the existing db data (before any new data is inserted)
Get a new dataset from the API
Insert that data.
Each of these three steps must happen for a variety of datasets - e.g. clients, appointments, products etc.
To handle this, I have three Promise.all functions, and each of these is being passed individual async functions for handling the deleting, getting, and finally inserting of the data. I have this code working just for clients so far.
What I'm now trying to do is limit the API calls, as the API I am pulling data from can only accept up to 200 calls per minute. To quickly test the rate limiting functionality in code I have set it to a max of 5 api calls per 10 seconds, so I can see if it's working properly.
This is the code I have so far - note I have replaced the name of the system in the code with 'System'. I have not included all code as there's a lot of data that is being iterated through further down.
let patientsCombinedData = [];
let overallAPICallCount = 0;
let maxAPICallsPerMinute = 5;
let startTime, endTime, timeDiff, secondsElapsed;

const queryString = `UPDATE System SET ${migration_status_column} = 'In Progress' WHERE uid = '${uid}'`;

migrationDB.query(queryString, (err, res) => {
  async function deleteSystemData() {
    async function deleteSystemPatients() {
      return (result = await migrationDB.query("DELETE FROM System_patients WHERE id_System_account = ($1) AND migration_type = ($2)", [
        System_account_id,
        migrationType,
      ]));
    }

    await Promise.all([deleteSystemPatients()]).then(() => {
      startTime = new Date(); // Initialise timer before kicking off API calls

      async function sleep(ms) {
        return new Promise((resolve) => setTimeout(resolve, ms));
      }

      async function getSystemAPIData() {
        async function getSystemPatients() {
          endTime = new Date();
          timeDiff = endTime - startTime;
          timeDiff /= 1000;
          secondsElapsed = Math.round(timeDiff);

          if (secondsElapsed < 10) {
            if (overallAPICallCount > maxAPICallsPerMinute) {
              // Here I want to sleep for one second, then check again as the timer may have passed 10 seconds
              getSystemPatients();
            } else {
              // Proceed with calls
              dataInstance = await axios.get(`${patientsPage}`, {
                headers: {
                  Authorization: completeBase64String,
                  Accept: "application/json",
                  "User-Agent": "TEST_API (email@email.com)",
                },
              });

              dataInstance.data.patients.forEach((data) => {
                patientsCombinedData.push(data);
              });
              overallAPICallCount++;

              console.log(`Count is: ${overallAPICallCount}. Seconds are: ${secondsElapsed}. URL is: ${dataInstance.data.links.self}`);

              if (dataInstance.data.links.next) {
                patientsPage = dataInstance.data.links.next;
                await getSystemPatients();
              } else {
                console.log("Finished Getting Clients.");
                return;
              }
            }
          } else {
            console.log(`Timer reset! Now proceed with API calls`);
            startTime = new Date();
            overallAPICallCount = 0;
            getSystemPatients();
          }
        }

        await Promise.all([getSystemPatients()]).then((response) => {
          async function insertSystemData() {
            async function insertClinkoPatients() {
              const SystemPatients = patientsCombinedData;
Just under where it says 'if (secondsElapsed < 10)' is where I want to check the code every second to see if the timer has passed 10 seconds, in which case the timer and the count will be reset, so I can then start counting again over the next 10 seconds. Currently the recursive function runs so often that I get an error related to the call stack.
I have tried to add a variety of async timer functions here, but every time the function returns it causes the parent promise to finish executing.
Hope that makes sense
I ended up using the Bottleneck library, which made it very easy to implement rate limiting.
const Bottleneck = require("bottleneck/es5");

// minTime: 350 leaves at least 350ms between jobs (roughly 170 calls/minute).
const limiter = new Bottleneck({
  minTime: 350
});

await limiter.schedule(() => getSystemPatients());
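As a side note, Bottleneck can also model the "200 calls per minute" window from the question directly with its reservoir options; a sketch (the numbers are taken from the question, not tuned):
const limiter = new Bottleneck({
  reservoir: 200,                      // start with 200 available slots
  reservoirRefreshAmount: 200,         // refill back to 200...
  reservoirRefreshInterval: 60 * 1000  // ...every 60 seconds (must be a multiple of 250)
});

await limiter.schedule(() => getSystemPatients());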

Why does setInterval never run, in my NodeJs code that streams a generator to file?

I have this situation in my NodeJs code, which calculates permutations (code from here), but no matter what I don't get any output from setInterval.
const { Readable } = require('stream');
const { intervalToDuration, formatDuration, format } = require('date-fns');
const { subsetPerm } = require('./permutation');

function formatLogs(counter, permStart) {
  const newLocal = new Date();
  const streamTime = formatDuration(intervalToDuration({
    end: newLocal.getTime(),
    start: permStart.getTime()
  }));
  const formattedLogs = `wrote ${counter.toLocaleString()} patterns, after ${streamTime}`;
  return formattedLogs;
}

const ONE_MINUTES_IN_MS = 1 * 60 * 1000;
let progress = 0;
let timerCallCount = 1;
let start = new Date();

const interval = setInterval(() => {
  console.log(formatLogs(progress, start));
}, ONE_MINUTES_IN_MS);

const iterStream = Readable.from(subsetPerm(Object.keys(Array.from({ length: 200 })), 5));

console.log(`Stream started on: ${format(start, 'PPPPpppp')}`);

iterStream.on('data', () => {
  progress++;
  if (new Date().getTime() - start.getTime() >= (ONE_MINUTES_IN_MS * timerCallCount)) {
    console.log(`manual timer: ${formatLogs(progress, start)}`);
    timerCallCount++;
    if (timerCallCount >= 3) iterStream.destroy();
  }
});

iterStream.on('error', err => {
  console.log(err);
  clearInterval(interval);
});

iterStream.on('close', () => {
  console.log(`closed: ${formatLogs(progress, start)}`);
  clearInterval(interval);
});

console.log('done!');
But what I find is that it prints 'done!' (expected) and then the script seems to end, even though if I put a console.log in my on('data') callback I get data printed to the terminal. But even hours later the console.log in the setInterval never runs; nothing ends up on file besides the output from the on('close', ...) handler.
The output log looks like:
> node demo.js
Stream started on: Sunday, January 30th, 2022 at 5:40:50 PM GMT+00:00
done!
manual timer: wrote 24,722,912 patterns, after 1 minute
manual timer: wrote 49,503,623 patterns, after 2 minutes
closed: wrote 49,503,624 patterns, after 2 minutes
The timers in Node guide has a section called 'leaving timeouts behind' which looked relevant. I thought interval.ref() told the script not to garbage collect the object until .unref() is called on the same timeout object, but on second reading that's not quite right, and it doesn't make a difference.
I'm running this using npm like so npm run noodle which just points to the file.
The generator is synchronous and blocks the event loop
Readable.from processes the whole generator in one go, so if the generator is synchronous and long running it blocks the event loop.
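You can reproduce the starvation without streams at all; a minimal sketch:
const timer = setInterval(() => console.log('tick'), 100);

function* bigSyncGenerator() {
  for (let i = 0; i < 1e8; i++) yield i;
}

// Synchronous work: the event loop never gets a turn,
// so the 'tick' callback cannot fire until the loop ends.
for (const value of bigSyncGenerator()) { /* consume */ }

clearInterval(timer); // only now could pending timers have run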
Here is the annotated code that it runs:
async function next() {
  for (;;) {
    try {
      const { value, done } = isAsync ?
        await iterator.next() : // our generator is not asynchronous
        iterator.next();
      if (done) {
        readable.push(null); // generator is done: end the stream
      } else {
        const res = (value &&
          typeof value.then === 'function') ?
          await value :
          value; // not a thenable
        if (res === null) {
          reading = false;
          throw new ERR_STREAM_NULL_VALUES();
        } else if (readable.push(res)) { // readable.push returns false if it's been paused, or some other irrelevant cases.
          continue; // we continue to the next item in the iterator
        } else {
          reading = false;
        }
      }
    } catch (err) {
      readable.destroy(err);
    }
    break;
  }
}
Here is the api for readable.push, which explains how this keeps the generator running:
Returns: true if additional chunks of data may continue to be pushed; false otherwise.
Nothing has told NodeJs not to continue pushing data, so it carries on.
Between each run of the event loop, Node.js checks if it is waiting for any asynchronous I/O or timers and shuts down cleanly if there are not any.
I raised this as a NodeJs Github Issue and ended up workshopping this solution:
const yieldEvery = 1e5;

function setImmediatePromise() {
  return new Promise(resolve => setImmediate(resolve));
}

// baseGenerator is the original synchronous generator (subsetPerm(...) above).
const iterStream = Readable.from(async function* () {
  let i = 0;
  for await (const item of baseGenerator) {
    yield item;
    i++;
    if (i % yieldEvery === 0) await setImmediatePromise();
  }
}());
This is partly inspired by this snyk.io blog, which goes into more detail on this issue.
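As an aside, on Node 16+ the promisified timers API can replace the hand-rolled helper (a sketch; timers/promises ships a promise-returning setImmediate):
const { setImmediate: setImmediateAsync } = require('timers/promises');

const yieldEvery = 1e5;
const iterStream = Readable.from(async function* () {
  let i = 0;
  for await (const item of baseGenerator) {
    yield item;
    if (++i % yieldEvery === 0) await setImmediateAsync(); // yield to the event loop
  }
}());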

Node, wait and retry api calls that fail

So I fetch an array of urls from an api with a rate limit; currently I handle this by adding a timeout to each call like this:
const calls = urls.map((url, i) =>
  new Promise(resolve => setTimeout(resolve, 250 * i))
    .then(() => fetch(url))
);
const data = await Promise.all(calls);
forcing a 250ms wait between each call. This ensures that the rate limit is never exceeded.
The thing is, this isn't really necessary. I've tried with a 0ms wait time, and in most cases I have to repeatedly reload the page four or five times before the api starts to return:
{ error: { status: 429, message: 'API rate limit exceeded' } }
and most of the time you only have to wait a second or so before you can safely reload the page and get all the data.
A more reasonable approach would be to collect the calls that return 429 (if they do), wait for a set amount of time and then retry them (and perhaps redo this a set amount of times).
Problem is, I'm a bit stumped as to how one would go about achieving this.
EDIT:
Just got home and will look through the answers, but there seems to have been an assumption made which I don't believe is necessary: the calls do not have to be sequential; they can be fired (and returned) in any order.
The term for what you want is exponential backoff. You can modify your code so that it continues trying on a certain failure condition:
const max_wait = 2000;

async function wait(ms) {
  return new Promise(resolve => {
    setTimeout(resolve, ms);
  });
}

const calls = urls.map(async (url) => {
  let retry = 0, result;
  do {
    if (retry !== 0) { await wait(Math.pow(2, retry)); }
    result = await fetch(url);
    retry++;
  } while (result.status === 429 && Math.pow(2, retry) <= max_wait);
  return result;
});
Or you can try using a library to handle the backoff for you like https://github.com/MathieuTurcotte/node-backoff
If I understand the question right, you're trying to:
a) Execute fetch() calls sequentially (with a possibly optional delay)
b) Retry failed requests with a backoff delay
As you likely found out, .map() does not really help with a) as it does not wait for any async stuff when iterating (which is why you create a greater and greater timeout with i*250).
I personally find it the easiest to keep things sequential by using a for of loop instead, as this will work nicely with async/await:
const fetchQueue = async (urls, delay = 0, retries = 0, maxRetries = 3) => {
  const wait = (timeout = 0) => {
    if (timeout) { console.log(`Waiting for ${timeout}`); }
    return new Promise(resolve => {
      setTimeout(resolve, timeout);
    });
  };

  for (const url of urls) {
    try {
      await wait(retries ? retries * Math.max(delay, 1000) : delay);
      let response = await fetch(url);
      let data = await (
        response.headers.get('content-type').includes('json')
          ? response.json()
          : response.text()
      );
      response = {
        headers: [...response.headers].reduce((acc, header) => {
          return {...acc, [header[0]]: header[1]};
        }, {}),
        status: response.status,
        data: data,
      };
      // in reality, only do that for errors
      // that make sense to retry
      if ([404, 429].includes(response.status)) {
        throw new Error(`Status Code ${response.status}`);
      }
      console.log(response.data);
    } catch (err) {
      console.log('Error:', err.message);
      if (retries < maxRetries) {
        console.log(`Retry #${retries + 1} ${url}`);
        await fetchQueue([url], delay, retries + 1, maxRetries);
      } else {
        console.log(`Max retries reached for ${url}`);
      }
    }
  }
};

// populate some real URLs to fetch
// index 0 will generate a nonexistent URL to test error behaviour
const urls = new Array(101).fill(null).map((x, i) => `https://jsonplaceholder.typicode.com/todos/${i}`);

// fetch urls one after another (sequentially)
// and delay each request by 250ms
fetchQueue(urls, 250);
If a request fails (e.g. you get one of the errors specified in the array with error status codes), the above function will retry them a maximum of 3 times (by default) with a backoff delay that increases by a second on each retry.
As you wrote, the delay between requests is probably not necessary, so you could just remove the 250 in the function call. Because each request is executed one after the other, you're less likely to run into rate limit issues but if you do, it's very easy to add some custom delay.
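Since the edit notes the calls don't actually have to be sequential, here is a minimal concurrent sketch as well (fetchAllWithRetry is a hypothetical helper, not from any library): fire everything at once, collect the 429s, wait, and retry only those.
const wait = ms => new Promise(resolve => setTimeout(resolve, ms));

async function fetchAllWithRetry(urls, retryDelay = 1000, maxRounds = 5) {
  const results = new Map();
  let pending = urls;

  for (let round = 0; round < maxRounds && pending.length > 0; round++) {
    if (round > 0) await wait(retryDelay); // pause before each retry round

    const responses = await Promise.all(pending.map(url => fetch(url)));
    const rateLimited = [];

    responses.forEach((res, i) => {
      if (res.status === 429) rateLimited.push(pending[i]); // retry later
      else results.set(pending[i], res);                    // keep the result
    });

    pending = rateLimited;
  }

  return results; // Map of url -> Response; anything still rate-limited after maxRounds is dropped
}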
Here is an example that lets you handle an array of promises sequentially, by setting a delay expressed in milliseconds and accepting a third callback determining whether the request should be retried.
In the below code, some sample requests are mocked to:
Test a successful response.
Test an error response. If the error response contains an error code and the error code is 403, true is returned and the call is retried in the next run (delayed by x milliseconds).
Test an error response without an error code.
There is a global counter that gives up on a promise after N tries (5 in the below example); all of that is handled in this code:
resolveSequencially(promiseTests, 250, (err) => {
  return ++errorCount, !!(err && err.error && err.error.status === 403 && errorCount <= 5);
});
Here the error count is first increased, and the callback returns true if the error is defined, has an error property whose status is 403, and no more than 5 attempts have been made.
Of course, the example is just to test things out, but I think you're looking for something allowing you to have a cleverer control over the promise loop cycle, hence here is a solution doing just that.
I will add some comments below, you can run the test below to check what happens directly in the console.
// Nothing that relevant, this one is just for testing purposes!
let errorCount = 0;

// Declare the function.
const resolveSequencially = (promises, delay, onFailed, onFinished) => {
  // store the results.
  const results = [];
  // Define a self-invoking recursiveHandle function.
  let recursiveHandle;
  (recursiveHandle = (current, max) => { // current is the index of the currently looped promise, max is the maximum needed.
    console.log('recursiveHandle invoked, current is', current, 'max is', max);
    if (current === max) onFinished(results); // <-- if all the promises have been looped, resolve.
    else {
      // Define a method to handle the promise.
      let handlePromise = () => {
        console.log('about to handle promise');
        const p = promises[current];
        p.then((success) => {
          console.log('success invoked!');
          results.push(success);
          // if it's successful, push the result and invoke the next element.
          recursiveHandle(current + 1, max);
        }).catch((err) => {
          console.log('An error was caught. Invoking callback to check whether I should retry! Error was:', err);
          // otherwise, invoke the onFailed callback.
          const retry = onFailed(err);
          // if retry is true, invoke the recursive function again with the same index.
          console.log('retry is', retry);
          if (retry) recursiveHandle(current, max);
          else recursiveHandle(current + 1, max); // <-- otherwise, proceed regularly.
        });
      };
      if (current !== 0) setTimeout(() => { handlePromise(); }, delay); // <-- if it's not the first element, invoke the promise after the desired delay.
      else handlePromise(); // otherwise, invoke immediately.
    }
  })(0, promises.length); // Invoke the IIFE with an initial index of 0, and a maximum index which is the length of the promise array.
};

const promiseTests = [
  Promise.resolve(true),
  Promise.reject({
    error: {
      status: 403
    }
  }),
  Promise.resolve(true),
  Promise.reject(null)
];

const test = () => {
  console.log('about to invoke resolveSequencially');
  resolveSequencially(promiseTests, 250, (err) => {
    return ++errorCount, !!(err && err.error && err.error.status === 403 && errorCount <= 5);
  }, (done) => {
    console.log('finished! results are:', done);
  });
};

test();

Repeat function itself without setInterval based on its finish

I need my program to repeat itself continuously. My program starts by fetching proxies from servers and saving them to a database, and then sends those saved proxies to another server. So I don't know how long it takes for my program to do this task.
I wanna know what happens if some problem makes this startJob() function take more than 30 seconds.
Does setInterval call it again or wait for the function to finish?
What's the best approach for my program to repeat itself after it's done without setInterval?
(for example, startJob() being called again after it's done.)
I was wondering if it is ok to put this function in a loop with a big number like:
for (let i = 0; i < 999999999; i++) {
  await startJob();
}
Here is my code:
const startJob = async () => {
  await postProxyToChannel();
  grabProxies();
};

setInterval(function () {
  startJob();
}, 30000);
grabProxies() takes about 10 seconds and postProxyToChannel() takes about 5 seconds on my server.
No matter what happens inside startJob, setInterval will call it every 30 seconds. This means that postProxyToChannel will be called every 30 seconds. If that function throws, you'll get an unhandled Promise rejection, but the interval will continue.
Even if postProxyToChannel takes, say, 45 seconds, that won't prevent startJob from being called again before the prior startJob has completed.
If you want to make sure that startJob is only called 30 seconds after it finishes, you could await it in your for loop, then await a Promise that resolves every 30 seconds:
(async () => {
  for (let i = 0; i < 999999999; i++) {
    await startJob();
    await new Promise(resolve => setTimeout(resolve, 30000));
  }
})()
  .catch((err) => {
    console.log('There was an error', err);
  });
But it would probably make more sense just to have a recursive call of startJob, e.g.:
const startJob = async () => {
  try {
    await postProxyToChannel();
  } catch (e) {
    // handle error
  }
  grabProxies();
  setTimeout(startJob, 30000);
};

startJob();
Yup, an infinite loop sounds good; it can be combined with a timer to pause the loop:
const timer = ms => new Promise(resolve => setTimeout(resolve, ms));

(async function () {
  while (true) {
    await postProxyToChannel();
    await grabProxies();
    await timer(30000);
  }
})();
Now that loop will run the task, wait 30 secs, then do that again. Therefore the loop will not run every 30 secs but will usually take longer. To adjust for that, you could measure the time the task took, then await the rest of the time:
const start = Date.now();
await postProxyToChannel();
await grabProxies();
await timer(30000 - (Date.now() - start));
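Putting that together in the loop (a sketch; the Math.max guard just makes explicit that there is no wait when the work overruns the 30-second budget):
const timer = ms => new Promise(resolve => setTimeout(resolve, ms));

(async function () {
  while (true) {
    const start = Date.now();
    await postProxyToChannel();
    await grabProxies();
    // Aim for a 30-second cycle; if the work took longer, skip the wait.
    await timer(Math.max(0, 30000 - (Date.now() - start)));
  }
})();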

While Loop inside ASYNC AWAIT

I have some code that continuously updates a series of objects via network calls; it looks like this. I was wondering if this is bad practice and if there might be a better way. I can't use setInterval, as the time between MakeAsyncCall replies is variable and can cause a leak if the time to make the call is longer than the delay. I will be using this info to update a UI. Will this cause blocking? What are your thoughts? Let me know if you need more info.
let group = [item1, item2, item3];

// Function to Delay X ms
const delay = ms => {
  return new Promise((resolve, _) => {
    const timeout = setTimeout(() => {
      resolve();
    }, ms);
  });
};

// Function to continuously Make Calls
const readForever = async (group, ms) => {
  while (true) {
    // Make Async Call
    for (let item of group) {
      await MakeAsyncCall(item);
    }
    // Wait X ms Before Processing Continues
    await delay(ms);
  }
};

// Start Loop
readForever(group, 100);
The given code won't cause any UI blocking and is a valid way to continually update the UI.
Instead of a loop you could write it that way:
const readForever = async (group, ms) => {
  // Make Async Call
  for (let item of group) {
    await MakeAsyncCall(item);
  }
  // Wait X ms Before Processing Continues
  await delay(ms);

  if (true) { // not needed, but here you could define an end condition
    return readForever(group, ms);
  }
};
In addition, a note about the delay function:
You could directly pass resolve to setTimeout, and because you do not cancel the timeout anywhere, you do not need to store the result of setTimeout in a variable.
const delay = ms => {
  return new Promise(resolve => setTimeout(resolve, ms));
};
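If you did need to cancel the timeout (the one reason to keep setTimeout's return value around), a cancellable delay might look like this (illustrative only):
const cancellableDelay = ms => {
  let timeoutId;
  const promise = new Promise(resolve => {
    timeoutId = setTimeout(resolve, ms);
  });
  return { promise, cancel: () => clearTimeout(timeoutId) };
};

// Usage sketch:
const { promise, cancel } = cancellableDelay(1000);
promise.then(() => console.log('waited 1s'));
// cancel(); // uncommenting this prevents the log: the promise stays pending forever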
