Rate limiting/Sleeping/Delaying nodejs without busy waiting - javascript

I am looking to set a delay in making http requests to avoid going over the rate limit of the external server.
users.forEach(async function(user) {
    await rate_check()
    make_http_request()
})
I need help implementing the rate_check function in a way that avoids busy waiting. At the moment, I am busy waiting as follows:
async function rate_check() {
    if (rate_counter < rate_limit)
        rate_counter += 1
    else {
        // Busy wait
        while (new Date() - rate_0_time < 1000) {}
        rate_counter = 1
        time_delta = new Date() - rate_0_time
        rate_1_time = new Date()
    }
}
await new Promise(resolve => { setTimeout(resolve, 2000) }) does not work, as it would only cause rate_check to sleep; the other anonymous callbacks would continue to make requests.
Any rate-checking code must be done in the rate_check function and not in the function where the http request happens, because requests happen across multiple async functions that all target the same server.
I am open to any other suggestions, as well as refactoring, as long as it avoids nested callbacks and third-party dependencies.

You can use the lodash throttle function (https://lodash.com/docs/4.17.15#throttle) to wrap your side-effect function. It will call it at most once per interval and memoize the last returned result (so you may want to return plain data instead of a stateful object such as an http body).
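For illustration, a minimal sketch of that approach, assuming make_http_request returns a promise (the 1000 ms interval is a placeholder for your actual limit window):
const _ = require('lodash');

// At most one real request per second; calls made during the interval
// receive the memoized result of the last invocation (the caveat above).
const throttled_request = _.throttle(make_http_request, 1000);

users.forEach(async function(user) {
    await throttled_request(user);
});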

Related

Would using setTimeout in a serverless function cause any issue to me?

Since the JavaScript main thread runs until the call stack is empty, I am creating sessions on demand, and instead of sending an expiration time, I am using setTimeout to pop each session after N minutes.
Now I am deploying my backend in Node using serverless cloud functions (in Vercel, to be more specific). I wanted to know whether it is advisable to use setTimeout, since it will keep the main thread running in a serverless function until the timeout completes and the callback executes, or whether I should discard this approach entirely.
Here's what my implementation looks like:
const ttl = 3600 * 1000;

const addSession = async (ua: string): Promise<string> => {
    const snapshot = await ref.push({ ua });
    setTimeout(() => {
        if (snapshot.key) ref.child(snapshot.key).remove();
    }, ttl);
    if (!snapshot.key) return addSession(ua);
    return snapshot.key;
};

trying to understand async / await / sync in node

i know this has probably been asked before, but coming from a single-threaded language for the past 20 years, i am really struggling to grasp the true nature of node. believe me, i have read a bunch of SO posts, github discussions, and articles about this.
i think i understand that each function has its own thread type of deal. however, there are some cases where i want my code to be fully synchronous (fire one function after the other).
for example, i made 3 functions which seem to show me how node's async i/o works:
function sleepA() {
    setTimeout(() => {
        console.log('sleep A')
    }, 3000)
}

function sleepB() {
    setTimeout(() => {
        console.log('sleep B')
    }, 2000)
}

function sleepC() {
    setTimeout(() => {
        console.log('sleep C')
    }, 1000)
}

sleepA()
sleepB()
sleepC()
this outputs the following:
sleep C
sleep B
sleep A
ok so that makes sense. node is firing all of the functions at the same time, and whichever ones complete first get logged to the console. kind of like watching horses race. node fires a gun and the functions take off, and the fastest one wins.
so now how would i go about making these synchronous so that the order is always A, B, C, regardless of the setTimeout number?
i've dabbled with things like bluebird and the util node library and am still kind of confused. i think this async logic is insanely powerful despite my inability to truly grasp it. php has destroyed my mind.
how would i go about making these synchronous so that the order is always A, B, C
You've confirmed that what you mean by that is that you don't want to start the timeout for B until A has finished (etc.).
Here in 2021, the easiest way to do that is to use a promise-enabled wrapper for setTimeout, like this one:
const delay = (ms) => new Promise(resolve => setTimeout(resolve, ms));
Then, make sleepA and such async functions:
async function sleepA() {
    await delay(3000);
    console.log("sleep A");
}

async function sleepB() {
    await delay(2000);
    console.log("sleep B");
}

async function sleepC() {
    await delay(1000);
    console.log("sleep C");
}
Then your code can await the promise each of those functions returns, which prevents the code from continuing until the promise is settled:
await sleepA();
await sleepB();
await sleepC();
You can do that inside an async function or (in modern versions of Node.js) at the top level of a module. You can't use await in a synchronous function, since await is asynchronous by nature.
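For example, in a non-module script you can wrap the calls in an async IIFE (a minimal sketch):
(async () => {
    await sleepA();
    await sleepB();
    await sleepC();
    // prints "sleep A", "sleep B", "sleep C" in order (~6 seconds total)
})();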
i think i understand that each function has its own thread type of deal
No, JavaScript functions are just like those of other synchronous languages (despite its reputation, JavaScript itself is overwhelmingly synchronous; it's just used in highly-asynchronous environments and recently got some features to make that easier). More on this below.
ok so that makes sense. node is firing all of the functions at the same time, and whichever ones complete first get logged to the console. kind of like watching horses race. node fires a gun and the functions take off, and the fastest one wins.
Close. The functions run in order, but all that each of them does is set up a timer and return. They don't wait for the timer to fire. Later, when it does fire, it calls the callback you passed into setTimeout.
JavaScript has a main thread, and everything runs on that main thread unless you explicitly make it run on a worker thread or in a child process. Node.js is highly asynchronous (as are web browsers, the other place JavaScript is mostly used), but your JavaScript code all runs on one thread by default. There's a loop that processes "jobs." Jobs get queued by events (such as a timer firing) and are then processed in order by the event loop. I go into a bit more on it here, and the Node.js documentation covers their loop here (I think it's slightly out of date, but the principle hasn't changed).
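A minimal illustration of that queueing (not from the original answer):
setTimeout(() => console.log('timer'), 0);
console.log('sync');
// Logs "sync" first: the timer callback is queued as a job and only
// runs once the currently executing synchronous code has finished.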

Can I use setTimeout() on JavaScript as a parallel processed function

I know JS is single-threaded. But I have a function whose calculation takes time. I would like it to run in parallel, so that it doesn't freeze the next statement. The calculation inside the function takes approximately 1-2 seconds.
I tried creating it with a promise, but it still freezes the next statement.
console.log("start");
new Promise((res, rej) => {
/* calculations */
}).then((res) => console.log(res));
console.log("end");
Then I used the setTimeout function with a time interval of 0. LoL
console.log("start");
setTimeout(() => {
/* calculations */
console.log(res);
}, 0);
console.log("end");
Outputs:
start
end
"calculation result"
Both cases show a similar result, but using a promise prevented console.log("end") from showing before the calculation finished. Using setTimeout works as I wanted and shows console.log("end") before the calculation, so it did not freeze until the calculation was done.
I hope that was clear enough. For now, using setTimeout is the best solution, but I would be happy to hear your ideas or any other method of calculating concurrently without setTimeout.
The code you write inside new Promise(() => { ..code here.. }) is not asynchronous. It is a very common misconception that everything inside the Promise block runs asynchronously.
Instead, this JS API just gives us a hook for some deferred task to be done once the promise is resolved. From MDN:
Promises are a comparatively new feature of the JavaScript language that allow you to defer further actions until after a previous action has completed, or respond to its failure. This is useful for setting up a sequence of async operations to work correctly.
new Promise(() => {
    // whatever I write here is synchronous
    // like console.log, function calls, setTimeout()/fetch()/async web APIs
    // if there are some async tasks like fetch or setTimeout,
    // they can be async by themselves, but their invocation is still sync
})
setTimeout is not the correct option either. Code scheduled with setTimeout runs once the call stack is empty, and once it starts running, it blocks the main thread again.
The right approach to this would be to use Web Workers.
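A minimal sketch of that approach in the browser; heavyCalculation and input are placeholders for your own computation and data:
// worker.js - runs on its own thread
self.onmessage = (e) => {
    const result = heavyCalculation(e.data); // the 1-2 second computation
    self.postMessage(result);
};

// main.js - the main thread stays responsive
console.log("start");
const worker = new Worker("worker.js");
worker.onmessage = (e) => console.log(e.data);
worker.postMessage(input);
console.log("end"); // logs immediately; the calculation runs off the main thread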

Running an operation periodically when the length of operation is not known

I need to execute an operation relatively often (let's say 10 times per second; that should be fast enough, but I can sacrifice speed if there are issues). This is an ajax request, so I potentially do not know how much time it takes - it could even take seconds if the network is bad.
The usual:
setInterval(() => operation(), 100);
Will not work here, because if the network is bad and my operation takes more than 100 ms, the calls might be scheduled one after another, occupying JS engine time (please correct me if I'm wrong).
The other possible solution is to recursively run it:
function execute() {
    operation();
    setTimeout(execute, 100);
}
This means that there will be 100 ms between the calls to operation(), which is OK for me. The problem with this is that I'm afraid that it will fail at some point because of stack overflow. Consider this code:
i = 0;
function test() {
    if (i % 1000 == 0) console.log(i);
    i++;
    test();
}
If I run it in my console, this fails at around 12000 calls. If I add setTimeout at the end, that would mean 12000 / 10 / 60 = 20 minutes, potentially ruining the user experience.
Are there any simple ways how to do this and be sure it can run for days?
There's no "recursion" in asynchronous JavaScript. The synchronous code (the test function) fails because each call occupies some space in the call stack, and when it reaches the maximum size, further function calls throw an error.
However, asynchrony goes beyond the stack: when you call setTimeout, for example, it queues its callback in the event loop and returns immediately. Then the code that called it can return as well, and so on, until the call stack is empty. setTimeout fires only after that.
The code queued by setTimeout then repeats the process, so no calls accumulate in the call stack.
Therefore, "recursive" setTimeout is a good solution to your problem.
Check this example (I recommend opening it in fullscreen mode or watching it in the browser console):
Synchronous example:
function synchronousRecursion(i) {
    if (i % 5000 === 0) console.log('synchronous', i)
    synchronousRecursion(i + 1);
    // The function cannot continue or return; it is waiting for the recursive call.
    // Further code won't be executed because of the error.
    console.log('This will never be evaluated')
}
try {
    synchronousRecursion(1)
} catch (e) {
    console.error('Note that the stack accumulates (contains the function many times)', e.stack)
}
Asynchronous example:
function asynchronousRecursion(i) {
    console.log('asynchronous', i)
    console.log('Note that the stack does not accumulate (always contains a single item)', new Error('Stack trace:').stack)
    setTimeout(asynchronousRecursion, 100, i + 1);
    // setTimeout returns immediately, so code may continue
    console.log('This will be evaluated before the timeout fires')
    // <-- asynchronousRecursion can return here
}
asynchronousRecursion(1)
The two alternatives you showed actually share the flaw you're concerned about, which is that the callbacks might bunch up and run together. Using setTimeout like this is (for your purposes) identical to calling setInterval, except for some small subtleties that don't apply to a light call like making an AJAX request.
It sounds like you might want to guarantee that the callbacks run in order, or potentially that, if multiple callbacks come in at once, only the most recent one is run.
To build a service that runs only the most recent callback, consider a setup like this:
let lastCallbackOriginTime = 0;

setInterval(() => {
    const now = new Date().getTime();
    fetch(url).then(x => x.json()).then(res => {
        if (now > lastCallbackOriginTime) {
            // do interesting stuff
            lastCallbackOriginTime = now;
        } else {
            console.log('Already ran a more recent callback');
        }
    });
}, 100);
Or let's make it run the callbacks in order. To do this, just make each callback depend on a promise returned by the previous one.
let promises = [Promise.resolve(true)];

setInterval(() => {
    // Capture the promise from the previous tick before pushing a new one.
    const lastPromise = promises[promises.length - 1];
    const promise = new Promise((resolve, reject) => {
        fetch(url).then(x => x.json()).then(serviceResponse => {
            // Resolve only once the previous callback has finished.
            lastPromise.then(() => resolve(serviceResponse));
        });
    });
    promise.then((serviceResponse) => {
        // Your actual callback code
    });
    promises.push(promise);
    // Note: this array grows without bound; long-running code should
    // keep a reference to only the most recent promise instead.
}, 100);

avoiding simultaneous execution of shell command with node.js & shelljs

Using nodejs 8.12 on Gnu/Linux CentOS 7. Using the built-in web server, require('https'), for a simple application.
I understand that nodejs is single threaded (single process) and there is no actual parallel execution of code. Based on my understanding, I think the http/https server will process one http request, run the handler through all synchronous statements, and set up asynchronous statements to be executed later, before returning to process a subsequent request. However, with the http/https libraries, asynchronous code is used to assemble the body of the request, so we already have one callback which is executed when the body is ready (the 'end' event). This fact makes me think it might be possible to be in the middle of processing two or more requests simultaneously.
As part of handling the requests, I need to execute a string of shell commands and I use the shelljs.exec library to do that. It runs synchronously, waiting until complete before returning. So, example code would look like:
const shelljs_exec = require('shelljs.exec');

function process() {
    // bunch of shell commands in string
    var command_str = 'command1; command2; command3';
    var exec_results = shelljs_exec(command_str);
    console.log('just executed shelljs_exec command');
    var proc_results = process_results(exec_results);
    console.log(proc_results);
    // and return the response...
}
So node.js runs the shelljs_exec() and waits for completion. While it's waiting, can another request be worked on, such that there is a slight risk of two or more shelljs.exec invocations running at the same time? Since that could be a problem, I need to ensure that only one shelljs.exec statement can be in progress at a given time.
If that is not a correct understanding, then I was thinking I need to do something with mutex locks, like this:
const shelljs_exec = require('shelljs.exec');
const locks = require('locks');

// Need this in global scope - so we are all dealing with the same one.
var mutex = locks.createMutex();

function await_lock(shell_commands) {
    var commands = shell_commands;
    return new Promise(getting_lock => {
        mutex.lock(got_lock_and_execute);
    });

    function got_lock_and_execute() {
        var exec_results = shelljs_exec(commands);
        console.log('just executed shelljs_exec command');
        mutex.unlock();
        return exec_results;
    }
}

async function process() {
    // bunch of shell commands in string
    var command_str = 'command1; command2; command3';
    exec_results = await await_lock(command_str);
    var proc_results = process_results(exec_results);
    console.log(proc_results);
}
If shelljs_exec is synchronous, no need.
If it is not, and it takes a callback, wrap it in a Promise constructor so that it can be awaited. I would suggest properly wrapping the mutex.lock in a promise that is resolved when the lock is acquired. The try/finally is needed to ensure that the mutex is unlocked if shelljs_exec throws an exception.
async function await_lock(shell_commands) {
    // Resolve once the lock has been acquired.
    await new Promise(function(resolve, reject) {
        mutex.lock(resolve);
    });
    try {
        let exec_results = await shelljs_exec(shell_commands);
        return exec_results;
    } finally {
        // Always release the lock, even if shelljs_exec throws.
        mutex.unlock();
    }
}
Untested. But it looks like it should work.
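If you'd rather avoid the locks dependency altogether, the same serialization can be sketched with a plain promise chain (a common pattern, not part of the original answer):
let queue = Promise.resolve();

function await_lock(shell_commands) {
    // Chain each execution onto the previous one, so only one
    // shelljs_exec call can ever be in progress at a time.
    const result = queue.then(() => shelljs_exec(shell_commands));
    // Keep the chain alive even if this execution throws.
    queue = result.catch(() => {});
    return result;
}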
