Hi, my question above is a little vague, so I'll try to make it clearer.
I have code of the form:
function main () {
  async function recursive () {
    var a = "Hello World";
    var b = "Goodbye World";
    recursive();
  }
  recursive();
}
I am running into an issue where I am running out of heap memory.
Assuming my program behaves like the code above, with a and b declared inside the recursive function, my question is: are those variables destroyed upon the call to recursive within recursive, or do they persist until there are no more recursive calls and main reaches its end (assuming I keep main running long enough for that to happen)?
I'm worried they stay alive in the heap, and since my real program stores large strings in those variables, I'm worried that this is why I'm running out of heap.
JavaScript does not (yet?) have tail call optimization for recursion, so for sure you'll eventually fill up your call stack. The reason your heap is filling up is that recursive is defined as an async function and never waits for the Promise objects it allocates to resolve. This fills up your heap because the garbage collector doesn't get a chance to collect them.
In short, recursion doesn't use up the heap unless you're allocating things in the heap within your functions.
Expansion of what's happening:
main
  new Promise(() => {
    new Promise(() => {
      new Promise(() => {
        new Promise(() => {
          new Promise(() => { // etc.
You say you're not awaiting the Promises because you want things to run in parallel. This is a dangerous approach for a few reasons. Promises should always be awaited in case they throw, and of course you want to make sure you're allocating predictable amounts of memory. If this recursive function means you're creating an unpredictable number of tasks, you should consider a task queue pattern:
const tasks = [];

async function recursive() {
  // ...
  tasks.push(workItem);
  // ...
}

async function processTasks() {
  while (tasks.length) {
    const workItem = tasks.shift(); // shift() takes from the front; unshift() would *add* to it
    // Process work item. May provoke calls to `recursive`
    // ...do awaits here for the async parts
  }
}

async function processWork(maxParallelism) {
  const workers = [];
  for (let i = 0; i < maxParallelism; i++) {
    workers.push(processTasks());
  }
  await Promise.all(workers);
}
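To make the pattern concrete, here is a small self-contained variant (all names invented; the awaited Promise.resolve() stands in for real async work such as a network call):

```javascript
const queue = [];
let processed = 0;

async function drainQueue() {
  while (queue.length) {
    const n = queue.shift();
    await Promise.resolve(); // stand-in for real async work
    processed++;
    if (n > 0) queue.push(n - 1); // processing an item may enqueue follow-up work
  }
}

async function runWorkers(maxParallelism) {
  const workers = [];
  for (let i = 0; i < maxParallelism; i++) workers.push(drainQueue());
  await Promise.all(workers);
}
```

Here `queue.push(3)` followed by `await runWorkers(4)` processes four items in total, since each item spawns one follow-up until it reaches zero, and the amount of in-flight work stays bounded by maxParallelism.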
Finally, in case it needs to be said, async functions don't let you do parallel processing; they just provide a way to more conveniently write Promises and do sequential and/or parallel waits for async events. JavaScript remains a single-threaded execution engine unless you use things like web workers. So using async functions to try to parallelize synchronous tasks won't do anything for you.
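A tiny illustration of that point (contrived names): when two async functions contain no await, their bodies run to completion one after the other, with no interleaving.

```javascript
// Contrived demo: async functions with purely synchronous bodies do not
// interleave; each body runs start-to-finish before the next one begins.
const order = [];

async function first() {
  order.push('first-start');
  order.push('first-end'); // nothing yields in between
}

async function second() {
  order.push('second-start');
  order.push('second-end');
}

async function demo() {
  await Promise.all([first(), second()]);
  return order.join(',');
}
```

demo() resolves with 'first-start,first-end,second-start,second-end': Promise.all ran nothing in parallel, it only waited.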
Related
I have an asynchronous function that runs in a loop. It should run indefinitely unless one of the inner async functions rejects or the application is terminated. It should not impact the rest of the application.
import { setTimeout } from 'timers/promises';

let executing = true;

const example = async () => {
  while (executing) {
    await setTimeout(1000);
    await someOtherFunc();
    await yetAnotherFunc();
  }
}
In my code I execute it without awaiting (since that blocks further execution). Later, I stop the loop by changing the value.
const main = async () => {
  // don't await
  example();
  // ... do stuff 'in parallel' to the asynchronous loop

  // possibly weeks later, elsewhere in the code
  executing = false;
}

main().then();
(Emphasis that the examples are contrived for Stack Overflow.)
It works, but it feels like a bit of a hack. Testing the execution/stop logic is tricky because the loop runs without the rest of the test logic waiting for it. I need to mock the loop's contents (the various async functions) for testing, but because the test continues to execute without awaiting, it later resets the mocks and the loop body then tries to make unmocked calls.
What is the idiomatic way of achieving this in javascript / nodejs? I feel like a callback would allow me to keep using promises but I can't quite figure out how to achieve it.
// don't await
There's the problem. Your main function definitely should wait until the loop has completed before returning (fulfilling its promise) to its caller.
You can achieve that by
async function main() {
  const promise = example();
  // ... do stuff
  executing = false;
  await promise;
}
or better (with proper error handling)
async function stuff() {
  // ... do stuff
  executing = false;
}

async function main() {
  await Promise.all([example(), stuff()]);
}
Btw you probably shouldn't have a global executing variable. Better have one per call to example(), so that you can stop each running loop individually. Pass a mutable token object as an argument for that.
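A sketch of that token idea (names are illustrative, and the interval is shortened for the demo):

```javascript
// Per-call cancellation token instead of a global `executing` flag.
import { setTimeout as delay } from 'timers/promises';

async function example(token, intervalMs = 1000) {
  let iterations = 0;
  while (token.executing) {
    await delay(intervalMs);
    iterations++; // stand-in for someOtherFunc() / yetAnotherFunc()
  }
  return iterations;
}

async function main() {
  const token = { executing: true };
  const loop = example(token, 10); // don't await yet
  await delay(35);                 // ... do stuff 'in parallel' to the loop
  token.executing = false;         // stops just this one loop
  return loop;                     // main settles only after the loop has exited
}
```

Because the token is an argument rather than a module-level variable, each call to example() can be stopped independently, and tests can hold their own token.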
I'm trying to do something that involves a global context that knows about what function is running at the moment.
This is easy with single-threaded synchronous functions, they start off running and finish running when they return.
But async functions can pop to the bottom of the program and climb back up multiple times before completing.
let currentlyRunningStack: string[] = [];

function run(id: string, cb: () => any) {
  currentlyRunningStack.push(id);
  cb();
  currentlyRunningStack.shift();
}

// works with synchronous stuff
run("foo", () => {
  run("bar", () => console.log("bar should be running"));
  console.log("now foo is running");
});

// can it work with asynchronous?
run("qux", async () => {
  // async functions never run immediately...
  await somePromise();
  // and they start and stop a lot
});
Is it possible to keep track of whether or not an asynchronous function is currently running or currently waiting on something?
EDIT: there appears to be something similar called Zone.js. Used by Angular I guess.
Something like async_hooks for the browser?
EDIT: per #Bergi's suggestion, the word "stack" has been updated to "program" for clarification
It is possible, and someone has done it (the Angular developers, no less), but it costs a whopping 5.41 MB!
https://www.npmjs.com/package/zone.js
This was ripped from this very similar question: Something like async_hooks for the browser?
In order to distinguish from that question a bit, I'll answer the core query here:
Yes, you can tell when an async function stops and starts. You can see a lot of what's required in the project code
In particular, this file appears to handle polyfilling promises, though I'd need more time to verify that this is where the magic happens. Perhaps with some effort I can distill this into something simpler to understand that doesn't require 5.41 MB to achieve.
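As an aside: in Node (rather than the browser), the built-in async_hooks module ships AsyncLocalStorage, which can answer "which run() call am I inside" across awaits without a 5.41 MB dependency. A minimal sketch, using the question's run/id naming:

```javascript
// Node-only: AsyncLocalStorage propagates a store value across awaits,
// so the "currently running" id survives asynchronous gaps.
import { AsyncLocalStorage } from 'async_hooks';

const als = new AsyncLocalStorage();

function run(id, cb) {
  // cb may be sync or async; the store is preserved across awaits inside it
  return als.run(id, cb);
}

const currentlyRunning = () => als.getStore();
```

Inside run("qux", async () => { ... }), currentlyRunning() returns "qux" both before and after an await; outside any run() it returns undefined.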
Yes, it is possible.
Using a running context, like a mutex (in the sense described by Edsger W. Dijkstra), stack queues, Promise states, and a Promise.all executor, you can keep track of whether there is a running function in the program. You will have to implement a garbage collector of sorts, keeping the mutex list clean, and you will need a timer (setTimeout) to verify whether the context is clean. When the context is clean, you call a callback-like function to end your program, such as process.exit(0). By context we refer to the entire program's order of execution.
Transforming the function into a promise with a .then callback that pops/cleans the mutex stack after the function body executes, plus a try/catch block to throw, handle, or log errors, adds more control over the whole program.
The introduction of setTimeout, combined with the mutex/lock, produces a state machine, and it introduces a memory leak: you need to keep track of the timer so you can clear the memory allocated by each function. This is done by nested try/catch. Using setInterval for this instead introduces a memory leak that will cause a buffer overflow.
The timer handles the end of the program, and that's it. You can keep track of whether a function is running or not, and have every registered function run in a synchronous manner, using await with a mutex.
Running the program/interpreter in a synchronous way avoids the memory leaks and race conditions, and works well. Some example code is below.
const run = async (fn) => {
  const functionContextExecutionStackLength = functionExecutionStackLength + 1
  const checkMutexStackQueue = () => {
    if (mutexStack[0] instanceof Promise) {
      if (mutexStack[0].state == "fulfilled") {
        mutexStack = mutexStack.slice(1, mutexStack.length)
        runner.clear()
        runner()
      }
    }
    if (mutexStack.length == 0) process.exit(0)
  }

  // clear function execution context
  const stackCleaner = setTimeout(1000, (err, callback) => {
    if (functionContextExecutionStackLength == 10) {
      runner.clear()
    }
  })

  // avoid memory leak on function execution context
  if (functionContextExecutionStackLength == 10) {
    stackCleaner.clear()
    stackCleaner()
  }

  // the runner
  const runner = setTimeout(1, async (err, callback) => {
    // run synchronously
    const append = (fn) => mutexStack.push(util.promisify(fn)
      .then(appendFunctionExecutionContextTimes)
      .then(checkMutexStackQueue))

    // transform into promise with callback
    const fn1 = () => util.promisify(fn)
    const fn2 = () => util.promisify(append)
    const orderOfExecution = [fn1, fn2, fn]

    // a for await version could be considered
    for (let i = 0; i < orderOfExecution.length; i++) {
      try {
        orderOfExecution[i]()
      } catch (err) {
        console.error(err)
        throw err
      }
    }
  })
}

run(fn)
In the code above we take the asynchronous characteristics of JavaScript very seriously, avoiding them when necessary and using them when needed.
Obs:
Some variables were omitted, but this is demo code.
Sometimes you will see variable context switching and calls before definition; this is due to ES modules being read in full and interpreted later.
TLDR: must I use async and await through a complicated call stack if most functions are not actually async? Are there alternative programming patterns?
Context
This question is probably more about design patterns and overall software architecture than a specific syntax issue. I am writing an algorithm in node.js that is fairly involved. The program flow involves an initial async call to fetch some data, and then moves on to a series of synchronous calculation steps based on the data. The calculation steps are iterative, and as they iterate, they generate results. But occasionally, if certain conditions are met by the calculations, more data will need to be fetched. Here is a simplified version in diagram form:
The calcStep loop runs thousands of times synchronously, pushing to the results. But occasionally it kicks back to getData, and the program must wait for more data to come in before continuing on to the calcStep loop again.
In code
A boiled-down version of the above might look like this in JS code:
let data;

async function init() {
  data = await getData();
  processData();
  calcStep1();
}

function calcStep1() {
  // do some calculations
  calcStep2();
}

function calcStep2() {
  // do more calculations
  calcStep3();
}

function calcStep3() {
  // do more calculations
  pushToResults();
  if (some_condition) {
    getData(); // <------ question is here
  }
  if (stop_condition) {
    finish();
  } else {
    calcStep1();
  }
}
Where pushToResults and finish are also simple synchronous functions. I write the calcStep functions here as separate functions because in the real code they are actually class methods, from classes defined based on separation of concerns.
The problem
The obvious problem arises that if some_condition is met and I need to wait for more data before continuing the calcStep loop, I need to use the await keyword before calling getData in calcStep3, which means calcStep3 must be declared async, and we must await it as well in calcStep2, and so on all the way up the chain; even synchronous functions must be labeled async and awaited.
In this simplified example, it would not be so offensive to do that. But in reality, my algorithm is far more complicated, with a much deeper call stack involving many class methods, iterations, etc. Is there a better way to manage awaiting functions in this type of scenario? Other tools I can use, like generators, or event emitters? I'm open to simple solutions or paradigm shifts.
If you don't want to make the function async and propagate this up the chain, use .then(). You'll need to duplicate the following code inside .then(); you can simplify this by putting it in its own function.
function maybeRepeat() {
  if (stop_condition) {
    finish();
  } else {
    calcStep1();
  }
}

function calcStep3() {
  // do more calculations
  pushToResults();
  if (some_condition) {
    getData().then(maybeRepeat);
  } else {
    maybeRepeat();
  }
}
This is the function that I have:
let counter = 0;
let dbConnected = false;

async function notASingleton(params) {
  if (!dbConnected) {
    await new Promise(resolve => {
      if (Math.random() > 0.75) throw new Error();
      setTimeout((params) => {
        dbConnected = true; // assume we use params to connect to DB
        resolve();
      }, 1000);
    });
    return counter++;
  }
};
// in another module that imports notASingleton
Promise.all([notASingleton(params), notASingleton(params), notASingleton(params), notASingleton(params)]);
or
// in another module that imports notASingleton
notASingleton(params);
notASingleton(params);
notASingleton(params);
notASingleton(params);
The problem is that the notASingleton promises might be executed concurrently, and assuming they run in parallel, the execution context for all of them will be dbConnected = false.
Note: I'm aware that we could introduce a new variable e.g. initiatingDbConnection and instead of checking for !dbConnected check for !initiatingDbConnection; however, as long as concurrently means that the context of the promises will be the same inside Promise.all, that will not change anything.
The pattern can be properly implemented in e.g. Java by utilizing the contracts of JVM for creating a class: https://stackoverflow.com/a/16106598/12144949
However, even that Java implementation cannot be used for my use case where I need to pass a variable: "The client application can’t pass any argument, so we can’t reuse it. For example, having a generic singleton class for database connection where client application supplies database server properties."
https://www.journaldev.com/171/thread-safety-in-java-singleton-classes-with-example-code
Note 2: Another possibly related issue: https://eslint.org/docs/rules/require-atomic-updates#rule-details
"On some computers, they may be executed in parallel, or in some sense concurrently, while on others they may be executed serially."
That MDN description is rubbish. I'll remove it. Promise.all is not responsible for executing anything, and promises cannot be "executed" anyway. All it does is to wait for its arguments, and you're the one who are creating those promises in the array. They would run (and concurrently open multiple connections) even if you omitted Promise.all and simply called notASingleton() multiple times.
the execution context for all of them will be dbConnected = false
Yes, but only because your dbConnected = true; is in a timeout and you are calling notASingleton() again before that happens. Not because Promise.all does anything special.
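One standard pattern (not shown in the question, so treat the names as illustrative) is to memoize the connection promise itself instead of a boolean flag; overlapping callers then share the same pending promise, and the connect logic runs exactly once:

```javascript
// Memoize the *promise*, not a flag. The first caller creates the promise;
// every later caller, even one arriving before connect finishes, reuses it.
let connectionPromise = null;
let connectCalls = 0; // instrumentation for the demo only

function connect(params) {
  connectCalls++;
  // stand-in for a real DB connection that takes a while
  return new Promise(resolve => setTimeout(() => resolve(params), 10));
}

function getConnection(params) {
  if (!connectionPromise) {
    connectionPromise = connect(params);
  }
  return connectionPromise;
}
```

This sidesteps the require-atomic-updates hazard: the check and the assignment of connectionPromise happen synchronously, with no await between them, so no other caller can observe the in-between state.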
This is a post that might come across as quite conceptual, since I first start with a lot of pseudo code. - At the end you'll see the use case for this problem, though a solution would be a "tool I can add to my tool-belt of useful programming techniques".
The problem
Sometimes one might need to create multiple promises and do something after all of them have settled. Or one might create promises based on the results of previous promises. The analogy can be made to creating an array of values instead of a single value.
There are two basic cases to consider: one where the number of promises is independent of the results of said promises, and one where it is dependent. Simple pseudo code of what "could" be done:
for (let i = 0; i < 10; i++) {
  promise(...)
    .then(...)
    .catch(...);
}.then(new function(result) {
  //All promises finished execute this code now.
})
This basically creates n (10) promises, and the final code would be executed after all promises are done. Of course the syntax doesn't work in JavaScript, but it shows the idea. This problem is relatively easy, and could be called completely asynchronous.
Now the second problem is like:
while (continueFn()) {
  promise(...)
    .then(.. potentially changing outcome of continueFn ..)
    .catch(.. potentially changing outcome of continueFn ..)
}.then(new function(result) {
  //All promises finished execute this code now.
})
This is much more complex, as one can't just start all promises and then wait for them to finish: in the end you'll have to go "promise-by-promise". This second case is what I wish to figure out (if one can do the second case you can also do the first).
The (bad) solution
I do have a working "solution". This is not a good solution, as can probably quickly be seen; after the code I'll talk about why I dislike this method. Basically, instead of looping, it uses recursion: the "promise" (or a wrapper around a promise, which is itself a promise) calls itself when it's fulfilled. In code:
function promiseFunction(state_obj) {
  return new Promise((resolve, reject) => {
    //initialize fields here
    let InnerFn = (stateObj) => {
      if (!stateObj.checkContinue()) {
        return resolve(state_obj);
      }
      ActualPromise(...)
        .then(function(result) {
          newState = stateObj.cloneMe(); //we'll have to clone to prevent asynchronous write problems
          newState.changeStateBasedOnResult(result);
          return InnerFn(newState);
        })
        .catch(function(err) {
          return reject(err); //forward error handling (must be done manually?)
        });
    }
    InnerFn(initialState); //kickstart
  });
}
Important to note is that the stateObj should not change during its lifetime, but this can be really easy to ensure. In my real problem (which I'll explain at the end) the stateObj was simply a counter (number), and the if (!stateObj.checkContinue()) was simply if (counter < maxNumber).
Now this solution is really bad; It is ugly, complicated, error prone and finally impossible to scale.
Ugly because the actual business logic is buried in a mess of code. It doesn't show "on the can" that is actually simply doing what the while loop above does.
Complicated because the flow of execution is impossible to follow. First of all recursive code is never "easy" to follow, but more importantly you also have to keep in mind thread safety with the state-object. (Which might also have a reference to another object to, say, store a list of results for later processing).
It's error prone since there is more redundancy than strictly necessary; You'll have to explicitly forward the rejection. Debugging tools such as a stack trace also quickly become really hard to look through.
The scalability is also a problem at some points: this is a recursive function, so at one point it will create a stackoverflow/encounter maximum recursive depth. Normally one could either optimize by tail recursion or, more common, create a virtual stack (on the heap) and transform the function to a loop using the manual stack. In this case, however, one can't change the recursive calls to a loop-with-manual-stack; simply because of how promise syntax works.
The alternative (bad) solution
A colleague suggested an alternative approach to this problem, something that initially looked much less problematic, but which I ultimately discarded since it goes against everything promises are meant to do.
What he suggested was basically looping over the promises as above, but instead of letting the loop continue, there would be a variable finished and an inner loop that constantly checks this variable. In code it would be something like:
function promiseFunction(state_obj) {
  return new Promise((resolve, reject) => {
    while (stateObj.checkContinue()) {
      let finished = false;
      let err = false;
      let res = null;
      actualPromise(...)
        .then(function(result) {
          res = result;
          finished = true;
        })
        .catch(function(error) { // note: must not shadow the `err` flag above
          res = error;
          err = true;
          finished = true;
        });
      while (!finished) {
        sleep(100); //to not burn our cpu
      }
      if (err) {
        return reject(res);
      }
      stateObj.changeStateBasedOnResult(res);
    }
  });
}
While this is less complicated, since it's now easy to follow the flow of execution, it has problems of its own: not least that it's unclear when this function will end, and it's really bad for performance.
Conclusion
Well, there isn't much to conclude yet; I'd really like something as simple as the first pseudo code above. Maybe another way of looking at things, so that one doesn't have the trouble of deeply recursive functions.
So how would you rewrite a promise that is part of a loop?
The real problem used as motivation
Now this problem has its roots in a real thing I had to create. While the problem is now solved (by applying the recursive method above), it might be interesting to know what spawned it. The real question, however, isn't about this specific case, but rather about how to do this in general with any promise.
In a sails app I had to check a database, which had orders with order-ids. I had to find the first N "non-existing order-ids". My solution was to get the "first" M products from the database and find the missing numbers within them. Then, if the number of missing numbers was less than N, get the next batch of M products.
Now, to get an item from a database one uses a promise (or callback), so the code won't wait for the database data to return. So I'm basically at the "second problem":
function GenerateEmptySpots(maxNum) {
  return new Promise((resolve, reject) => {
    //initialize fields
    let InnerFn = (counter, r) => {
      if (r <= 0) { // found enough missing spots
        return resolve(true);
      }
      let query = {
        orderNr: {'>=': counter, '<': (counter + maxNum)}
      };
      Order.find({
          where: query,
          sort: 'orderNr ASC'})
        .then(function(result) {
          let n = findNumberOfMissingSpotsAndStoreThemInThis();
          return InnerFn(counter + maxNum, r - n);
        }.bind(this))
        .catch(function(err) {
          return reject(err);
        });
    }
    InnerFn(0, maxNum); //kickstart
  });
}
EDIT:
Small post scriptum: the sleep function in the alternative is just from another library which provided a non-blocking sleep (not that it matters).
Also, I should've indicated I'm using ES2015.
The alternative (bad) solution
…doesn't actually work, as there is no sleep function in JavaScript. (If you have a runtime library which provides a non-blocking-sleep, you could just have used a while loop and non-blocking-wait for the promise inside it using the same style).
The bad solution is ugly, complicated, error prone and finally impossible to scale.
Nope. The recursive approach is indeed the proper way to do this.
Ugly because the actual business logic is buried in a mess of code. And error-prone as you'll have to explicitly forward the rejection.
This is just caused by the Promise constructor antipattern! Avoid it.
Complicated because the flow of execution is impossible to follow. Recursive code is never "easy" to follow
I'll challenge that statement. You just have to get accustomed to it.
You also have to keep in mind thread safety with the state-object.
No. There is no multi-threading and shared memory access in JavaScript. If you worry about concurrency where other stuff affects your state object while the loop runs, that will be a problem with any approach.
The scalability is also a problem at some points: this is a recursive function, so at one point it will create a stackoverflow
No. It's asynchronous! The callback will run on a new stack, it's not actually called recursively during the function call and does not carry those stack frames around. The asynchronous event loop already provides the trampoline to make this tail-recursive.
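A quick way to convince yourself of this (illustrative, not from the answer): asynchronous "recursion" deep enough to blow any call stack completes without a RangeError, because each continuation starts on a fresh stack from the microtask queue.

```javascript
// Each level awaits before "recursing", so the call stack unwinds completely
// between levels; only the pending promise chain lives on the heap.
async function countDown(n) {
  if (n === 0) return 'done';
  await Promise.resolve(); // yield; the rest runs on a new stack
  return countDown(n - 1);
}
```

A synchronous version of this function overflows the stack long before 100000 levels; the async version does not.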
The good solution
function promiseFunction(state) {
  const initialState = state.cloneMe(); // clone once for this run
  // initialize fields here
  return (function recurse(localState) {
    if (!localState.checkContinue())
      return Promise.resolve(localState);
    else
      return actualPromise(…).then(result =>
        recurse(localState.changeStateBasedOnResult(result))
      );
  }(initialState)); // kickstart
}
The modern solution
You know, async/await is available in every environment that implemented ES6, as all of them also implemented ES8 now!
async function promiseFunction(state) {
  let localState = state.cloneMe(); // clone once for this run
  // initialize fields here
  while (localState.checkContinue()) {
    const result = await actualPromise(…);
    localState = localState.changeStateBasedOnResult(result);
  }
  return localState;
}
Let’s begin with the simple case: You have N promises that all do some work, and you want to do something when all the promises have finished. There’s actually a built-in way to do exactly that: Promise.all. With that, the code will look like this:
let promises = [];
for (let i = 0; i < 10; i++) {
  promises.push(doSomethingAsynchronously());
}
Promise.all(promises).then(arrayOfResults => {
  // all promises finished
});
Now, the second case is a situation you encounter all the time when you want to continue doing something asynchronously depending on the previous asynchronous result. A common example (that's a bit less abstract) would be to simply fetch pages until you hit the end.
With modern JavaScript, there’s luckily a way to write this in a really readable way: Using asynchronous functions and await:
async function readFromAllPages() {
  let shouldContinue = true;
  let pageId = 0;
  let items = [];

  while (shouldContinue) {
    // fetch the next page
    let result = await fetchSinglePage(pageId);

    // store items
    items.push.apply(items, result.items);

    // evaluate whether we want to continue
    if (!result.items.length) {
      shouldContinue = false;
    }
    pageId++;
  }
  return items;
}

readFromAllPages().then(allItems => {
  // items have been read from all pages
});
Without async/await, this will look a bit more complicated, since you need to manage all this yourself. But unless you try to make it super generic, it shouldn’t look that bad. For example, the paging one could look like this:
function readFromAllPages() {
  let items = [];

  function readNextPage(pageId) {
    return fetchSinglePage(pageId).then(result => {
      items.push.apply(items, result.items);
      if (!result.items.length) {
        return Promise.resolve(null);
      }
      return readNextPage(pageId + 1);
    });
  }

  return readNextPage(0).then(() => items);
}
First of all recursive code is never "easy" to follow
I think the code is fine to read. As I’ve said: Unless you try to make it super generic, you can really keep it simple. And naming also helps a lot.
but more importantly you also have to keep in mind thread safety with the state-object
No, JavaScript is single-threaded. You're doing things asynchronously, but that does not necessarily mean that things are happening at the same time. JavaScript uses an event loop to work off asynchronous processes, where only one code block runs at any single time.
The scalability is also a problem at some points: this is a recursive function, so at one point it will create a stackoverflow/encounter maximum recursive depth.
Also no. This is recursive in the sense that the function references itself. But it will not call itself directly. Instead it will register itself as a callback when an asynchronous process finishes. So the current execution of the function will finish first, then at some point the asynchronous process finishes, and then the callback will eventually run. These are (at least) three separate steps from the event loop, which all run independently from one another, so you do not have a problem with recursion depth here.
The crux of the matter seems to be that "the actual business logic is buried in a mess of code".
Yes it is ... in both solutions.
Things can be separated out by:
having an asyncRecursor function that simply knows how to (asynchronously) recurse.
allowing the recursor's caller(s) to specify the business logic (the terminal test to apply, and the work to be performed).
It is also better to allow the caller(s) to be responsible for cloning the original object rather than having asyncRecursor() assume cloning is always necessary. The caller really needs to be in charge in this regard.
function asyncRecursor(subject, testFn, workFn) {
  // asyncRecursor orchestrates the recursion
  if (testFn(subject)) {
    return Promise.resolve(workFn(subject)).then(result => asyncRecursor(result, testFn, workFn));
    // the `Promise.resolve()` wrapper safeguards against workFn() not being thenable.
  } else {
    return Promise.resolve(subject);
    // the `Promise.resolve()` wrapper safeguards against `testFn(subject)` failing at the first call of asyncRecursor().
  }
}
Now you can write your caller as follows :
// example caller
function someBusinessOrientedCallerFn(state_obj) {
  // ... preamble ...
  return asyncRecursor(
    state_obj, // or state_obj.cloneMe() if necessary
    (obj) => obj.checkContinue(), // testFn
    (obj) => somethingAsync(...).then((result) => { // workFn
      obj.changeStateBasedOnResult(result);
      return obj; // return `obj` or anything you like, providing it makes a valid parameter to be passed to `testFn()` and `workFn()` at the next recursion.
    })
  );
}
You could theoretically incorporate your terminal test inside the workFn but keeping them separate will help enforce the discipline, in writers of the business-logic, to remember to include a test. Otherwise they will consider it optional and sure as you like, they will leave it out!
Sorry, this doesn't use Promises, but sometimes abstractions just get in the way.
This example, which builds from #poke's answer, is short and easy to comprehend.
function readFromAllPages(done = function () {}, pageId = 0, items = []) {
  fetchSinglePage(pageId, page => {
    if (page.items.length) {
      readFromAllPages(done, pageId + 1, items.concat(page.items));
    } else {
      done(items);
    }
  });
}

readFromAllPages(allItems => {
  // items have been read from all pages
});
This has only a single depth of nested functions. In general, you can solve the nested callback problem without resorting to a subsystem that manages things for you.
If we drop the parameter defaults and change the arrow functions, we get code that runs in legacy ES3 browsers.
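For instance, such a legacy-friendly version might look like the following sketch, with fetchSinglePage stubbed out purely for illustration (three pages of one item each, then an empty page):

```javascript
// Stub standing in for a real paged data source; only here so the
// example is self-contained.
function fetchSinglePage(pageId, cb) {
  var pages = [[1], [2], [3], []];
  setTimeout(function () { cb({ items: pages[pageId] || [] }); }, 0);
}

// Same callback-recursion as above, but with no default parameters and
// no arrow functions, so it runs in ES3-era engines.
function readFromAllPages(done, pageId, items) {
  done = done || function () {};
  pageId = pageId || 0;
  items = items || [];
  fetchSinglePage(pageId, function (page) {
    if (page.items.length) {
      readFromAllPages(done, pageId + 1, items.concat(page.items));
    } else {
      done(items);
    }
  });
}
```

Calling readFromAllPages(function (all) { ... }) collects [1, 2, 3] from the stub before invoking the callback.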