Is using async and await the crude person's threads?
Many moons ago I learned how to do multithreaded Java code on Android. I recall I had to create threads, start threads, etc.
Now I'm learning Javascript, and I just learned about async and await.
For example:
async function isThisLikeTwoThreads() {
const a = slowFunction();
const b = fastFunction();
console.log(await a, await b);
}
This looks way simpler than what I used to do and is a lot more intuitive.
slowFunction() would start first, then fastFunction() would start, and console.log() would wait until both functions resolved before logging - and slowFunction() and fastFunction() are potentially running at the same time. I expect it's ultimatly on the browser whether or not thesea are seperate threads. But it looks like it walks and talks like crude multithreading. Is it?
Is using async and await the crude person's threads?
No, not at all. It's just syntax sugar (really, really useful sugar) over using promises, which in turn is just a (really, really useful) formalized way to use callbacks. It's useful because you can wait asynchronously (without blocking the JavaScript main thread) for things that are, by nature, asynchronous (like HTTP requests).
If you need to use threads, use web workers, Node.js worker threads, or whatever multi-threading your environment provides. Per specification (nowadays), only a single thread at a time is allowed to work within a given JavaScript "realm" (very loosely: the global environment your code is running in and its associated objects, etc.) and so only a single thread at a time has access to the variables and such within that realm, but threads can cooperate via messaging (including transferring objects between them without making copies) and shared memory.
For example:
async function isThisLikeTwoThreads() {
const a = slowFunction();
const b = fastFunction();
console.log(await a, await b);
}
Here's what that code does when isThisLikeTwoThreads is called:
slowFunction is called synchronously and its return value is assigned to a.
fastFunction is called synchronously and its return value is assigned to b.
When isThisLikeTwoThreads reaches await a, it wraps a in a promise (as though you did Promise.resolve(a)) and returns a new promise (not that same one). Let's call the promise wrapped around a "aPromise" and the promise returned by the function "functionPromise".
Later, when aPromise settles, if it was rejected functionPromise is rejected with the same rejection reason and the following steps are skipped; if it was fulfilled, the next step is done
The code in isThisLikeTwoThreads continues by wrapping b in a promise (bPromise) and waiting for that to settle
When bPromise settles, if it was rejected functionPromise is rejected with the same rejection reason; if it was fulfilled, the code in isThisLikeTwoThreads continues by logging the fulfillment values of aPromise and bPromise and then fulfilling functionPromise with the value undefined
All of the work above was done on the JavaScript thread where the call to isThisLikeTwoThreads was done, but it was spread out across multiple "jobs" (JavaScript terminology; the HTML spec calls them "tasks" and specifies a fair bit of detail for how they're handled on browsers). If slowFunction or fastFunction started an asynchronous process and returned a promise for that, that asynchronous process (for instance, an HTTP call the browser does) may have continued in parallel with the JavaScript thread while the JavaScript thread was doing other stuff or (if it was also JavaScript code on the main thread) may have competed for other work on the JavaScript thread (competed by adding jobs to the job queue and the thread processing them in a loop).
But using promises doesn't add threading. :-)
I suggest you read this to understand that the answer is no: https://developer.mozilla.org/en-US/docs/Web/JavaScript/EventLoop
To summarize, the runtime uses multiple threads for internal operations (network, disk... for a Node.js environment, rendering, indexedDB, network... for a browser environment) but the JavaScript code you wrote and the one you imported from different libraries will always execute in a single thread. Asynchronous operations will trigger callbacks which will be queued and executed one by one.
Basically what happens when executing this function:
async function isThisLikeTwoThreads() {
const a = slowFunction();
const b = fastFunction();
console.log(await a, await b);
}
Execute slowFunction.
Execute fastFunction.
Enqueue the rest of the code (console.log(await a, await b)) when a promise and b promise have resolved. The console.log runs in the same thread after isThisLikeTwoThreads has returned, and after the possible enqueued callbacks have returned. Assuming slowFunction and fastFunction both return promises, this is an equivalent of:
function isThisLikeTwoThreads() {
const a = slowFunction();
const b = fastFunction();
a.then(aResult => b.then(bResult => console.log(aResult, bResult)));
}
Similar but not the same. Javascript will only ever be doing 'one' thing at a time. It is fundamentally single-threaded. That said, it can appear odd as results may appear to be arriving in different orders - exacerbated by functions such as async and await.
For example, even if you're rendering a background process at 70fps there are small gaps in rendering logic available to complete promises or receive event notifications -- it's during these moments promises are completed giving the illusion of multi-threading.
If you REALLY want to lock up a browser try this:
let a = 1;
while (a == 1){
console.log("doing a thing");
}
You will never get javascript to work again and chrome or whatever will murder your script. The reason is when it enters that loop - nothing will touch it again, no events will be rendered and no promises will be delivered - hence, single threading.
If this was a true multithreaded environment you could break that loop from the outside by changing the variable to a new value.
Related
While debugging a set-up to control promise concurrency for some CPU intensive asynchronous tasks, I came across the following behavior that I can not understand.
My route to control the flow was to first create an array of Promises, and after that control the execution of these Promises by explicitly awaiting them. I produced the following snippet that shows the unexpected behavior:
import fs from 'node:fs';
import { setTimeout } from 'node:timers/promises';
async function innerAsync(n: number) {
return fs.promises.readdir('.').then(value => {
console.log(`Executed ${n}:`, value);
return value;
});
}
async function controllerAsync() {
const deferredPromises = new Array<Promise<string[]>>();
for (const n of [1, 2, 3]) {
const defer = innerAsync(n);
deferredPromises.push(defer);
}
const result = deferredPromises[0];
return result;
}
const result = await controllerAsync();
console.log('Resolved:', result);
await setTimeout(1000);
This snippet produces the following result (assuming 2 files are in .):
Executed 1: [ 'test.txt', 'test2.txt' ]
Resolved: [ 'test.txt', 'test2.txt' ]
Executed 2: [ 'test.txt', 'test2.txt' ]
Executed 3: [ 'test.txt', 'test2.txt' ]
Observed
The promises defined at line 15 are all being executed, regardless if they are returned/awaited (2 and 3).
Question
What causes the other, seemingly unused, promises to be executed? How can I make sure in this snippet only the first Promise is executed, where the others stay pending?
Node v19.2
Typescript v4.9.4
StackBlitz
What causes the other, seemingly unused, promises to be executed?
A promise is a tool used to watch something and provide an API to react to the something being complete.
Your innerAsync function calls fs.promises.readdir. Calling fs.promises.readdir makes something happen (reading a directory).
The then callback is called as a reaction when reading that directory is complete.
(It isn't clear why you marked innerAsync as async and then didn't use await inside it.)
If you don't use then or await, then that doesn't change the fact you have called fs.promises.readdir!
How can I make sure in this snippet only the first Promise is executed, where the others stay pending?
Focus on the function which does the thing, not the code which handles it being complete.
Or an analogy:
If you tell Alice to go to the shops and get milk, but don't tell Bob to put the milk in the fridge when Alice gets back, then you shouldn't be surprised that Alice has gone to the shops and come back with milk.
Promises aren't "executed" at all. A promise is a way to observe an asynchronous process that is already in progress. By the time you have a promise, the process it's reporting the result of is already underway.¹ It's the call to fs.readdir that starts the process, not calling then on a promise. (And even if it were, you are calling then on the promise from fs.readdir right away, in your innerAsync function.)
If you want to wait to start the operations, wait to call the methods starting the operations and giving you the promises. I'd show an edited version of your code, but it's not clear to me what it's supposed to do, particularly since controllerAsync only looks at the first element of the array and doesn't wait for anything to complete before returning.
A couple of other notes:
It's almost never useful to combine async/await with explicit calls to .then as in innerAsync. That function would be more clearly written as:
async function innerAsync(n: number) {
const value = await fs.promises.readdir(".");
console.log(`Executed ${n}:`, value);
return value;
}
From an order-of-operations perspective, that does exactly the same thing as the version with the explicit then.
There's no purpose served by declaring controllerAsync as an async function, since it never uses await. The only reason for making a function async is so you can use await within it.
¹ This is true in the normal case (the vast majority of cases), including the case of the promises you get from Node.js' fs/promises methods. There are (or were), unfortunately, a couple of libraries that had functions that returned promises but didn't start the actual work until the promise's then method was called. But those were niche outliers and arguably violate the semantics of promises.
i know this probably has been asked before, but coming from single-threaded language for the past 20 years, i am really struggling to grasp the true nature of node. believe me, i have read a bunch of SO posts, github discussions, and articles about this.
i think i understand that each function has it's own thread type of deal. however, there are some cases where i want my code to be fully synchronous (fire one function after the other).
for example, i made 3 functions which seem to show me how node's async i/o works:
function sleepA() {
setTimeout(() => {
console.log('sleep A')
}, 3000)
}
function sleepB() {
setTimeout(() => {
console.log('sleep B')
}, 2000)
}
function sleepC() {
setTimeout(() => {
console.log('sleep C')
}, 1000)
}
sleepA()
sleepB()
sleepC()
this outputs the following:
sleep C
sleep B
sleep A
ok so that makes sense. node is firing all of the functions at the same time, and whichever ones complete first get logged to the console. kind of like watching horses race. node fires a gun and the functions take off, and the fastest one wins.
so now how would i go about making these synchronous so that the order is always A, B, C, regardless of the setTimeout number?
i've dabbled with things like bluebird and the util node library and am still kind of confused. i think this async logic is insanely powerful despite my inability to truly grasp it. php has destroyed my mind.
how would i go about making these synchronous so that the order is always A, B, C
You've confirmed that what you mean by that is that you don't want to start the timeout for B until A has finished (etc.).
Here in 2021, the easiest way to do that is to use a promise-enabled wrapper for setTimeout, like this one:
const delay = (ms) => new Promise(resolve => setTimeout(resolve, ms));
Then, make sleepA and such async functions:
async function sleepA() {
await delay(3000);
console.log("sleep A");
}
async function sleepB() {
await delay(2000);
console.log("sleep B");
}
async function sleepC() {
await delay(1000);
console.log("sleep C");
}
Then your code can await the promise each of those functions returns, which prevents the code from continuing until the promise is settled:
await sleepA();
await sleepB();
await sleepC();
You can do that inside an async function or (in modern versions of Node.js) at the top level of a module. You can't use await in a synchronous function, since await is asynchronous by nature.
i think i understand that each function has it's own thread type of deal
No, JavaScript functions are just like those of other synchronous languages (despite its reputation, JavaScript itself is overwhelmingly synchronous; it's just used in highly-asynchronous environments and recently got some features to make that easier). More on this below.
ok so that makes sense. node is firing all of the functions at the same time, and whichever ones complete first get logged to the console. kind of like watching horses race. node fires a gun and the functions take off, and the fastest one wins.
Close. The functions run in order, but all that each of them does is set up a timer and return. They don't wait for the timer to fire. Later, when it does fire, it calls the callback you passed into setTimeout.
JavaScript has a main thread and everything runs on that main thread unless you explicitly make it run on a worker thread or in a child process. Node.js is highly asynchronous (as are web browsers, the other place JavaScript is mostly used), but your JavaScript code all runs on one thread by default. There's a loop that processes "jobs." Jobs get queued by events (such as a timer firing), and then processed in order by the event loop. I go into a bit more on it here, and the Node.js documentation covers their loop here (I think it's slightly out of date, but the principal hasn't changed).
I've read many articles on promises and async/await. They all seem to deal with working with APIs that return promises and how to work with those promises. I'm still a little confused.
Lets say I have a function that contains three (3) loops. Hypothetically, each loop does some work that takes five (5) seconds to complete, pushing the results of each iteration into an array for that loop. This work happens synchronously (loop 1 runs, followed by loop 2 and then loop 3).
The function then does something with the results of all three loops before returning. The total hypothetical execution time is around sixteen (16) seconds.
If the loops were put into their own functions, wrapped in a promise and then awaited inside an async function using await Promise.all (before doing something with the results), would they be executed in parallel (because they are in the event loop, not the call stack) and once all three (3) promises resolved, the function would continue? Is this faster than the whole processes being synchronous?
I guess I am confused as to when/why you'd want to create your own promises from synchronous JS?
function loop1() {
return new Promise((resolve, reject) => {
let loop1Counter = 0
const loop1Array = []
while (loop1Counter < 10 **8) {
// Do some work
loop1Array.push(resultOfWork)
}
loop1Counter += 1
resolve(loop1Array)
}
}
async function test() {
const promisesToAwait = [loop1(), loop2(), loop3()]
const results = await Promise.all(promisesToAwait)
// Do some work with results
return something
}
JavaScript execution is single threaded. This means that without using a mechanism such as a WebWorker none of the javascript code will execute in parallel. With that said there are a few reasons for executing synchronous javascript with async code such as promises.
Your code takes a while to execute.
Javascript execution shares the same thread as with the UI in browsers. This means that if your code takes 5 seconds to execute the users browser will be blocked for the duration. You could instead break up the execution into multiple asynchronously executed blocks which will allow the UI to remain responsive.
You are participating in a promise chain from existing async functions.
It is sometimes necessary to intermix async and synch code. Promises allow you to compose both async and synch code into an execution chain without having to have hard coded dependencies.
You are running into stack size limitations.
Depending on how your code and libraries are written sometimes you may have run away stacks which can lead to stack size exceptions or just excessive memory usage. By executing a piece of code asynchronously it gains its own new stack.
You are executing code in a library/framework that expects your code to be executing asynchronously.
You can translate synch code into async code but not the other way around. Therefore some libraries that execute code you have written(as a delegate) may require your code to be async to support the async use cases. Some libraries attempt to be intelligent about this and adapt based on your return type but not all of them are this smart.
I have an async function that runs by a setInterval somewhere in my code. This function updates some cache in regular intervals.
I also have a different, synchronous function which needs to retrieve values - preferably from the cache, yet if it's a cache-miss, then from the data origins
(I realize making IO operations in a synchronous manner is ill-advised, but lets assume this is required in this case).
My problem is I'd like the synchronous function to be able to wait for a value from the async one, but it's not possible to use the await keyword inside a non-async function:
function syncFunc(key) {
if (!(key in cache)) {
await updateCacheForKey([key]);
}
}
async function updateCacheForKey(keys) {
// updates cache for given keys
...
}
Now, this can be easily circumvented by extracting the logic inside updateCacheForKey into a new synchronous function, and calling this new function from both existing functions.
My question is why absolutely prevent this use case in the first place? My only guess is that it has to do with "idiot-proofing", since in most cases, waiting on an async function from a synchronous one is wrong. But am I wrong to think it has its valid use cases at times?
(I think this is possible in C# as well by using Task.Wait, though I might be confusing things here).
My problem is I'd like the synchronous function to be able to wait for a value from the async one...
They can't, because:
JavaScript works on the basis of a "job queue" processed by a thread, where jobs have run-to-completion semantics, and
JavaScript doesn't really have asynchronous functions — even async functions are, under the covers, synchronous functions that return promises (details below)
The job queue (event loop) is conceptually quite simple: When something needs to be done (the initial execution of a script, an event handler callback, etc.), that work is put in the job queue. The thread servicing that job queue picks up the next pending job, runs it to completion, and then goes back for the next one. (It's more complicated than that, of course, but that's sufficient for our purposes.) So when a function gets called, it's called as part of the processing of a job, and jobs are always processed to completion before the next job can run.
Running to completion means that if the job called a function, that function has to return before the job is done. Jobs don't get suspended in the middle while the thread runs off to do something else. This makes code dramatically simpler to write correctly and reason about than if jobs could get suspended in the middle while something else happens. (Again it's more complicated than that, but again that's sufficient for our purposes here.)
So far so good. What's this about not really having asynchronous functions?!
Although we talk about "synchronous" vs. "asynchronous" functions, and even have an async keyword we can apply to functions, a function call is always synchronous in JavaScript. An async function is a function that synchronously returns a promise that the function's logic fulfills or rejects later, queuing callbacks the environment will call later.
Let's assume updateCacheForKey looks something like this:
async function updateCacheForKey(key) {
const value = await fetch(/*...*/);
cache[key] = value;
return value;
}
What that's really doing, under the covers, is (very roughly, not literally) this:
function updateCacheForKey(key) {
return fetch(/*...*/).then(result => {
const value = result;
cache[key] = value;
return value;
});
}
(I go into more detail on this in Chapter 9 of my recent book, JavaScript: The New Toys.)
It asks the browser to start the process of fetching the data, and registers a callback with it (via then) for the browser to call when the data comes back, and then it exits, returning the promise from then. The data isn't fetched yet, but updateCacheForKey is done. It has returned. It did its work synchronously.
Later, when the fetch completes, the browser queues a job to call that promise callback; when that job is picked up from the queue, the callback gets called, and its return value is used to resolve the promise then returned.
My question is why absolutely prevent this use case in the first place?
Let's see what that would look like:
The thread picks up a job and that job involves calling syncFunc, which calls updateCacheForKey. updateCacheForKey asks the browser to fetch the resource and returns its promise. Through the magic of this non-async await, we synchronously wait for that promise to be resolved, holding up the job.
At some point, the browser's network code finishes retrieving the resource and queues a job to call the promise callback we registered in updateCacheForKey.
Nothing happens, ever again. :-)
...because jobs have run-to-completion semantics, and the thread isn't allowed to pick up the next job until it completes the previous one. The thread isn't allowed to suspend the job that called syncFunc in the middle so it can go process the job that would resolve the promise.
That seems arbitrary, but again, the reason for it is that it makes it dramatically easier to write correct code and reason about what the code is doing.
But it does mean that a "synchronous" function can't wait for an "asynchronous" function to complete.
There's a lot of hand-waving of details and such above. If you want to get into the nitty-gritty of it, you can delve into the spec. Pack lots of provisions and warm clothes, you'll be some time. :-)
Jobs and Job Queues
Execution Contexts
Realms and Agents
You can call an async function from within a non-async function via an Immediately Invoked Function Expression (IIFE):
(async () => await updateCacheForKey([key]))();
And as applied to your example:
function syncFunc(key) {
if (!(key in cache)) {
(async () => await updateCacheForKey([key]))();
}
}
async function updateCacheForKey(keys) {
// updates cache for given keys
...
}
This shows how a function can be both sync and async, and how the Immediately Invoked Function Expression idiom is only immediate if the path through the function being called does synchronous things.
function test() {
console.log('Test before');
(async () => await print(0.3))();
console.log('Test between');
(async () => await print(0.7))();
console.log('Test after');
}
async function print(v) {
if(v<0.5)await sleep(5000);
else console.log('No sleep')
console.log(`Printing ${v}`);
}
function sleep(ms : number) {
return new Promise(resolve => setTimeout(resolve, ms));
}
test();
(Based off of Ayyappa's code in a comment to another answer.)
The console.log looks like this:
16:53:00.804 Test before
16:53:00.804 Test between
16:53:00.804 No sleep
16:53:00.805 Printing 0.7
16:53:00.805 Test after
16:53:05.805 Printing 0.3
If you change the 0.7 to 0.4 everything runs async:
17:05:14.185 Test before
17:05:14.186 Test between
17:05:14.186 Test after
17:05:19.186 Printing 0.3
17:05:19.187 Printing 0.4
And if you change both numbers to be over 0.5, everything runs sync, and no promises get created at all:
17:06:56.504 Test before
17:06:56.504 No sleep
17:06:56.505 Printing 0.6
17:06:56.505 Test between
17:06:56.505 No sleep
17:06:56.505 Printing 0.7
17:06:56.505 Test after
This does suggest an answer to the original question, though. You could have a function like this (disclaimer: untested nodeJS code):
const cache = {}
async getData(key, forceSync){
if(cache.hasOwnProperty(key))return cache[key] //Runs sync
if(forceSync){ //Runs sync
const value = fs.readFileSync(`${key}.txt`)
cache[key] = value
return value
}
//If we reach here, the code will run async
const value = await fsPromises.readFile(`${key}.txt`)
cache[key] = value
return value
}
Now, this can be easily circumvented by extracting the logic inside updateCacheForKey into a new synchronous function, and calling this new function from both existing functions.
T.J. Crowder explains the semantics of async functions in JavaScript perfectly. But in my opinion the paragraph above deserves more discussion. Depending on what updateCacheForKey does, it may not be possible to extract its logic into a synchronous function because, in JavaScript, some things can only be done asynchronously. For example there is no way to perform a network request and wait for its response synchronously. If updateCacheForKey relies on a server response, it can't be turned into a synchronous function.
It was true even before the advent of asynchronous functions and promises: XMLHttpRequest, for instance, gets a callback and calls it when the response is ready. There's no way of obtaining a response synchronously. Promises are just an abstraction layer on callbacks and asynchronous functions are just an abstraction layer on promises.
Now this could have been done differently. And it is in some environments:
In PHP, pretty much everything is synchronous. You send a request with curl and your script blocks until it gets a response.
Node.js has synchronous versions of its file system calls (readFileSync, writeFileSync etc.) which block until the operation completes.
Even plain old browser JavaScript has alert and friends (confirm, prompt) which block until the user dismisses the modal dialog.
This demonstrates that the designers of the JavaScript language could have opted for synchronous versions of XMLHttpRequest, fetch etc. Why didn't they?
[W]hy absolutely prevent this use case in the first place?
This is a design decision.
alert, for instance, prevents the user from interacting with the rest of the page because JavaScript is single threaded and the one and only thread of execution is blocked until the alert call completes. Therefore there's no way to execute event handlers, which means no way to become interactive. If there was a syncFetch function, it would block the user from doing anything until the network request completes, which can potentially take minutes, even hours or days.
This is clearly against the nature of the interactive environment we call the "web". alert was a mistake in retrospect and it should not be used except under very few circumstances.
The only alternative would be to allow multithreading in JavaScript which is notoriously difficult to write correct programs with. Are you having trouble wrapping your head around asynchronous functions? Try semaphores!
It is possible to add a good old .then() to the async function and it will work.
Should consider though instead of doing that, changing your current regular function to async one, and all the way up the call stack until returned promise is not needed, i.e. there's no work to be done with the value returned from async function. In which case it actually CAN be called from a synchronous one.
Consider small helper functions, some of which are obvious candidates for async / promise (as I think I understand them):
exports.function processFile(file){
return new Promise(resolve => {
// super long processing of file
resolve(file);
});
}
While this:
exports.function addOne(number){
return new Promise(resolve => {
resolve(number + 1);
});
}
Seems overkill.
What is the rule by which one determines whether they should promise-fy their functions?
I generally use a promise if I need to execute multiple async functions in a certain order. For example, if I need to make three requests and wait for them all to come back before I can combine the data and process them, that's a GREAT use case for promises. In your case, let's say you have a file, and you need to wait until the file and metadata from a service are both loaded, you could put them in a promise.
If you're looking to optimize operations on long running processes, I'd look more into web workers than using promises. They operate outside the UI thread and use messages to pass their progress back into the UI thread. It's event based, but no need for a promise. The only caveat there is that you can't perform actions on the DOM inside a web worker.
More on web workers here: https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers
What is the rule by which one determines whether they should promise-fy their functions?
It's quite simple actually:
When the function does something asynchronous, it should return a promise for the result
When the function does only synchronous things, it should not return a promise