Why couldn't popular JavaScript runtimes handle synchronous-looking asynchronous script? - javascript

As cowboy says down in the comments here, we all want to "write [non-blocking JavaScript] asynchronous code in a style similar to this:
try
{
var foo = getSomething(); // async call that would normally block
var bar = doSomething(foo);
console.log(bar);
}
catch (error)
{
console.error(error);
}
"
So people have come up solutions to this problem like
callback libraries (eg async)
promises
event patterns
streamline
domains and
generators.
But none of these lead to code as simple and easy to understand as the sync-style code above.
So why isn't possible for javascript compilers/interpreters to just NOT block on the statements we currently know as "blocking"? So why isn't possible for javascript compilers/interpreters to handle the sync syntax above AS IF we'd written it in an async style?"
For example, upon processing getSomething() above, the compiler/interpreter could just say "this statement is a call to [file system/network resource/...], so I'll make a note to listen to responses from that call and in the meantime get on with whatever's in my event loop". When the call returns, execution can proceed to doSomething().
You would still maintain all of the basic features of popular JavaScript runtime environments
single threaded
event loop
blocking operations (I/O, network, wait timers) handled "asynchronously"
This would be simply a tweak to the syntax, that would allow the interpreter to pause execution on any given bit of code whenever IT DETECTS an async operation, and instead of needing callbacks, code just continues from the line after the async call when the call returns.
As Jeremy says
there is nothing in the JavaScript runtime that will preemptively
pause the execution of a given task, permit some other code to execute
for a while, and then resume the original task
Why not? (As in, "why couldn't there be?"... I'm not interested in a history lesson)
Why does a developer have to care about whether a statement is blocking or not? Computers are for automating stuff that humans are bad at (eg writing non-blocking code).
You could perhaps implement it with
a statement like "use noblock"; (a bit like "use strict";) to turn this "mode" on for a whole page of code. EDIT: "use noblock"; was a bad choice, and misled some answerers that I was trying to change the nature of common JavaScript runtimes altogether. Something like 'use syncsyntax'; might better describe it.
some kind of parallel(fn, fn, ...); statement allowing you to run things in parallel while in "use syncsyntax"; mode - eg to allow multiple async activities to be kicked off at once
EDIT: a simple sync-style syntax wait(), which would be used instead of setTimeout() in "use syncsyntax"; mode
EDIT:
As an example, instead of writing (standard callback version)
function fnInsertDB(myString, fnNextTask) {
fnDAL('insert into tbl (field) values (' + myString + ');', function(recordID) {
fnNextTask(recordID);
});
}
fnInsertDB('stuff', fnDeleteDB);
You could write
'use syncsyntax';
function fnInsertDB(myString) {
return fnDAL('insert into tbl (field) values (' + myString ');'); // returns recordID
}
var recordID = fnInsertDB('stuff');
fnDeleteDB(recordID);
The syncsyntax version would process exactly the same way as the standard version, but it's much easier to understand what the programmer intended (as long as you understand that syncsyntax pauses execution on this code as discussed).

So why isn't possible for javascript compilers/interpreters to just NOT block on the statements we currently know as "blocking"?
Because of concurrency control. We want them to block, so that (in JavaScript's single-threaded nature) we are safe from race conditions that alter the state of our function while we still are executing it. We must not have an interpreter that suspends the execution of the current function at any arbitrary statement/expression and resumes with some different part of the program.
Example:
function Bank() {
this.savings = 0;
}
Bank.prototype.transfer = function(howMuch) {
var savings = this.savings;
this.savings = savings + +howMuch(); // we expect `howMuch()` to be blocking
}
Synchronous code:
var bank = new Bank();
setTimeout(function() {
bank.transfer(prompt); // Enter 5
alert(bank.savings); // 5
}, 0);
setTimeout(function() {
bank.transfer(prompt); // Enter 3
alert(bank.savings); // 8
}, 100);
Asynchronous, arbitrarily non-blocking code:
function guiPrompt() {
"use noblock";
// open form
// wait for user input
// close form
return input;
}
var bank = new Bank();
setTimeout(function() {
bank.transfer(guiPrompt); // Enter 5
alert(bank.savings); // 5
}, 0);
setTimeout(function() {
bank.transfer(guiPrompt); // Enter 3
alert(bank.savings); // 3 // WTF?!
}, 100);
See https://glyph.twistedmatrix.com/2014/02/unyielding.html for a longer (and language-agnostic) explanation.
there is nothing in the JavaScript runtime that will preemptively pause the execution of a given task, permit some other code to execute for a while, and then resume the original task
Why not?
For simplicity and security, see above. (And, for the history lesson: That's how it just was done)
However, this is no longer true. With ES6 generators, there is something that lets you explicitly pause execution of the current function generator: the yield keyword.
As the language evolves, there are also async and await keywords planned for ES7.
generators [… don't …] lead to code as simple and easy to understand as the sync code above.
But they do! It's even right in that article:
suspend(function* () {
// ^ "use noblock" - this "function" doesn't run continuously
try {
var foo = yield getSomething();
// ^^^^^ async call that does not block the thread
var bar = doSomething(foo);
console.log(bar);
} catch (error) {
console.error(error);
}
})
There is also a very good article on this subject here: http://howtonode.org/generators-vs-fibers

Why not? No reason, it just hadn't been done.
And here in 2017, it has been done in ES2017: async functions can use await to wait, non-blocking, for the result of a promise. You can write your code like this if getSomething returns a promise (note the await) and if this is inside an async function:
try
{
var foo = await getSomething();
var bar = doSomething(foo);
console.log(bar);
}
catch (error)
{
console.error(error);
}
(I've assumed there that you only intended getSomething to be asynchronous, but they both could be.)
Live Example (requires up-to-date browser like recent Chrome):
function getSomething() {
return new Promise((resolve, reject) => {
setTimeout(() => {
if (Math.random() < 0.5) {
reject(new Error("failed"));
} else {
resolve(Math.floor(Math.random() * 100));
}
}, 200);
});
}
function doSomething(x) {
return x * 2;
}
(async () => {
try
{
var foo = await getSomething();
console.log("foo:", foo);
var bar = doSomething(foo);
console.log("bar:", bar);
}
catch (error)
{
console.error(error);
}
})();
The first promise fails half the time, so click Run repeatedly to see both failure and success.
You've tagged your question with NodeJS. If you wrap the Node API in promises (for instance, with promisify), you can write nice straight-forward synchronous-looking code that runs asynchronously.

Because Javascript interpreters are single-threaded, event driven. This is how the initial language was developed.
You can't do "use noblock" because no other work can occur during that phase. This means your UI will not update. You cannot respond to mouse or other input event from the user. You cannot redraw the screen. Nothing.
So you want to know why? Because javascript can cause the display to change. If you were able to do both simultaneously you'd have all these horrible race conditions with your code and the display. You might think you've moved something on the screen, but it hasn't drawn, or it drew and you moved it after it drew and now it's gotta draw again, etc. This asynchronous nature allows, for any given event in the execution stack to have a known good state -- nothing is going to modify the data that is being used while this is being executed.
That is not to say what you want doesn't exist, in some form.
The async library allows you to do things like your parallel idea (amongst others).
Generators/async/wait will allow you to write code that LOOKS like what you want (although it'll be asynchronous by nature).
Although you are making a false claim here -- humans are NOT bad at writing asynchronous code.

The other answers talked about the problems multi-threading and parallelism introduce. However, I want to address your answer directly.
Why not? (As in, "why couldn't there be?"... I'm not interested in a history lesson)
Absolutely no reason. ECMAScript - the JavaScript specification says nothing about concurrency, it does not specify the order code runs in, it does not specify an event loop or events at all and it does not specify anything about blocking or not blocking.
The way concurrency works in JavaScript is defined by its host environment - in the browser for example that's the DOM and the DOM specifies the semantics of the event loop. "async" functions like setTimeout are only the concern of the DOM and not the JavaScript language.
Moreover there is nothing that says JavaScript runtimes have to run single threaded and so on. If you have sequential code the order of execution is specified, but there is nothing stopping anyone from embedding the JavaScript language in a multi threaded environment.

Related

Is there a way to return a value with async/await instead of a Promise ? As a synchronous function do

Firstly I am familiar with the concept of asynchronous/synchronous function.
There is also a lot of questions related to mine. But I can't find my answer anywhere.
So the question is:
Is there a way to return a value instead of a Promise using async/await ? As a synchronous function do.
For example:
async doStuff(param) {
return await new Promise((resolve, reject) => {
setTimeout(() => {
console.log('doStuff after a while.');
resolve('mystuffisdone'+param);
}, 2000);
});
}
console.log(doStuff('1'));
The only way to get the value of this function is by using the .then function.
doStuff('1').then(response => {
console.log(response); // output: mystuffisdone1
doOtherStuffWithMyResponse(response);
// ...
});
Now, what I want is:
const one = doStuff('1');
console.log(one) // mystuffisdone1
const two = doStuff('2');
console.log(two) // mystuffisdone2
To explain myself, I have an asynchronous library full of callbacks. I can turn this asynchronous behavior to a synchronous behavior by using Promises and async/await to faking a synchronous behavior.
But there is still a problem, it is still asynchronous in the end; outside of the scope of the async function.
doStuff('1').then((r) => {console.log(r)};
console.log('Hello wolrd');
It will result in: Hello world then mystuffisdone1. This is the expected behavior when using async/await functions. But that's not what I want.
Now my question would be: Is there a way to do the same thing as await do without the keyword async ? To make the code being synchronous ? And if not possible, why ?
Edit:
Thank you for all you answers, I think my question is not obsvious for all. To clear up what I think here is my comment to #Nikita Isaev answer.
"I understand why all I/O operations are asynchronously done; or done in parallel. But my question is more about the fact that why the engine doesn't block the caller of the sync function in an asynchronous manner ? I mean const a = doStuff(...) is a Promise. We need to call .then to get the result of this function. But why JavaScript or Node engine does not block the caller (just the block where the call is made). If this is possible, we could do const a = doStuff(...), wait and get the result in a without blocking the main thread. As async/await does, why there is no place for sync/wait ?"
Hope this is more clear now, feel free to comment or ask anything :)
Edit 2:
All precisions of the why of the answer are in the comments of the accepted answer.
There are some hacky ways to do what is desired, but that would be an anti-pattern. I’ll try to explain. Callbacks is one of the core concepts in javascript. When your code launches, you may set up event listeners, timers, etc. You just tell the engine to schedule some tasks: “when A happens, do B”. This is what asynchrony is. But callbacks are ugly and difficult to debug, that’s why promises and async-await were introduced. It is important to understand that this is just a syntax sugar, your code still is asynchronous when using async-await. As there are no threads in javascript, waiting for some events to fire or some complicated operations to finish in a synchronous way would block your entire application. The UI or the server would just stop responding to any other user interactions and would keep waiting for a single event to fire.
Real world cases:
Example 1.
Let’s say we have a web UI. We have a button that downloads the latest information from the server on click. Imagine we do it synchronously. What happens?
myButton.onclick = function () {
const data = loadSomeDataSync(); // 0
useDataSomehow(data);
}
Everything’s synchronous, the code is flat and we are happy. But the user is not.
A javascript process can only ever execute a single line of code in a particular moment. User will not be able to click other buttons, see any animations etc, the app is stuck waiting for loadSomeDataSync() to finish. Even if this lasts 3 seconds, it’s a terrible user experience, you can neither cancel nor see the progress nor do something else.
Example 2.
We have a node.js http server which has over 1 million users. For each user, we need to execute a heavy operation that lasts 5 seconds and return the result. We can do it in a synchronous or asynchronous manner. What happens if we do it in async?
User 1 connects
We start execution of heavy operation for user 1
User 2 connects
We return data for user 1
We start execution of heavy operation for user 2
…
I.e we do everything in parallel and asap. Now imagine we do the heavy operation in a sync manner.
User 1 connects
We start execution of heavy operation for user 1, everyone else is waiting for it to accomplish
We return data for user 1
User 2 connects
…
Now imagine the heavy operation takes 5 seconds to accomplish, and our server is under high load, it has over 1 million users. The last one will have to wait for nearly 5 million seconds, which is definitely not ok.
That’s why:
In browser and server API, most of the i/o operations are asynchronous
Developers strive to make all heavy calculation asynchronous, even React renders in an asynchronous manner.
No, going from promise to async/await will not get you from async code to sync code. Why? Because both are just different wrapping for the same thing. Async function returns immediately just like a promise does.
You would need to prevent the Event Loop from going to next call. Simple while(!isMyPromiseResolved){} will not work either because it will also block callback from promises so the isMyPromiseResolved flag will never be set.
BUT... There are ways to achieve what you have described without async/await. For example:
OPTION 1: using deasync approach. Example:
function runSync(value) {
let isDone = false;
let result = null;
runAsync(value)
.then(res => {
result = res;
isDone = true;
})
.catch(err => {
result = err;
isDone = true;
})
//magic happens here
require('deasync').loopWhile(function(){return !isDone;});
return result;
}
runAsync = (value) => {
return new Promise((resolve, reject) => {
setTimeout(() => {
// if passed value is 1 then it is a success
if(value == 1){
resolve('**success**');
}else if (value == 2){
reject('**error**');
}
}, 1000);
});
}
console.log('runSync(2): ', runSync(2));
console.log('runSync(1): ', runSync(1));
OR
OPTION 2: calling execFileSync('node yourScript.js') Example:
const {execFileSync} = require('child_process');
execFileSync('node',['yourScript.js']);
Both approaches will block the user thread so they should be used only for automation scripts or similar purposes.
Wrap the outer body in an asynchronous IIFE:
/**/(async()=>{
function doStuff(param) { // no need for this one to be async
return new Promise((resolve, reject) => { // just return the original promise
setTimeout(() => {
console.log('doStuff after a while.');
resolve('mystuffisdone'+param);
}, 2000);
});
}
console.log(await doStuff('1')); // and await instead of .then
/**/})().then(()=>{}).catch(e=>console.log(e))
The extra cleanup of the doStuff function isn't strictly necessary -- it works either way -- but I hope it helps clarify how async functions and Promises are related. The important part is to wrap the outer body into an async function to get the improved semantics throughout your program.
It's also not strictly necessary to have the final .then and .catch, but it's good practice. Otherwise, your errors might get swallowed, and any code ported to Node will whine about uncaught Promise rejections.

Using "await" inside non-async function

I have an async function that runs by a setInterval somewhere in my code. This function updates some cache in regular intervals.
I also have a different, synchronous function which needs to retrieve values - preferably from the cache, yet if it's a cache-miss, then from the data origins
(I realize making IO operations in a synchronous manner is ill-advised, but lets assume this is required in this case).
My problem is I'd like the synchronous function to be able to wait for a value from the async one, but it's not possible to use the await keyword inside a non-async function:
function syncFunc(key) {
if (!(key in cache)) {
await updateCacheForKey([key]);
}
}
async function updateCacheForKey(keys) {
// updates cache for given keys
...
}
Now, this can be easily circumvented by extracting the logic inside updateCacheForKey into a new synchronous function, and calling this new function from both existing functions.
My question is why absolutely prevent this use case in the first place? My only guess is that it has to do with "idiot-proofing", since in most cases, waiting on an async function from a synchronous one is wrong. But am I wrong to think it has its valid use cases at times?
(I think this is possible in C# as well by using Task.Wait, though I might be confusing things here).
My problem is I'd like the synchronous function to be able to wait for a value from the async one...
They can't, because:
JavaScript works on the basis of a "job queue" processed by a thread, where jobs have run-to-completion semantics, and
JavaScript doesn't really have asynchronous functions — even async functions are, under the covers, synchronous functions that return promises (details below)
The job queue (event loop) is conceptually quite simple: When something needs to be done (the initial execution of a script, an event handler callback, etc.), that work is put in the job queue. The thread servicing that job queue picks up the next pending job, runs it to completion, and then goes back for the next one. (It's more complicated than that, of course, but that's sufficient for our purposes.) So when a function gets called, it's called as part of the processing of a job, and jobs are always processed to completion before the next job can run.
Running to completion means that if the job called a function, that function has to return before the job is done. Jobs don't get suspended in the middle while the thread runs off to do something else. This makes code dramatically simpler to write correctly and reason about than if jobs could get suspended in the middle while something else happens. (Again it's more complicated than that, but again that's sufficient for our purposes here.)
So far so good. What's this about not really having asynchronous functions?!
Although we talk about "synchronous" vs. "asynchronous" functions, and even have an async keyword we can apply to functions, a function call is always synchronous in JavaScript. An async function is a function that synchronously returns a promise that the function's logic fulfills or rejects later, queuing callbacks the environment will call later.
Let's assume updateCacheForKey looks something like this:
async function updateCacheForKey(key) {
const value = await fetch(/*...*/);
cache[key] = value;
return value;
}
What that's really doing, under the covers, is (very roughly, not literally) this:
function updateCacheForKey(key) {
return fetch(/*...*/).then(result => {
const value = result;
cache[key] = value;
return value;
});
}
(I go into more detail on this in Chapter 9 of my recent book, JavaScript: The New Toys.)
It asks the browser to start the process of fetching the data, and registers a callback with it (via then) for the browser to call when the data comes back, and then it exits, returning the promise from then. The data isn't fetched yet, but updateCacheForKey is done. It has returned. It did its work synchronously.
Later, when the fetch completes, the browser queues a job to call that promise callback; when that job is picked up from the queue, the callback gets called, and its return value is used to resolve the promise then returned.
My question is why absolutely prevent this use case in the first place?
Let's see what that would look like:
The thread picks up a job and that job involves calling syncFunc, which calls updateCacheForKey. updateCacheForKey asks the browser to fetch the resource and returns its promise. Through the magic of this non-async await, we synchronously wait for that promise to be resolved, holding up the job.
At some point, the browser's network code finishes retrieving the resource and queues a job to call the promise callback we registered in updateCacheForKey.
Nothing happens, ever again. :-)
...because jobs have run-to-completion semantics, and the thread isn't allowed to pick up the next job until it completes the previous one. The thread isn't allowed to suspend the job that called syncFunc in the middle so it can go process the job that would resolve the promise.
That seems arbitrary, but again, the reason for it is that it makes it dramatically easier to write correct code and reason about what the code is doing.
But it does mean that a "synchronous" function can't wait for an "asynchronous" function to complete.
There's a lot of hand-waving of details and such above. If you want to get into the nitty-gritty of it, you can delve into the spec. Pack lots of provisions and warm clothes, you'll be some time. :-)
Jobs and Job Queues
Execution Contexts
Realms and Agents
You can call an async function from within a non-async function via an Immediately Invoked Function Expression (IIFE):
(async () => await updateCacheForKey([key]))();
And as applied to your example:
function syncFunc(key) {
if (!(key in cache)) {
(async () => await updateCacheForKey([key]))();
}
}
async function updateCacheForKey(keys) {
// updates cache for given keys
...
}
This shows how a function can be both sync and async, and how the Immediately Invoked Function Expression idiom is only immediate if the path through the function being called does synchronous things.
function test() {
console.log('Test before');
(async () => await print(0.3))();
console.log('Test between');
(async () => await print(0.7))();
console.log('Test after');
}
async function print(v) {
if(v<0.5)await sleep(5000);
else console.log('No sleep')
console.log(`Printing ${v}`);
}
function sleep(ms : number) {
return new Promise(resolve => setTimeout(resolve, ms));
}
test();
(Based off of Ayyappa's code in a comment to another answer.)
The console.log looks like this:
16:53:00.804 Test before
16:53:00.804 Test between
16:53:00.804 No sleep
16:53:00.805 Printing 0.7
16:53:00.805 Test after
16:53:05.805 Printing 0.3
If you change the 0.7 to 0.4 everything runs async:
17:05:14.185 Test before
17:05:14.186 Test between
17:05:14.186 Test after
17:05:19.186 Printing 0.3
17:05:19.187 Printing 0.4
And if you change both numbers to be over 0.5, everything runs sync, and no promises get created at all:
17:06:56.504 Test before
17:06:56.504 No sleep
17:06:56.505 Printing 0.6
17:06:56.505 Test between
17:06:56.505 No sleep
17:06:56.505 Printing 0.7
17:06:56.505 Test after
This does suggest an answer to the original question, though. You could have a function like this (disclaimer: untested nodeJS code):
const cache = {}
async getData(key, forceSync){
if(cache.hasOwnProperty(key))return cache[key] //Runs sync
if(forceSync){ //Runs sync
const value = fs.readFileSync(`${key}.txt`)
cache[key] = value
return value
}
//If we reach here, the code will run async
const value = await fsPromises.readFile(`${key}.txt`)
cache[key] = value
return value
}
Now, this can be easily circumvented by extracting the logic inside updateCacheForKey into a new synchronous function, and calling this new function from both existing functions.
T.J. Crowder explains the semantics of async functions in JavaScript perfectly. But in my opinion the paragraph above deserves more discussion. Depending on what updateCacheForKey does, it may not be possible to extract its logic into a synchronous function because, in JavaScript, some things can only be done asynchronously. For example there is no way to perform a network request and wait for its response synchronously. If updateCacheForKey relies on a server response, it can't be turned into a synchronous function.
It was true even before the advent of asynchronous functions and promises: XMLHttpRequest, for instance, gets a callback and calls it when the response is ready. There's no way of obtaining a response synchronously. Promises are just an abstraction layer on callbacks and asynchronous functions are just an abstraction layer on promises.
Now this could have been done differently. And it is in some environments:
In PHP, pretty much everything is synchronous. You send a request with curl and your script blocks until it gets a response.
Node.js has synchronous versions of its file system calls (readFileSync, writeFileSync etc.) which block until the operation completes.
Even plain old browser JavaScript has alert and friends (confirm, prompt) which block until the user dismisses the modal dialog.
This demonstrates that the designers of the JavaScript language could have opted for synchronous versions of XMLHttpRequest, fetch etc. Why didn't they?
[W]hy absolutely prevent this use case in the first place?
This is a design decision.
alert, for instance, prevents the user from interacting with the rest of the page because JavaScript is single threaded and the one and only thread of execution is blocked until the alert call completes. Therefore there's no way to execute event handlers, which means no way to become interactive. If there was a syncFetch function, it would block the user from doing anything until the network request completes, which can potentially take minutes, even hours or days.
This is clearly against the nature of the interactive environment we call the "web". alert was a mistake in retrospect and it should not be used except under very few circumstances.
The only alternative would be to allow multithreading in JavaScript which is notoriously difficult to write correct programs with. Are you having trouble wrapping your head around asynchronous functions? Try semaphores!
It is possible to add a good old .then() to the async function and it will work.
Should consider though instead of doing that, changing your current regular function to async one, and all the way up the call stack until returned promise is not needed, i.e. there's no work to be done with the value returned from async function. In which case it actually CAN be called from a synchronous one.

Wait for an async function to return in Node.js

Supposed, I have a async function in Node.js, basically something such as:
var addAsync = function (first, second, callback) {
setTimeout(function () {
callback(null, first + second);
}, 1 * 1000);
};
Now of course I can call this function in an asynchronous style:
addAsync(23, 42, function (err, result) {
console.log(result); // => 65
});
What I am wondering about is whether you can make it somehow to call this function synchronously. For that, I'd like to have a wrapper function sync, which basically does the following thing:
var sync = function (fn, params) {
var res,
finished = false;
fn.call(null, params[0], params[1], function (err, result) {
res = result;
finished = true;
});
while (!finished) {}
return res;
};
Then, I'd be able to run addAsync synchronously, by calling it this way:
var sum = sync(addAsync, [23, 42]);
Note: Of course you wouldn't work using params[0] and params[1] in reality, but use the arguments array accordingly, but I wanted to keep things simple in this example.
Now, the problem is, that the above code does not work. It just blocks, as the while loop blocks and does not release the event loop.
My question is: Is it possible in any way to make this sample run as intended?
I have already experimented with setImmediate and process.nextTick and various other things, but non of them helped. Basically, what you'd need was a way to tell Node.js to please pause the current function, continue running the event loop, and getting back at a later point in time.
I know that you can achieve something similar using yield and generator functions, at least in Node.js 0.11.2 and above. But, I'm curious whether it works even without?
Please note that I am fully aware of how to do asynchronous programming in Node.js, of the event loop and all the related stuff. I am also fully aware that writing code like this is a bad idea, especially in Node.js. And I am also fully aware that an 'active wait' is a stupid idea, as well. So please don't give the advice to learn how to do it asynchronously or something like that. I know that.
The reason why I am asking is just out of curiosity and for the wish to learn.
I've recently created simpler abstraction WaitFor to call async functions in sync mode (based on Fibers). It's at an early stage but works. Please try it: https://github.com/luciotato/waitfor
using WaitFor your code will be:
console.log ( wait.for ( addAsync,23,42 ) ) ;
You can call any standard nodejs async function, as if it were a sync function.
wait.for(fn,args...) fulfills the same need as the "sync" function in your example, but inside a Fiber (without blocking node's event loop)
You can use npm fibers (C++ AddOn project) & node-sync
implement a blocking call in C(++) and provide it as a library
Yes I know-you know - BUT EVER-EVER-EVER ;)
Non-Blocking) use a control flow library
Standard propmises and yield (generator functions) will make this straigthforward:
http://blog.alexmaccaw.com/how-yield-will-transform-node
http://jlongster.com/A-Study-on-Solving-Callbacks-with-JavaScript-Generators

Test Javascript Function's Blocking Behavior

In Javascript, I have two versions of a recursive function, one that runs synchronously and one that uses simple scheduling to run asynchronously. Given certain inputs, in both cases the function is expected to have an infinite execution path. I need to develop tests for these functions, specifically a test to check that the asynchronous version does not block the main thread.
I already have tests that check the output callback behavior of these functions in non-returning cases, I am only concerned about testing the blocking behavior. I can limit how long the function runs to some long but finite amount of time for testing purposes as well. I am currently using QUnit but can switch to another testing framework.
How can I test that a non-returning, asynchronous function does not block?
Edit, For Clarification
This would be a bare bones example of the function I am working with:
function a()
{
console.log("invoked");
setTimeout(a, 1000);
}
a();
I am intentionally misusing some threading terms in my description because I felt they most clearly expressed the problem. By not blocking the main thread, I mean that invoking the function does not prevent the scheduling and execution of other logic. I expect the function itself will be executed on the main thread but I consider the function running as long as it is scheduled for execution in the future.
Unit Test are based on single-responsability-principle and isolation (separate the subject under test from it's dependencies).
In this case, you expect your function to run asynchronously but this behaviour is not done by your function, is done by the "setTimeout" function, so I think you must isolate your function from "setTimeout" since it's a dependency you don't want to test, the browser guarantees you it will work.
Then, as we trust "setTimeout" will do the asyncrhonous logic, we can only test our function calls to "setTimeout" and we can do this replacing "window.setTimeout" with another function while we must always restore it after the test is complete.
function replaceSetTimeout() {
var originalSetTimeout = window.setTimeout;
var callCount = 0;
window.setTimeout = function() {
callCount++;
};
window.setTimeout.restore = function() {
window.setTimeout = originalSetTimeout;
};
window.setTimeout.getCallCount = function() {
return callCount;
};
}
replaceSetTimeout();
asyncFunction();
assert(setTimeout.getCallCount() === 1);
setTimeout.restore();
I recommend you to use sinon.js as it provides many tools like spies who are functions than will inform you about how many times and with what arguments where called.
var originalSetTimeout = window.setTimeout;
window.setTimeout = sinon.spy();
asyncFunction();
// check called only once
assert(setTimeout.calledOnce);
// check the first argument was asyncFunction
assert(setTimeout.calledWith(asyncFunction));
Sinon also provides fake timers who does the setTimeout substitution but with so much more features, like the .tick(x) method who will simulate "x" milliseconds but in this case I think it doesn't help you.
Update to answer question edit:
1 - Your function executes infinitely so you cannot test it without interrupting it's execution, so you must overwrite "setTimeout" somewhere.
2 - You want your function to execute recursively allowing other code to be executed between iterations? great! but understand than your function can not do this your function only can call setTimeout or setInterval and hope this function work as expected. You should test what your function does.
3 - You want to test from Javascript (a sandboxed environment) than another Javascript code uses and releases the only one execution thread (the same you are using to test). Do you really think this is an easy test?
4 - but the most important one - I don't like white box because it couples the test with the dependency, if you change your dependency or how it's called in the future you will have to change the test. This problem doesn't exist with DOM function, DOM functions will keep the same interface for years, and for now, you have no other way to do what you want than calling one of those two functions, so I don't think in this case "white box testing" is a bad idea.
I told you this because I had the same problem testing a Promise pattern implementation than had to be always asynchronous, even if the promise is already fulfilled, and I've tested it using test-engine asynchronous-test way (using callbacks and stuff) and it was a mess, test failing randomly, so much slow test execution. Then I asked a TDD expert how can test be so hard and he answered than I was not following Single Responsability Principle since I was trying to test my promise implementation AND the setTimeout behaviour.
If you think about it from a Behaviour Driven Testing perspective then 'Does my function block?' is not a useful question. It will definitely block, a better question might be 'does it return in no more than 50ms'.
You could do this with something like :
test( "speed test", function() {
var start = new Date();
a();
ok(new Date() - start < 50, "Passed!" );
});
The issue with this is that if someone does do something silly that makes your function block indefinitely the test won't fail, it will hang.
Because JavaScript is single threaded there is no way around this. If I come along and change your function to :
function a() {
while(true) {
console.log("invoked")
}
}
The test will hang.
You can make breaking things this way harder by refactoring things a little. There are 2 separate things being done. Your chunk of work and the scheduling. Separate these and you'll end up with something like the following functions :
function a() {
// doWork
var stopRunning = true;
return stopRunning;
}
function doAsync(workFunc, scheduleFunc, timeout) {
if (!workFunc()) {
scheduleFunc(doAsync, [workFunc, scheduleFunc, timeout], timeout);
}
}
function schedule(func, args, timeout) {
setTimeout(function() {func.apply(window, args);}, timeout);
}
Now you're free to test everything in isolation. You can supply a mock workFunc and scheduleFunc to a test for doAsync to verify it behaves as expected and you can test your function a() without worrying about how it is scheduled.
It's still possible for a dunce programmer to put an infinite loop into the function a(), but because they don't have to consider how to run further units of work it should be less likely.
To test or prove an infinitely executing execution path will never block is next to impossible, so you have to split your problem up into parts.
Your path is basically foo(foo(foo(foo(...etc...)))), nevermind that SetTimeout actually removes recursion. So all you have to do is test or prove that your foo does not block (I tell you now that testing will be "a bit" easier than proving, more below)
So, does function foo block?
Talking a bit maths, if you want to know whether f(f(...f(x)...)) always has a value, you actually only have to prove that f(x) always has a value for any x that f can return. It does not matter how many recursions you have, if you can make sure their return values are fine.
What that means for your foo is that you only have to prove that foo does not block for any possible input value. Keep in mind that in this case, all global variables and closures are input values too. This means you have to sanity-check every single value you are using on every call.
To test, of course you will have to replace SetTimeout, but that is trivial, and if you replace it with an empty function (function(){}) it is easy to prove that this function does not block or otherwise alter your execution. You will then
Making things easier
Taking in what I wrote above, this also means that you would have to make sure no global function or variable that you are ever using will ever be changed to a point that your function breaks to a point it breaks. This actually is quite hard, but you can still make things easier for you by making sure you always use the same functions and values and that other functions can not touch them by using closures.
function foo(n, setTimeout)
{
var x = global_var;
// sanity check n here
function f()
{
setTimeout(f, n)
}
return f();
}
This way, you only have to test those values on the first execution. It's nice to be able to assume Math.Pi is actually Pi and not a string value containing "noodles". Really nice.
Do not use global mutable objects
Call those you can not circumvent using setTimeout to ensure they can not block
If you need return values, things will get really tricky, but possible, consider this:
function() {
var x = 0;
setTimeout(function(){x = insecure();}, 1);
}
All you have to do is
Use x next iteration
Sanity check value of x first!
Does SetTimeout block?
Of course this depends on whether setTimeout blocks. This is quite hard to prove, but a bit easier to test. You can't actually prove it since it's implementation is up to the interpreter.
Personally I would assume that setTimeout behaves like an empty function when it's return value is discarded.
Performing this asynchronous testing is actually possible in QUnit but is handled better in another JavaScript testing framework, Jasmine JS. I'll provide examples in both.
In QUnit you need to first call the stop() function to signal that the test is expected to run asynchronously, you should then call setTimeout with a function that includes your expectations as well as a call to the start() function to complete the block. Here's an example:
test( "a test", function() {
stop();
asyncOp();
setTimeout(function() {
equals( asyncOp.result, "someExpectedValue" );
start();
}, 150 );
});
Edit: Apparently there's also a whole asyncTest construct that you can use that simplifies this process. Take a look: http://api.qunitjs.com/asyncTest/
In Jasmine (http://pivotal.github.com/jasmine/), a Behavior Driven Development (BDD) testing framework, there are built-in methods for writing asynchronous tests. Here's an example of an asynchronous test in Jasmine:
describe('Some module', function() {
it('should run asynchronously', function() {
var isDone = false;
runs(function() {
// The first call to runs should trigger some async operation
// that has a side-effect that can be tested for. In this case,
// lets say that the doSomethingAsyncWithCallback function
// does something asynchronously and then calls the passed callback
doSomethingAsyncWithCallback(function() { isDone = true; });
});
waitsFor(function() {
// The call to waits for is a polling function that will get called
// periodically until either a condition is met (the function should return
// a boolean testing for this condition) or the timeout expires.
// The optional text is what error to display if the test fails.
return isDone === true;
}, "Should set isDone to true", 500);
runs(function() {
// The second call to runs should contain any assertions you need to make
// after the async call is complete.
expect(isDone).toBe(true);
});
});
});
Edit: Also, Jasmine has several built-in methods of faking out the setTimeout and setInterval functions of the browser without hosing any other tests in your suite that may depend on that. I would take a look at using those rather than manually overriding the setTimeout/setInterval functions.
Basically, JavaScript is single-threaded, so it will block the main thread. But :
I assume you're using setTimesout to schedule your function, so it won't be noticeable to the user if calls to that function don't take too much time (say, less than 200 or 300ms).
If you're doing DOM manipulation during that function (including Canvas or WebGL), then you're screwed. But if not, you can look into Web Workers, which can spawn separate threads that are guaranteed not to block the UI.
But anyway, JavaScript and the main loop, that's a tricky issue that's been bugging me a lot these past months, so you're not alone!
As soon as your function returns (after having set the timeout for it's next run), javascript will look at the next thing that requires running and run that.
As far as I can tell, the 'main thread' in javascript is just a loop that is responding to events (such as onload for a script tag, which runs the contents of that tag).
Based on the above two conditions, the calling thread is always going to run to completion despite any setTimeouts, and those timeouts will begin after the calling thread has nothing left to run.
The way I tested this was to run the following function right after the call to a()
function looper(name,duration) {
var start = (new Date()).getTime();
var elapsed = 0;
while (elapsed < duration) {
elapsed = (new Date()).getTime() - start;
console.log(name + ": " + elapsed);
}
}
Duration should be set to some period of time longer than the setTimeout duration in a(). The expected output would be the output of 'looper', followed by the output of repeated calls to a().
The next thing to test would be whether other script tags are able to run while a() and its child calls are executing.
You can do this like so:
<script>
a();
</script>
<script>
looper('delay',500); // ie; less than the 1000 timeout in a();
</script>
<script>
console.log('OK');
</script>
You would expect 'OK' to appear in the log despite the fact that a() and its children are still executing. You can also test variations of this, such as window.onload(), etc.
Finally, you'd want to ensure that other timer events work fine as well. Simply delaying 2 calls by half a second and checking that they interleave should show that works fine:
function b()
{
console.log("invoked b")
setTimeout(b, 1000);
}
a();
looper('wait',500);
b();
Should produce output like
invoked
invoked b
invoked
invoked b
invoked
invoked b
Hope that's what you were looking for!
EDIT in case you need some technical details on how to do it in Qunit:
If Qunit can't capture console.log output (i'm not sure), just push those strings into an array or a string and check that after it's run. You could override console.log in the test module() setup and restore it at teardown. I'm not sure how Qunit works but 'this' might have to be removed and globals used to store the old_console_log and test_output
// in the setup
this.old_console_log = console.log;
this.test_output = [];
var self = this;
console.log = function(text) { self.test_output.push(text); }
// in the teardown
console.log = this.old_console_log;
Finally, you can utilize stop() and start() so that Qunit knows to wait for all the events in the test to finish running.
stop();
kickoff_async_test();
setTimeout(function(){
// assertions
start();
},<expected duration of run>);
Based on all the answers, I came up with this solution that works for my case:
testAsync("Doesn't hang", function(){
expect(1);
var ranToLong = false;
var last = new Date();
var sched = setInterval(function(){
var now = new Date();
ranToLong = ranToLong || (now - last) >= 50;
last = now;
}, 0);
// In this case, asyncRecursiveFunction runs for a long time and
// returns a single value in callback
asyncRecursiveFunction(function callback(v){
clearInterval(sched);
var now = new Date();
ranToLong = ranToLong || (now - last) >= 50;
assert.equal(ranToLong, false);
start();
});
});
It tests that 'asyncRecursiveFunction' does not hang while processing by looking at the time between another scheduled function calls.
This is really ugly and not be applicable to every case but it seems to work for me because I can restrict my function to some large set of async recursive calls so it runs for a long but not infinite time. As I mentioned in the question, I am happy proving that such cases do not block.
BTW, the actual code in question is found in gen.js. The main problem was an async reduce generator. It correctly returned a value asynchronously, but in previous versions would stall because of synchronous internal implementation.

JQuery / Javascript inline callback

In tornado we have gen module, that allows us to write constructions like this:
def my_async_function(callback):
return callback(1)
#gen.engine
def get(self):
result = yield gen.Task(my_async_function) #Calls async function and puts result into result variable
self.write(result) #Outputs result
Do we have same syntax sugar in jquery or other javascript libraries?
I want something like this:
function get_remote_variable(name) {
val = $.sweetget('/remote/url/', {}); //sweetget automatically gets callback which passes data to val
return val
}
You describe the function as "my_async_function", but the way you use it is synchronous rather than asynchronous.
Your sample code requires blocking -- if "my_async_function" were truly asynchronous (non-blocking), the following line of code self.write(result) would execute immediately after you called "my_async_function". If the function takes any time at all (and I mean any) to come back with a value, what would self.write(result) end up writing? That is, if self.write(result) is executed before result ever has a value, you don't get expected results. Thus, "my_async_function" must block, it must wait for the return value before going forward, thus it is not asynchronous.
On to your question specifically, $.sweetget('/remote/url/', {}): In order to accomplish that, you would have to be able to block until the ajax request (which is inherently asynchronous -- it puts the first A in AJAX) comes back with something.
You can hack a synchronous call by delaying the return of sweetget until the XHR state has changed, but you'd be using a while loop (or something similar) and risk blocking the browser UI thread or setting off one of those "this script is taking too long" warnings. Javascript does not offer threading control. You cannot specify that your current thread is waiting, so go ahead and do something else for a minute. You could contend with that, too, by manually testing for a timeout threshold.
By now one should be starting to wonder: why not just use a callback? No matter how you slice it, Javascript is single-threaded. No sleep, no thread.sleep. That means that any synchronous code will block the UI.
Here, I've mocked up what sweetget would, roughly, look like. As you can see, your browser thread will lock up as soon as execution enters that tight while loop. Indeed, on my computer the ajax request won't even fire until the browser displays the unresponsive script dialog.
// warning: this code WILL lock your browser!
var sweetget = function (url, time_out) {
var completed = false;
var result = null;
var run_time = false;
if (time_out)
run_time = new Date().getTime();
$.ajax({
url: url,
success: function(data) {
result = data;
completed = true;
},
error: function () {
completed = true;
}
}); // <---- that AJAX request was non-blocking
while (!completed) { // <------ but this while loop will block
if (time_out) {
if (time_out>= run_time)
break;
run_time = new Date().getTime();
}
}
return result;
};
var t = sweetget('/echo/json/');
console.log('here is t: ', t);
Try it: http://jsfiddle.net/AjRy6/
Versions of jQuery prior to 1.8 support sync ajax calls via the async: false setting. Its a hack with limitations (no cross-domain or jsonp, locks up the browser), and I would avoid it if possible.
There are several available libraries that provide some syntactic sugar for async operations in Javascript. For example:
https://github.com/caolan/async
https://github.com/coolaj86/futures
...however I don't think anything provides the synchronous syntax you are looking for - there is always a callback involved, because of the way JavaScript works in the browser.

Categories

Resources