We are working with node, mainly for an internal project and to understand the best way to use the technology.
Not coming from a specific asynchronous background the learning curve can be a challenge but we are getting used to the framework and learning the process.
One thing that has polarised us is when the best time to use synchronous code vs asynchronous code is. We are currently using the rule that if anything interacts with IO then it has to be asynchronous via call backs or the event emitter (thats a given), but other items which are not in any way using IO can be constructed as synchronous functions (this will depends as well on the heaviness of the function itself and how blocking it actually is) but is this the best approach to take when working with Node.js?
For instance, we are creating a Hal+JSON builder, which currently exists within our code base. It is synchronous simply because all it is doing is creating some rather smallish object literals and nothing more, there are no external dependencies and certainly no IO interactions.
Is our approach a good one to take or not?
Let's say you have two functions, foo and bar, which are executing synchronously:
function foo() {
var returnValue = bar();
console.log(returnValue);
}
function bar() {
return "bar";
}
In order to make the API "asynchronous" is to change it to use callbacks:
function foo() {
bar(function(returnValue) {
console.log(returnValue);
});
}
function bar(callback) {
callback("bar");
}
But the fact of the matter is, this code is still entirely synchronous. The callback is being executed on the same call stack, and no threading optimizations are being made, no scalability benefits are to be had.
It then becomes a question of code readablity and coding style. I personally find the typical var val = func(); type code more readable and readily understandable. The only drawback is, that if you one day would need to change the functionality of bar so, that it would need to perform some I/O activity or call some other function which is asynchronous, you need to change the API of baras well.
My personal preference: use traditional, synchnous patterns when applicable. Always use asynchronous style when I/O is involved or when in doubt.
Turning a synchronous function into an asynchronous one using process.nextTick() is an option, but one you should only use if your software is blocking for too long. Every time your synchronous function is running node can't do anything else as it's single threaded. This means that if your application is a server, it becomes unresponsive.
Before you split up any synchronous function, I would advise to first do some benchmarking and profiling so you can make an informed decision. Otherwise you run the risk of dong a premature optimisation
See here for a good discussion https://softwareengineering.stackexchange.com/questions/80084/is-premature-optimization-really-the-root-of-all-evil
Here's how you can make your function asyncronous, in the sense that you will allow node to do other things http://howtonode.org/understanding-process-next-tick
I don't what's "Hal+Json builder", but here's what I think about NodeJS. You should write your code as asynchronous as possible, pushing it to it's limits. The simple reason is that asynchronous code trades performance for responsiveness, which is in most cases more important.
Of course if certain operations are really quick, then there is no need for asynchronous code. For example consider this code:
var arr = []; // assume array is nonempty
for (var i = 0, l = arr.length; i < l; i++) {
arr[ i ].doStuff( );
}
If arr is small and doStuff is a quick operation, then you should not bother in writing this code in asynchronous way. However if it takes some time, then you should consider writing it like that:
var arr = []; // assume array is nonempty
var process_array_element = function( ) {
if (arr.length) {
arr.pop().doStuff( );
process.nextTick( process_array_element );
}
};
process_array_element( );
Now this code is truely asynchronous. It will take even more time for it to complete the job, but in the meantime it won't block entire server. We've traded performance for responsiveness.
Conclusion: It depends on your situation! :)
Related
Although I'm not having a practical problem, I'm in need of some explanation about what makes something asynchronous in standard Javascript in NodeJs. Take for example this piece of code:
function pause(ms) {
var dt = new Date()
while((new Date()) - dt <= ms) {
}
console.log("Third")
}
console.log("First")
pause(3000)
console.log("second")
The output is
First
Third
second
Since NodeJs is asynchronous I would have expected the outcome to be "First Second Third". The following code however is asynchronous:
console.log("First")
setTimeOut(function(){
console.log("Third")
}, 3000)
console.log("Second")
The output is:
First
Second
Third
The problem is I do not fully understand why. Is it because while loops are blocking, even though I've put it inside a function? If so, what else is blocking and what isn't? Has it something to do with the event loop? Are only I/O operations asynchronous? Is there an overview of what is and isn't or is there a general principle?
The JS/Node.js language by default is synchronous (or blocking if you prefer) - as are most programming languages - in that lines of code are executed one after another, which is why the output of the first example is "First", "Third", "Second".
console.log("First"); // Outputs "First"
pause(3000); // Outputs "Third"
console.log("second"); // Outputs "Second"
That is pause() is just a function that is called and executed synchronously; there is nothing special about it.
In the second example the languages event-driven model has been used via setTimeout(), which accepts a callback function as a parameter that is called when the timer expires. Which means the sequence of outputs is different.
So to answer your question: standard JavaScript code is synchronous, but many of the standard library functions are non-blocking. As indeed your own can be if written accordingly.
Which standard library functions? Well I/O and timers generally are non-blocking, but it really is a case of reading the documentation.
Suppose we have these two async functions:
function myfu(f) {
setTimeout(function(){ f(true) }, 200)
}
function myfu2(x, f) {
setTimeout(function(){ f(500 + x) }, 200)
}
I was thinking about this syntax:
if (myfu.()) {
var d = myfu2.(55)
console.log(d) // outputs 555
}
which is amazingly simpler to type and read that this:
myfu(function (DOTCALL1) {
if (DOTCALL1) {
myfu2(55, function (DOTCALL2) {
d = DOTCALL2
console.log(d) // outputs 555
})
}
})
The conversion from .( syntax could be easily implemented by means of search/replace/regex.
I noticed that most of the time the result of my asynchronous functions is not used, and after it was called, the upper functions also exits immediately.
Although sometimes the upper function can do something after it called the first async(callback), usually it does nothing. Sometimes the asynchronous function could return something useful, but most of the time it just returns undefined and real stuff is returned with callback(result).
So I thought that maybe I could modify the code some how to make it easier to type and read. Which is illustrated by the example above.
My question is sipmle: is there already similar solution? I do not want to reinvent the wheel (if it is already invented). Or, if there is no such project, why not, why this approach, which seems so easy to implement and supposed to simplify development significantly is not used?
Please do not respond with the suggestion to "take a look at async.js". I am not asking about asynchronous programming patterns, asynchronous programming in general, not about arrow function notation =>, not about coffeScript. I am asking about a code conversion on the text level from proposed original syntax form to well known async ladder, aka "callback hell" notation.
What you are describing is very similar to streamline.js: https://github.com/Sage/streamlinejs.
I wrote it but I got the idea from narrative.js: See http://www.neilmix.com/narrativejs/doc/
So your intuition is right. This can be done. It is nevertheless a little more complicated than a few search/replace/regex operations, at least if you want to go beyond the obvious and support async calls in all JavaScript constructs. For example, try to rewrite:
while (asyncF1.() && asyncF2.()) asyncF3.();
I described the streamline transformation algorithm in a blog post: https://bjouhier.wordpress.com/2011/05/24/yield-resume-vs-asynchronous-callbacks/
I have this site bookmarked since last year where the author blogs about EcmaScript6 which allegedly allows you to obtain something similar to .NET's async / await pattern (so allowing code to invoke asynchronous methods like a "normal" one). Here is the link
ES6 introduces two small extensions to the language syntax:
function*: the functions that you declare with a little twinkling star are generator functions. They execute in an unusual way and return generators.
yield: this keyword lets you transfer control from a generator to the function that controls it.
And, even though these two language constructs were not orginally designed to have the async/await semantics found in other languages, it is possible to give them these semantics:
The * in function* is your async keyword.
yield is your await keyword.
Knowing this, you can write asynchronous code as if JavaScript had async/await keywords.
I never had the chance to try it out though, so YMMV. It sounds useful.
Can you explain me the following phrase (taken from an answer to Stack Overflow question What are the differences between Deferred, Promise and Future in Javascript?)?
What are the pros of using jQuery promises against using the previous jQuery callbacks?
Rather than directly passing callbacks to functions, something which
can lead to tightly coupled interfaces, using promises allows one to
separate concerns for code that is synchronous or asynchronous.
A promise is an object that represents the result of an asynchronous operation, and because of that you can pass it around, and that gives you more flexibility.
If you use a callback, at the time of the invocation of the asynchronous operation you have to specify how it will be handled, hence the coupling. With promises you can specify how it will be handled later.
Here's an example, imagine you want to load some data via ajax and while doing that you want to display a loading page.
With callbacks:
void loadData = function(){
showLoadingScreen();
$.ajax("http://someurl.com", {
complete: function(data){
hideLoadingScreen();
//do something with the data
}
});
};
The callback that handles the data coming back has to call hideLoadingScreen.
With promises you can rewrite the snippet above so that it becomes more readable and you don't have to put the hideLoadingScreen in the complete callback.
With promises
var getData = function(){
showLoadingScreen();
return $.ajax("http://someurl.com").promise().always(hideLoadingScreen);
};
var loadData = function(){
var gettingData = getData();
gettingData.done(doSomethingWithTheData);
}
var doSomethingWithTheData = function(data){
//do something with data
};
UPDATE: I've written a blog post that provides extra examples and provides a clear description of what is a promise and how its use can be compared to using callbacks.
The coupling is looser with promises because the operation doesn't have to "know" how it continues, it only has to know when it is ready.
When you use callbacks, the asynchronous operation actually has a reference to its continuation, which is not its business.
With promises, you can easily create an expression over an asynchronous operation before you even decide how it's going to resolve.
So promises help separate the concerns of chaining events versus doing the actual work.
I don't think promises are more or less coupled than callbacks, just about the same.
Promises however have other benefits:
If you expose a callback, you have to document whether it will be called once (like in jQuery.ajax) or more than once (like in Array.map). Promises are called always once.
There's no way to call a callback throwing and exception on it, so you have to provide another callback for the error case.
Just one callback can be registered, more than one for promises, and you can register them AFTER the event and you will get called anyway.
In a typed declaration (Typescript), Promise make easier to read the signature.
In the future, you can take advantage of an async / yield syntax.
Because they are standard, you can make reusable components like this one:
disableScreen<T>(promiseGenerator: () => Promise<T>) : Promise<T>
{
//create transparent div
return promiseGenerator.then(val=>
{
//remove transparent div
return val;
}, error=>{
//remove transparent div
throw error;
});
}
disableScreen(()=>$.ajax(....));
More on that: http://www.html5rocks.com/en/tutorials/es6/promises/
EDIT:
Another benefit is writing a sequence of N async calls without N levels of indentation.
Also, while I still don't think it's the main point, now I think they are a little bit more loosely coupled for this reasons:
They are standard (or at least try): code in C# or Java that uses strings are more lousy coupled than similar code in C++, because the different implementations of strings there, making it more reusable. Having an standard promise, the caller and the implementation are less coupled to each other because they don't have to agree on a (pair) of custom callbacks with custom parameters orders, names, etc... The fact that there are many different flavors on promises doesn't help thought.
They promote a more expression-based programming, easier to compose, cache, etc..:
var cache: { [key: string] : Promise<any> };
function getData(key: string): Promise<any> {
return cache[key] || (cache[key] = getFromServer(key));
}
you can argue that expression based programming is more loosely coupled than imperative/callback based programming, or at least they pursue the same goal: composability.
Promises reify the concept of delayed response to something.
They make asynchronous computation a first-class citizen as you can pass it around. They allow you to define structure if you want - monadic structure that is - upon which you can build higher order combinators that greatly simplify the code.
For example you can have a function that takes an array of promises and returns a promise of an array(usually this is called sequence). This is very hard to do or even impossible with callbacks. And such combinators don't just make code easier to write, they make it much easier to read.
Now consider it the other way around to answer your question. Callbacks are an ad-hoc solution where promises allow for clearer structure and re-usability.
They aren't, this is just a rationalization that people who are completely missing the point of promises use to justify writing a lot more code than they would write using callbacks. Given that there is obviously no benefit in doing this, you can at least always tell yourself that the code is less coupled or something.
See what are promises and why should I use them for actual concrete benefits.
I have a simple program in node.js, such as:
// CODE1
step1();
step2();
var o = step3();
step4(o);
step5();
step6();
this program is meant to be run in a stand-alone script (not in a web browser),
and it is a sequential program where order of execution is important (eg, step6 needs to be executed after step5).
the problem is that step3() is an async function, and the value 'o' is actually passed to a given callback,
so I would need to modify the program as follows:
// CODE2
step1();
step2();
step3( function (o) {
step4(o);
step5();
step6();
})
it could make sense to call step4 from the callback function, because it depends on the 'o' value computed by step3.
But step5 and step6 functions do not depend on 'o', and I have to include them in that callback function only to preserve the order of execution: step3, then step4, then step5 then step6.
this sounds really bad to me.
this is a sequential program, and so I would like to convert step3 to a sync function.
how to do this?
I am looking for something like this (eg using Q):
// CODE3
step1();
step2();
var deferred = Q.defer();
step3(deferred.resolve);
deferred.blockUntilFulfilled() // this function does not exist; this is what i am looking for
var o = deferred.inspect().value
step4(o);
step5();
step6();
How to do this?
ps: there are advantages and disadvantages of using sync or async, and this is a very interesting discussion. however, it is not the purpose of this question. in this question, i am asking how can i block/wait until a promise (or equivalent) gets fulfilled. Please, please, please, do not start a discussion on whether sync/blocking is good or not.
It's impossible to turn an async operation into a sync operation in vanilla JavaScript. There are things like node-fibers (a C++ add-on) that will allow this, or various compile-to-JS languages that will make the async operations look sync (essentially by rewriting your first code block to the second), but it is not possible to block until an async operation completes.
One way to see this is to note that the JavaScript event loop is always "run to completion," meaning that if you have a series of statements, they will always be guaranteed to execute one after the other, with nothing in between. Thus there is no way for an outside piece of information to come in and tell the code to stop blocking. Say you tried to make it work like so:
stepA();
stepB();
while (global.operationIsStillGoing) {
// do nothing
}
stepC();
This will never work, because due to the run-to-completion nature of JavaScript, it is not possible for anything to update global.operationIsStillGoing during the while loop, since that series of statements has not yet run to completion.
Of course, if someone writes a C++ addon that modifies the language, they can get around this. But that's not really JavaScript any more, at least in the commonly understood sense of ECMAScript + the event loop architecture.
As a rule of thumb, which of these methods of writing cross-browser Javascript functions will perform better?
Method 1
function MyFunction()
{
if (document.browserSpecificProperty)
doSomethingWith(document.browserSpecificProperty);
else
doSomethingWith(document.someOtherProperty);
}
Method 2
var MyFunction;
if(document.browserSpecificProperty) {
MyFunction = function() {
doSomethingWith(document.browserSpecificProperty);
};
} else {
MyFunction = function() {
doSomethingWith(document.someOtherProperty);
};
}
Edit: Upvote for all the fine answers so far. I've fixed the function to a more correct syntax.
Couple of points about the answers so far - whilst in the majority of cases it is a fairly pointless performance enhancement, there are a few reasons one might want to still spend some time analyzing the code:
Has to run on
slow computers, mobile devices, old
browsers etc.
Curiosity
Use the same
general principal to performance
enhance other scenarios where the
evaluation of the IF statement does
take some time.
Unless you're doing this a trillion times, it doesn't matter. Go with the one that is more readable and maintainable to you and/or your organization. The productivity gains you will get from writing clean, simple code matters way more than shaving a tenth of a microsecond off your JS execution time.
You should only even start thinking about what performs better when and only when you've written code and it is unacceptably slow. Then you should start tracking down the bottleneck, which will never be something like this. You will never get a measurable performance gain out of switching from one to the other here.
Unfortunately the code above is not actually cross-browser friendly as it relies on a mozilla quirk not present in other browsers -- namely that function statements are treated as function expressions inside branches. On browsers other that aren't built on mozilla the above code will always use the second function definition. I made a simple testcase to demonstrate this here.
Basically the ECMAScript spec says that function statements are treated similarly to var declarations, eg. they all get hoisted to the top of the current execution scope (eg. the start of a <script> tag, the start of a function, or the start of an eval block).
To clarify olliej's answer, your second method is technically a syntax error. You could rewrite it this way:
var MyFunction;
if(document.browserSpecificProperty) {
MyFunction = function() {
doSomethingWith(document.browserSpecificProperty);
};
} else {
MyFunction = function() {
doSomethingWith(document.someOtherProperty);
};
}
Which is at least correct syntax, but note that MyFunction would only be available in the scope in which that occurs. (Omit var MyFunction;, and preferably use window.MyFunction = function() ... for global.)
Technically, I would say that the second one would perform better, because the if statement is only executed once, rather than every time the function is run.
The difference, however, would be negligible to the point of being meaningless. The performance penalty of a single if statement such as this would be insignificant even compared to the performance penalty of simply calling a function. It would make a smallish difference even if if is called a million times.
The first one is easier to understand, because it doesn't have the awkwardness of defining the same function twice based on a condition, with both versions behaving differently. That seems to be a recipe for confusion later on.
I wouldn't be the first person to say that unless you are really insane about this optimization thing, you'll get more of a win out of code readability.
I generally prefer the second version, as the condition only has to be evaluated once and not on every call, but there are times when it's not really feasible because it will hamper readability.
Btw, this is a case where you might want to use the ?: operator, e.g (taken from production code):
var addEvent =
document.addEventListener ? function(type, listener) {
document.addEventListener(type, listener, false);
} :
document.attachEvent ? function(type, listener) {
document.attachEvent('on' + type, listener);
} :
throwError;
For your simplified example I would do what's below assuming that your browser property check only needs to be done once:
var MyFunction = (function() {
var rightProperty = document.browserSpecificProperty || document.someOtherProperty;
return function doSomethingWith() {
// use the rightProperty variable in your function
}
})();
The performance should be nearly equal!
Thing about using Frameworks like JQuery to get rid of the Browser compability problems!
If performance is your main goal, have a look at SlickSpeed! It is a page which benchmarks different JavaScript frameworks!