ES6 - Use of Generator Functions in JavaScript - javascript

It's been a few days since I learned about generator functions in JavaScript, which were introduced in ES6.
Generators are explained in various places, but what intrigues me is the fact that everywhere it is said that
A generator function is a way to write async code synchronously.
The question I want to raise is: why is there so much need to introduce a completely different programming strategy just for the sake of it?
I understand that the async nature of JS code makes it difficult for a newbie to understand and debug, but does it require a complete change in coding style altogether?
I may be wrong, or may not completely understand the concept behind its introduction, but being inquisitive about it all is what compelled me to ask this question.

Because closures are less convenient for simple iteration; simplifying syntax for reasonably common tasks can be worth it, even if the language supported the same pattern before. Just compare:
function chain() {
  var args = Array.from(arguments);
  return function() {
    if (args.length === 0) return undefined; // Or some other sentinel
    var nextval = args[0].shift(); // Destructive to avoid copies or more closure vars
    if (args[0].length === 0) args.shift();
    return nextval;
  };
}
var x;
// a, b and c must be indexable, e.g. Arrays; we can't handle other closures without
// requiring some API-specific protocol for generation
for (var nextchain = chain(a, b, c); (x = nextchain()) !== undefined;) {
  // do stuff with current value
}
to:
function* chain() {
  for (var i = 0; i < arguments.length; ++i)
    yield* arguments[i];
}
// a, b and c can be any iterable object; yield* can handle
// strings, Arrays, other generators, etc., all with no special handling
for (var x of chain(a, b, c)) {
  // do stuff with current value
}
Sure, the savings in lines of code aren't incredible. It's mostly just reducing boilerplate and unnecessary names, removing the need to deal with closures for simple cases, and with the for...of syntax, providing a common mechanism to iterate arbitrary iterable things, rather than requiring the user to explicitly construct the initial closure and advance it by name. But if the pattern is common enough, that's useful enough.
As noted in the comments, a, b and c must be Array-like for the closure-based approach (or you'd need a different closure-based approach in which the writer of chain imposes arbitrary requirements on whatever is passed to it, with special cases for Array-like inputs vs. generator-like closures), and processing is destructive (you'd need to add more closure state or make copies to make it non-destructive, making it more complex or slower). For the generator-based approach with yield*, no special cases are required. This makes generators composable without complex specs; they can build on one another easily.
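For instance, a quick sketch of that composability (the inputs here are made up): since yield* delegates to any iterable, a generator object produced by chain can itself be passed back into chain.
var a = [1, 2];
var b = "xy"; // strings are iterable too
var c = chain([3], [4]); // a generator object, itself an iterable
for (var x of chain(a, b, c)) console.log(x); // 1 2 x y 3 4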

Scenario 1 (Async): You might have heard of the importance of writing “non-blocking” JavaScript. When we do an I/O operation, we use callbacks or promises to write non-blocking JavaScript code.
Scenario 2 (Sync): Running an infinite loop in JavaScript, e.g. node -e 'while(true) {}', will possibly freeze your computer.
With all of this in mind, ES6 generators allow us to effectively “pause” execution in the middle of a function and resume it at some point in the future (async code in a synchronous style).
Use case: Imagine you need to do something with an infinite sequence of values. Arrays won't be helpful in that case; instead we could use an ES6 generator function.
var iterator = generateRandoms(); // suppose this is a generator function that loops through an infinite sequence
// a generator function returns an iterator whose next() method can be called anytime in your code to get the next value from the sequence
console.log(iterator.next()); // { value: 0.4900301224552095, done: false }
console.log(iterator.next()); // { value: 0.8244022422935814, done: false }
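The answer leaves generateRandoms undefined; a minimal sketch of what such a generator might look like (the body is an assumption inferred from the sample output above):
function* generateRandoms() {
  while (true) { // infinite sequence: each next() call resumes here for one step
    yield Math.random();
  }
}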
And as far as complexity is concerned, it's new syntax, but it would not take long to grasp.
For further reading:
http://x-team.com/2015/04/generators-work/
https://msdn.microsoft.com/en-in/library/dn858237(v=vs.94).aspx

Actually, generator functions are not so much about asynchrony; they are more about a function that can be interrupted. How the flow is interrupted is determined by the caller of the generator, through the iterator -- more explanation below.
function* ab3() {
  console.log(1);
  var c = yield 2;
  console.log(2);
  return 1 + c;
  console.log(3); // unreachable: execution never continues past the return
}
A generator function is a function that can be paused partway through its execution. Calling it returns an iterator; execution only starts when iterator.next() is called.
The first call of next() runs the statements of the generator function up to the first yield and returns the yielded value.
The second call of iterator.next() runs until the next yield or return statement. It takes the data passed in through next(3) (3 in this example) and stores it in the variable the yield expression is assigned to -- var c = yield 2, so 3 is stored in c -- then executes statements up to the return statement. It returns {value: 4, done: true} here, because we have reached the end of the generator function.
The next (third) call of iterator.next() returns {value: undefined, done: true}, since nothing further is produced by the generator function.
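Tracing ab3 step by step makes this concrete:
var it = ab3();
console.log(it.next()); // logs 1, then pauses: { value: 2, done: false }
console.log(it.next(3)); // logs 2, runs to the return: { value: 4, done: true }
console.log(it.next()); // { value: undefined, done: true }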

Related

How to avoid JavaScript iterators computing before using array methods in chaining

Generators/iterators let you manipulate a sequence of objects without having to build the whole sequence first, which can save memory on the machine.
Array methods like map, reduce, find, forEach, etc. are essential in functional programming because they let you write code as a sequence of pure functions, which is easier to test.
I would like a fair solution that lets me use generators/iterators together with the array functions.
Let's have a look at this code:
function* myGen() {
  yield 1;
  yield 2;
  yield 3;
  throw new Error('oups');
}
const y = [...myGen()].find(x => x > 2);
assert.deepStrictEqual(y, 3);
This will not work, because the spread happens first, taking all the memory and running the generator to its end, where it throws the exception.
In this example, I was hoping JavaScript would execute the find callback for each element before computing the iterator's next value.
I would like a solution that allows me to use the array methods, without building a full array first. Is it possible in Javascript?
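One way to get that laziness, sketched as a hand-rolled helper (lazyFind is a made-up name, not a built-in): pull from the iterator one element at a time with for...of, which stops at the first match, so the throw is never reached. Newer engines also ship iterator helpers (a native find on iterators) that do exactly this.
function lazyFind(iterable, predicate) {
  for (const x of iterable) {
    if (predicate(x)) return x; // early exit also closes the generator
  }
  return undefined;
}
const y = lazyFind(myGen(), x => x > 2); // 3 -- the Error line never runs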

Why does Javascript `iterator.next()` return an object?

Help! I'm learning to love Javascript after programming in C# for quite a while but I'm stuck learning to love the iterable protocol!
Why did Javascript adopt a protocol that requires creating a new object for each iteration? Why have next() return a new object with properties done and value, instead of adopting a protocol like C#'s IEnumerable and IEnumerator, which allocates no object at the expense of requiring two calls (one to moveNext to see if the iteration is done, and a second to current to get the value)?
Are there under-the-hood optimizations that skip the allocation of the object returned by next()? Hard to imagine, given the iterable doesn't know how the object could be used once returned...
Generators don't seem to reuse the next object as illustrated below:
function* generator() {
  yield 0;
  yield 1;
}
var iterator = generator();
var result0 = iterator.next();
var result1 = iterator.next();
console.log(result0.value) // 0
console.log(result1.value) // 1
Hm, here's a clue (thanks to Bergi!):
We will answer one important question later (in Sect. 3.2): Why can iterators (optionally) return a value after the last element? That capability is the reason for elements being wrapped. Otherwise, iterators could simply return a publicly defined sentinel (stop value) after the last element.
And in Sect. 3.2 they discuss using generators as lightweight threads. It seems to say the reason for returning an object from next is so that a value can be returned even when done is true! Whoa. Furthermore, generators can return values in addition to yield-ing and yield*-ing values, and a value produced by return ends up in value when done is true!
And all this allows for pseudo-threading. And that feature, pseudo-threading, is worth allocating a new object for each time around the loop... Javascript. Always so unexpected!
Although, now that I think about it, allowing yield* to "return" a value to enable pseudo-threading still doesn't justify returning an object. The IEnumerator protocol could be extended to return a value after moveNext() returns false -- just add a property hasCurrent, to test after the iteration is complete, that when true indicates current has a valid value...
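For concreteness, a hypothetical sketch of that C#-style, allocation-free protocol in JavaScript (enumerate and its members are made-up names):
function enumerate(arr) {
  var i = -1;
  return {
    moveNext: function () { return ++i < arr.length; }, // call 1: advance
    get current() { return arr[i]; } // call 2: read; no result object per step
  };
}
var e = enumerate([10, 20]);
while (e.moveNext()) console.log(e.current); // 10, 20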
And the compiler optimizations are non-trivial. This will result in quite wild variance in the performance of an iterator... doesn't that cause problems for library implementors?
All these points are raised in this thread discovered by the friendly SO community. Yet, those arguments didn't seem to hold the day.
However, regardless of returning an object or not, no one is going to be checking for a value after iteration is "complete", right? E.g. most everyone would think the following would log all values returned by an iterator:
function logIteratorValues(iterator) {
  var next;
  while(next = iterator.next(), !next.done)
    console.log(next.value)
}
Except it doesn't, because even when done is true the iterator might still have returned another value. Consider:
function* generator() {
  yield 0;
  return 1;
}
var iterator = generator();
var result0 = iterator.next();
var result1 = iterator.next();
console.log(`${result0.value}, ${result0.done}`) // 0, false
console.log(`${result1.value}, ${result1.done}`) // 1, true
Is an iterator that returns a value after it's "done" really an iterator? What is the sound of one hand clapping? It just seems quite odd...
And here is an in-depth post on generators that I enjoyed. Much time is spent controlling the flow of an application as opposed to iterating members of a collection.
Another possible explanation is that IEnumerable/IEnumerator requires two interfaces and three methods, and the JS community preferred the simplicity of a single method. That way they wouldn't have to introduce the notion of groups of symbolic methods, a.k.a. interfaces...
Are there under-the-hood optimizations that skip the allocation of the object return by next()?
Yes. Those iterator result objects are small and usually short-lived. Particularly in for … of loops, the compiler can do a trivial escape analysis to see that the object isn't visible to user code at all (only to the internal loop evaluation code). They can be dealt with very efficiently by the garbage collector, or even be allocated directly on the stack.
Here are some sources:
JS inherits its functionally-minded iteration protocol from Python, but with result objects instead of the previously favoured StopIteration exceptions
Performance concerns in the spec discussion (cont'd) were shrugged off. If you implement a custom iterator and it is too slow, try using a generator function
(At least for builtin iterators) these optimisations are already implemented:
The key to great performance for iteration is to make sure that the repeated calls to iterator.next() in the loop are optimized well, and ideally completely avoid the allocation of the iterResult using advanced compiler techniques like store-load propagation, escape analysis and scalar replacement of aggregates. To really shine performance-wise, the optimizing compiler should also completely eliminate the allocation of the iterator itself - the iterable[Symbol.iterator]() call - and operate on the backing-store of the iterable directly.
Bergi answered already, and I've upvoted, I just want to add this:
Why should you even be concerned about a new object being returned? It looks like:
{done: boolean, value: any}
You know you are going to use the value anyway, so it's really not extra memory overhead. What's left? The done boolean and the object itself take up to 8 bytes each, the smallest addressable unit of memory, and are allocated and processed by the CPU in a few pico- or nanoseconds (I think it's pico-, given the likely-existing V8 optimizations). Now, if you still care about wasting that amount of time and memory, then you really should consider switching to something like Rust+WebAssembly from JS.

Conditional evaluation of callback arguments

Recently I've been getting into the JavaScript ecosystem. After some time with JavaScript's callbacks, I started asking myself whether JavaScript interpreters are capable of doing conditional evaluation of callback arguments. Take the following two examples:
var a = 1;
var b = 2;

// example 1
abc.func(a, b, function (res) {
  // do something with res
});

// example 2
abc.func(a, b, function () {
  // do something
});
From what I understand, Javascript uses the arguments object to keep track of what is passed into a function. This is regardless of what the function definition is. So assuming that:
abc.func = function (a, b, cb) {
  // do stuff
  var res = {};
  // Expensive computation to populate res
  cb(res);
}
In both examples (1, 2) the res object will be passed as arguments[0]. In example 1, res === arguments[0], since the res parameter is defined.
Let's assume that computing res is expensive. In example 1 it's fine to go through this computation, since the res object is used. In example 2, since the res object is not used, there really is no point in doing that computation. But since the arguments object needs to be populated, the computation is done in both cases. Is this correct?
Assuming that's true, this seems like a (potentially) huge waste. Why compute something that's going to go out of scope and be garbage collected? Think of all the libraries out there that use callbacks. A lot of them send multiple arguments back to the callback function, but sometimes none of them are used.
Is there a way to prevent this behaviour -- essentially make the JavaScript interpreter smart enough not to compute those specific variables that would turn into unused arguments? So in example 2 the res object would not actually be computed, since it will never actually be used.
I understand that until this point things like this were used:
function xyz(a, b /*, rest */) {
  // use arguments to iterate over rest
}
So by default it makes sense to still compute those arguments. Now let's look forward to ECMAScript 2015, which includes the ...rest parameter syntax. So for engines that support the new version, is there a way to enable conditional evaluation? This would make much more sense, since now there is a way to explicitly ask for all extra arguments to be evaluated and passed to a function.
No, JavaScript is not a lazy call-by-name language. This is mostly because expressions can have side effects, and the ES standard requires them to be executed in the order the programmer expects.
Yes, JS engines are smart. If they detect that code has no side effects and its results are not used anywhere, they just drop it (dead code elimination). I'm not sure whether this works across function boundaries; I guess it doesn't, but if you are in a hot code path and the call does get inlined, it might be optimised away.
So if you know that you are doing a heavy computation, you may want to make it lazy explicitly by passing a thunk. In an eagerly evaluated language, a thunk is typically represented by a function that takes no parameters. In your case:
abc.func = function (a, b, cb) {
  // do stuff
  var res = {};
  cb(() => {
    // Expensive computation to populate res
    return res;
  });
}

// example 1
abc.func(a, b, function(getRes) {
  var res = getRes();
  // do something with res
});

// example 2
abc.func(a, b, function() {
  // no heavy computation
  // do something
});
You couldn't do this at the interpreter level; it's not feasible to determine whether computing one argument depends on computing another, and even if you could, it would create inconsistent behaviour for the user. And because passing variables into a function is extremely cheap, this becomes a pointless exercise.
It could be done at the function level: if you wanted to, you could pass the expected arguments of the callback as a parameter to the function, thereby augmenting the behaviour of the function based on its parameters, which is commonplace. See the sketch below.
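A sketch of that function-level approach, using the callback's declared arity (cb.length); this is only an illustration, and note that cb.length is unreliable when the callback uses rest or default parameters:
abc.func = function (a, b, cb) {
  if (cb.length >= 1) { // the callback declares a res parameter
    var res = {};
    // Expensive computation to populate res, done only when requested
    cb(res);
  } else {
    cb(); // skip the expensive work entirely
  }
};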

Javascript recursive function inside a for loop

var f_drum_min = function myself(a){
  alert(a);
  $f_min_node.push(a);
  for (i=0;i<=$m;i++){
    if ($f_leg[i][1]==a){
      myself($f_leg[i][0]);
    }
  }
};
The call myself($f_leg[i][0]); breaks the for loop. How can I make it run multiple times in the loop?
Your function is riddled with bad habits
There's no way for me to improve this function because I have no idea what all of those external states do. Nor is it immediately apparent what their data types are.
These are bad habits because there's no possible way to know the effect of your function. Its only input is a, yet the function depends on $f_min_node, $f_leg, and $m.
What is the value of those variables at the time you call your function?
What other functions change those values?
I assigned $f_min_node to some value and then called f_drum_min. How was I supposed to know that $f_min_node was going to get changed?
Every time you call your function, it's a big surprise what happens as a result. These are the woes of writing non-deterministic ("impure") functions.
Until you can fix these problems, recursion in a for-loop is the least of your concerns
I have annotated your code with comments here
// bad function naming. what??
var f_drum_min = function myself(a){
  // side effect
  alert(a);
  // external state: $f_min_node
  // mutation: $f_min_node
  $f_min_node.push(a);
  // leaked global: i
  // external state: $m
  for (i=0;i<=$m;i++){
    // external state: $f_leg
    // loose equality operator: ==
    if ($f_leg[i][1]==a){
      myself($f_leg[i][0]);
    }
  }
};
I can help you write a deterministic recursive function that uses a linear iterative process though. Most importantly, it doesn't depend on any external state and it never mutates an input.
// Number -> Number
var fibonacci = function(n) {
  function iter(i, a, b) {
    if (i === 0)
      return a;
    else
      return iter(i-1, b, a+b);
  }
  return iter(n, 0, 1);
}
fibonacci(6); // 8
for loops are pretty primitive; imperative programmers reach for them almost immediately, thinking they're the only way to solve an iteration problem.
I could have used a for loop in this function, but thinking about the problem in a different way allows me to express it differently and avoid the for loop altogether.
One basic problem with the code, which would cause it to break under almost any circumstances, is that the loop variable i is a global, and is thus shared by all recursive invocations of the function.
For example, let's say the function is invoked for the first time. i is 0. Now it recurses, and let's say that the condition in the if is never true. At the end of the 2nd call, i = $m + 1. When you return to the first call, because i is global, the loop in the first call ends. I assume this is not what you want.
The fix for this is to declare i as local:
for (var i=0;i<=$m;i++){
This may or may not fix all of your problems (as pointed out in comments, we'd have to see more of your code to identify all possible issues), but it is a critical first step.
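To make the failure mode and the fix concrete, here is a self-contained sketch with made-up data (edges stands in for $f_leg as [child, parent] pairs, visited for $f_min_node):
var edges = [['b', 'a'], ['c', 'a'], ['d', 'b']];
var visited = [];
function collect(a) {
  visited.push(a);
  for (var i = 0; i < edges.length; i++) { // 'var i' is local to each call
    if (edges[i][1] === a) collect(edges[i][0]);
  }
}
collect('a');
console.log(visited); // ['a', 'b', 'd', 'c'] -- every branch is visited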

What are ES6 generators and how can I use them in node.js?

I was at a node.js meetup today, and someone I met there said that node.js has ES6 generators. He said that this is a huge improvement over callback-style programming, and that it would change the node landscape. IIRC, he said something about the call stack and exceptions.
I looked them up, but haven't really found any resource that explains them in a beginner-friendly way. What's a high-level overview of generators, and how are they different from (or better than) callbacks?
PS: It'd be really helpful if you could give a snippet of code to highlight the difference in common scenarios (making an http request or a db call).
Generators, fibers and coroutines
"Generators" (besides being "generators") are also the basic buildings blocks of "fibers" or "coroutines". With fibers, you can "pause" a function waiting for an async call to return, effectively avoiding to declare a callback function "on the spot" and creating a "closure". Say goodbye to callback hell.
Closures and try-catch
...he said something about call stack and exceptions
The problem with "closures" is that even if they "magically" keep the state of the local variables for the callback, a "closure" can not keep the call stack.
At the moment of callback, normally, the calling function has returned a long time ago, so any "catch" block on the calling function cannot catch exceptions in the async function itself or the callback. This presents a big problem. Because of this, you can not combine callbacks+closures with exception catching.
Wait.for
...and would change the node landscape
If you use generators to build a helper lib like Wait.for-ES6 (I'm the author), you can completely avoid the callback and the closure; now "catch" blocks work as expected, and the code is straightforward.
It'd be really helpful if you could give a snippet of code to highlight the difference in common scenarios (making an http request or a db call).
Check Wait.for-ES6 examples, to see the same code with callbacks and with fibers based on generators.
UPDATE 2021: All of this has been superseded by javascript/ES2020 async/await. My recommendation is to use Typescript and async/await (which is based on Promises also standardized)
Generators are one of many features in the upcoming ES6, so in the future it will be possible to use them in browsers (right now you can play with them in FF).
Generators are constructors for iterators. That sounds like gibberish, so in easier terms: they allow you to create objects that can later be iterated (with something like a for loop) using the .next() method.
Generators are defined similarly to functions, except they have * and yield in them. The * tells that this is a generator; yield is similar to return.
For example this is a generator:
function *seq(){
  var n = 0;
  while (true) yield n++;
}
Then you can use this generator with var s = seq(). But in contrast to a function, it will not execute everything and give you a result; it will just instantiate the generator. Only when you run s.next() will the generator execute. Here yield is similar to return, but when the yield runs, it pauses the generator, which then continues from the expression after the yield on the next call. When the next s.next() is called, the generator resumes its execution. In this case, it will continue doing the while loop forever.
So you can iterate this with
for (var i = 0; i < 5; i++){
  console.log( s.next().value )
}
or with the for...of construct, which works on iterables such as generators:
for (var n of seq()){
  if (n >= 5) break;
  console.log(n);
}
These are basics about generators (you can look at yield*, next(with_params), throw() and other additional constructs). Note that it is about generators in ES6 (so you can do all this in node and in browser).
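A quick taste of those extra constructs (echo is a made-up example):
function* echo() {
  try {
    while (true) {
      var received = yield; // next(v) resumes here, with v as the yield's value
      console.log('got', received);
    }
  } catch (e) {
    console.log('caught', e.message); // gen.throw(e) surfaces at the paused yield
  }
}
var g = echo();
g.next(); // prime: run to the first yield
g.next('hello'); // got hello
g.throw(new Error('bye')); // caught bye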
But what does this infinite number sequence have to do with callbacks?
The important thing here is that yield pauses the generator. So imagine you have a very strange system that works this way:
You have a database of users, and you need to find the name of a user with some ID, then check your file system for the key for this user's name, and then connect to some ftp with the user's id and key and do something after connecting. (Sounds ridiculous, but I want to show nested callbacks.)
Previously you would write something like this:
var ID = 1;
database.find({user : ID}, function(userInfo){
  fileSystem.find(userInfo.name, function(key){
    ftp.connect(ID, key, function(o){
      console.log('Finally '+o);
    })
  })
});
Which is callback inside callback inside callback inside callback. Now you can write something like:
function *logic(ID){
  var userInfo = yield database.find({user : ID});
  var key = yield fileSystem.find(userInfo.name);
  var o = yield ftp.connect(ID, key);
  console.log('Finally '+o);
}
var s = logic(1);
And then you drive it with s.next() (in practice via a small runner that waits for each yielded operation and passes its result back through next(result), as libraries like co do). As you can see, there are no nested callbacks.
Because node heavily uses nested callbacks, this is why the guy was saying that generators can change the landscape of node.
A generator is a combination of two things - an Iterator and an Observer.
Iterator
An iterable is something you can iterate over; calling its Symbol.iterator method returns an iterator, which does the actual stepping. From ES6 onwards, the ordinary collections (Array, Map, Set, String) conform to the iterable contract.
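For example, any object that exposes a Symbol.iterator method satisfies the contract (bag is a small made-up sketch):
const bag = {
  *[Symbol.iterator]() { yield 'a'; yield 'b'; }
};
for (const v of bag) console.log(v); // a, b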
A generator (an iterator) is a producer. In iteration, the consumer PULLs values from the producer.
Example:
function *gen() { yield 5; yield 6; }
let a = gen();
Whenever you call a.next(), you're essentially pulling a value from the iterator and pausing the execution at the yield. The next time you call a.next(), the execution resumes from the previously paused state.
Observer
A generator is also an observer, by means of which you can send values back into the generator. This is explained better with an example.
function *gen() {
  document.write('<br>observer:', yield 1);
}

var a = gen();
var i = a.next();
while(!i.done) {
  document.write('<br>iterator:', i.value);
  i = a.next(100);
}
Here you can see that yield 1 is used like an expression that evaluates to some value. The value it evaluates to is the value sent as an argument to the a.next call.
So, the first time, i.value is the first value yielded (1), and when continuing the iteration to the next state, we send a value back into the generator using a.next(100).
Where can you use this in Node.JS?
Generators are widely used with a spawn function (from taskJS or co), which takes a generator and allows us to write asynchronous code in a synchronous fashion. This does NOT mean that async code is converted to sync code or executed synchronously. It means that we can write code that looks sync but internally is still async.
Sync is BLOCKING; async is WAITING. Writing code that blocks is easy. When PULLing, the value appears in the assignment position; when PUSHing, the value appears in the argument position of the callback.
When you use iterators, you PULL values from the producer. When you use callbacks, the producer PUSHes values into the argument position of the callback.
var i = a.next() // PULL
dosomething(..., v => {...}) // PUSH
Here, you pull the value out of a.next(); in the second line, v => {...} is the callback, and a value is PUSHed into its argument position v.
Using this pull-push mechanism, we can write asynchronous code like this:
let delay = t => new Promise(r => setTimeout(r, t));

spawn(function*() {
  // wait for 100 ms and send 1
  let x = yield delay(100).then(() => 1);
  console.log(x); // 1

  // wait for 100 ms and send 2
  let y = yield delay(100).then(() => 2);
  console.log(y); // 2
});
So, looking at the above code, we are writing async code that looks like it's blocking (the yield expressions wait for 100 ms and then continue execution), but it's actually waiting. The pause-and-resume property of generators allows us to do this amazing trick.
How does it work?
The spawn function uses the yielded promise to PULL the promise state from the generator, waits until the promise is resolved, and PUSHes the resolved value back into the generator so it can consume it.
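A minimal sketch of what such a spawn function might look like (the real taskJS/co implementations handle more edge cases; this version assumes each yielded value is a promise or a plain value):
function spawn(genFn) {
  return new Promise((resolve, reject) => {
    const gen = genFn();
    function step(nextFn) {
      let result;
      try {
        result = nextFn(); // run the generator to its next yield
      } catch (e) {
        return reject(e); // the generator threw synchronously
      }
      if (result.done) return resolve(result.value); // generator finished
      Promise.resolve(result.value).then(
        v => step(() => gen.next(v)), // PUSH the resolved value back in
        e => step(() => gen.throw(e)) // surface rejections at the paused yield
      );
    }
    step(() => gen.next(undefined)); // prime the generator
  });
}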
Use it now
So, with generators and spawn function, you can clean up all your async code in NodeJS to look and feel like it's synchronous. This will make debugging easy. Also the code will look neat.
BTW, this is coming to JavaScript natively in ES2017 as async...await. But you can use it today in ES2015/ES6 and ES2016 via the spawn function defined in libraries like taskjs, co, or bluebird.
Summary:
function* defines a generator function, which returns a generator object. The special thing about a generator function is that its body doesn't execute when it is called with the () operator; instead, an iterator object is returned.
This iterator has a next() method. The next() method returns an object with a value property containing the yielded value, and a done property, a boolean that becomes true once the generator function has run to completion.
Example:
function* IDgenerator() {
  var index = 0;
  yield index++;
  yield index++;
  yield index++;
  yield index++;
}
var gen = IDgenerator(); // generates an iterator object
console.log(gen.next().value); // 0
console.log(gen.next().value); // 1
console.log(gen.next().value); // 2
console.log(gen.next()); // { value: 3, done: false }
console.log(gen.next()); // { value: undefined, done: true }
In this example we first create an iterator object. On this iterator object we can then call the next() method, which lets us jump from yield to yield. We get back an object which has both a value and a done property.
How is this useful?
Some libraries and frameworks might use this construct to wait for the completion of asynchronous code, for example redux-saga.
async await, the new syntax that lets you wait for async events, uses this mechanism under the hood, so knowing how generators work will give you a better understanding of how that construct works.
To use ES6 generators in node, you will need to install node >= 0.11.2 or iojs.
In node, you will need to pass the harmony flag:
$ node --harmony app.js
or you can explicitly just reference the generators flag
$ node --harmony_generators app.js
If you've installed iojs, you can omit the harmony flag.
$ iojs app.js
For a high-level overview of how to use generators, check out this post.
