JavaScript Generators

Chapter 14 38 mins

Learning outcomes:

  1. Background to generators
  2. What is a generator
  3. How generators work
  4. The yield keyword

Introduction

Undoubtedly, going over predefined iterable sequences such as strings, arrays and HTMLCollection lists is amazing and, in fact, a common concern of most applications.

But wouldn't it be even more amazing if we could somehow generate iterable sequences on-the-go?

No, no - we don't mean making given objects iterable by defining an @@iterator() method on them, or generating an array or a string. Rather, we mean being able to generate a sequence without having to store it all at once, as otherwise happens in an array.

For example, suppose we want to work with a sequence of all even numbers up to one million, or all square numbers less than seventy thousand, or for that matter, all prime numbers less than a given number N.

In all these cases, we don't want to create the desired sequences all at once, but rather generate them one-by-one.

So can you think of any way to do this? Well, what we need is a generator...

What is a generator?

Before we begin, know that we won't answer this question by giving a rigorous definition of generators. Rather we'll start slowly and build up gradually towards the rigorous definition near the end of this section.

It's important for you to appreciate each aspect of a generator and be able to use that to formally define what exactly a generator is.

So let's begin...

Essentially:

Anything that has the capability of defining a sequence, is a generator.

Obviously, the 'anything' referred to here can only be a function since all other data types can't define a sequence.

This is because only a function can be called, passed in configurable arguments and therefore produce some output. No other data type can do this.

The following code shows a simple generator function. It takes an argument and generates a sequence of positive integers up to that argument:

function positiveInts(n) {
   var i = 1;
   var max = (n < 1 || typeof n !== "number") ? 1 : n;
   return {
      next: function() {
         if (i > max) return {value: undefined, done: true}
         return {value: i++, done: false}
      }
   }
}

To be very precise, the function takes in an argument and returns an iterator object that will go up to that argument number.

The if statement in line 6 checks whether i has gone above max and, if it has, returns the object {value: undefined, done: true}.

If we create a seq iterator using the function above, it'll produce results shown as follows:

var seq = positiveInts(3);

seq.next(); // {value: 1, done: false}
seq.next(); // {value: 2, done: false}
seq.next(); // {value: 3, done: false}
seq.next(); // {value: undefined, done: true}

Note that the generated sequence doesn't exist anywhere all at once. Instead, it is defined lazily, i.e. each value exists in memory one at a time. The iterator returned by positiveInts() is what allows for this kind of behavior.

Always remember this idea - a generator simply defines a sequence with the help of an iterator object; it does NOT create the sequence all at once.
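To see this laziness in action, here is a minimal sketch that drains the iterator with an ordinary loop. The helper name sumUpTo is an assumption made for illustration (it isn't part of the example above); positiveInts() is repeated only so that the snippet runs on its own.

```javascript
// Same hand-written generator as above, repeated so the snippet is self-contained.
function positiveInts(n) {
   var i = 1;
   var max = (n < 1 || typeof n !== "number") ? 1 : n;
   return {
      next: function() {
         if (i > max) return {value: undefined, done: true};
         return {value: i++, done: false};
      }
   };
}

// sumUpTo is a hypothetical helper: it adds up the sequence value by value.
function sumUpTo(n) {
   var seq = positiveInts(n);
   var total = 0;
   var result = seq.next();
   while (!result.done) {
      total += result.value; // only the current value is held at any time
      result = seq.next();
   }
   return total;
}

console.log(sumUpTo(4)); // 10
```

Even for a large n, no array of n numbers is ever built; the loop only ever sees one value at a time.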

If a generator doesn't operate on an iterator then, by the specification, it isn't a generator and therefore not so cool!

Generators gotta be cool!

Remember that the word 'generator' has a special meaning in JavaScript which is discussed above and will be discussed in the section below.

If you have a function that generates something like an array, some HTML markup or whatever, that function can surely be called a generator (by definition of the word); but keep in mind that this generator won't be the generator JavaScript defines in its specification.

Now the generator we created above simply allowed us to generate any iterator object we like that can ultimately define a sequence of positive integers. However, the generator didn't allow for direct iteration on the sequence it defined.

What this means is highlighted as follows:

var seq = positiveInts(3);

// throws an error
for (var num of seq) {
   console.log(num);
}

See how we can't iterate over seq, despite the fact that it indirectly defines a sequence. The reason we can't do so is that it hasn't got any @@iterator() method on it.

By specification,

A generator function defines not just a sequence, but rather an iterable sequence.

This means that whatever the generator returns has the ability to be iterated over, as well.

Now these iterable sequences aren't like strings or arrays, which exist in memory all at once. Instead, their values are created on-the-go, one-by-one.

In technical terms, we refer to this as lazy evaluation.

Lazy evaluation means that values are created only when they are needed, not all at once. This aspect shouldn't be surprising to you - it's just how iterators work! You call the next() method and only then is the next item evaluated in the sequence.

The real question is how do we make a generator's return value iterable. Well, this takes a fair bit of thinking and coding!
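One possible way to do it, sketched below, is to give the returned iterator a method under the key Symbol.iterator (the actual property key behind the @@iterator notation) that simply returns the iterator itself. This is only one approach, assumed here for illustration; positiveInts() is the hand-written function from above with that single addition.

```javascript
function positiveInts(n) {
   var i = 1;
   var max = (n < 1 || typeof n !== "number") ? 1 : n;
   var iterator = {
      next: function() {
         if (i > max) return {value: undefined, done: true};
         return {value: i++, done: false};
      }
   };
   // Returning the iterator itself makes the object both an iterator
   // (it has next()) and an iterable (it has Symbol.iterator).
   iterator[Symbol.iterator] = function() {
      return this;
   };
   return iterator;
}

var collected = [];
for (var num of positiveInts(3)) {
   collected.push(num);
}
console.log(collected); // [1, 2, 3]
```

With this in place, the for...of loop no longer throws, since it can now obtain an iterator from the object it is given.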

Now that we've covered all the theory of what a generator is, let's boil it all down to the rigorous definition we were talking about at the start of this section.

Generators are functions that return iterators to lazily generate iterable sequences.

So hopefully, the idea of what a generator is, is clear to you by now.

Using generators, we can generate sophisticated sequences such as those for prime numbers, Fibonacci numbers, and so on. We can even query down these sequences further to our needs, for example getting only the first five primes, or the first hundred Fibonacci numbers.

However, JavaScript doesn't really like this complicated mess of coding generators in the old fashion. With ES6, it provides a handy bunch of tools and a convenient syntax to create generator functions that bring a new model of execution into the language.

Let's dive into it...

Generators in JavaScript

As we've just seen, coding the logic of even a simple generator can be a coffee-requiring job, especially for those who are new to the advanced side of JavaScript.

To get an idea, review the code above. We are only trying to define a sequence of positive integers up to a number N, but nonetheless the code looks as if we are solving some next-level rocket science in it.

The way JavaScript mitigates this is by introducing a special type of function that's meant to serve the purpose of generating iterable sequences.

And you got it - it's called a generator function!

Generator functions are created just like normal functions, with one slight addition - the function keyword is followed by an asterisk * symbol.

This is what distinguishes a generator function from a normal function.

Shown below is the general form of a generator:

function* functionName() {
   // function body
}

It isn't necessary to put the asterisk immediately after the function keyword. We can also put it after adding a space.

What's necessary is just to have an asterisk symbol appear somewhere between the function keyword and the function's name - it's all up to you how many spaces you want to add in between!
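As a quick sanity check, the sketch below declares four empty generator functions with different asterisk placements. The function names are made up for illustration, and Object.prototype.toString is used merely as a convenient way to confirm that each call hands back a generator object.

```javascript
function* tight() {}
function *leading() {}
function * spaced() {}
function*noSpace() {}

// Calling each one should give back a generator object, whatever the spacing.
var tags = [tight, leading, spaced, noSpace].map(function(fn) {
   return Object.prototype.toString.call(fn());
});
console.log(tags.join(", "));
// [object Generator], [object Generator], [object Generator], [object Generator]
```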

However, this alone won't do any wonders. There is one crucial thing missing in this general form that lies at the heart of generators, and that is the yield keyword.

The yield keyword basically defines a value in the iterable sequence defined by the generator. Each yielded value is, in effect, a value of the sequence, in the order it appears in the generator.

Well there is a lot more to be discussed about the yield keyword, especially its pause behavior. But before all that let's first go through a very simple example.

Below we create a generator function to define the sequence 1, 3, 5, using the yield keyword:

function* sequence() {
   yield 1;
   yield 3;
   yield 5;
}
var seq = sequence();

seq.next(); // {value: 1, done: false}
seq.next(); // {value: 3, done: false}
seq.next(); // {value: 5, done: false}
seq.next(); // {value: undefined, done: true}

As you may agree, it isn't that hard to understand what's happening in this code.

Each yield keyword defines the next value in the sequence - and since we have three yield keywords we have three values in the sequence that are 1, 3 and 5 respectively.

Let's see what happens when we use this sequence in a for...of loop:

var seq = sequence();

// works!
for (var num of seq) {
   console.log(num);
}

1
3
5

It works! The for...of loop is able to iterate over the invoked generator function.

So what exactly is happening over here?

It's time to head over to the explanation team!

How generators work

Let's review the code shown above:

function* sequence() {
   yield 1;
   yield 3;
   yield 5;
}

var seq = sequence();

And now let's dissect it....

When sequence() is called in line 7, the interpreter realises that it is a call to a generator function and accordingly returns an iterator right away - the function's body isn't executed at all.

How can we confirm this? Well it's not difficult...

function* sequence() {
   console.log("Started!");
   yield 1;
   yield 3;
   yield 5;
}

var seq = sequence(); // nothing logged

sequence() is called in line 8, but the console log in line 2 isn't made. This confirms that the generator function isn't executed immediately.

Rather an iterator is returned by a generator's call.

Each time its next() method is called, this returned iterator moves on to the next yield value defined in the generator, one-by-one.
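This also explains why the for...of loop accepted the invoked generator earlier. As the small sketch below suggests, the object returned by a generator function is both an iterator and an iterable: its Symbol.iterator method (the @@iterator method discussed before) returns the object itself.

```javascript
function* sequence() {
   yield 1;
   yield 3;
   yield 5;
}

var seq = sequence();

// The returned object has next(), so it is an iterator...
console.log(typeof seq.next); // "function"

// ...and its Symbol.iterator method returns the object itself, so it is iterable too.
console.log(seq[Symbol.iterator]() === seq); // true

// Hence anything that expects an iterable works, for example spread syntax.
var values = [...sequence()];
console.log(values); // [1, 3, 5]
```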

In the section above, we consumed this iterator manually, by repeatedly calling seq.next() four times. As expected, we got three objects wrapping up the three values 1, 3 and 5, respectively in the first three seq.next() calls. In the last call, we obviously got the object {value: undefined, done: true}.

When the next() method is called for the first time on the iterator seq, execution begins inside the generator and goes up to the first yield keyword, at which point it pauses.

Consider the code below:

function* sequence() {
   console.log("First!");
   yield 1;

   console.log("Second!");
   yield 3;

   console.log("Third!");
   yield 5;

   console.log("Done!");
}

var seq = sequence();

The moment we call seq.next(), here's what happens:

console.log(seq.next());

First!
{value: 1, done: false}

Execution begins inside the generator, starting from line 2. First console.log("First!") gets executed and then execution moves to line 3 - where we have the yield keyword.

The moment yield is encountered, execution pauses, this point is saved internally within the generator function, and finally the value 1 is assigned to the value property of the object to-be-returned by the seq.next() call.

Since we haven't gone further to confirm whether we're out of yield, done is set to false.

Finally we get the object {value: 1, done: false} logged.

Moving on, when we call seq.next() the second time, here's what happens.

console.log(seq.next());

Second!
{value: 3, done: false}

Execution resumes from line 3 - right from where it paused previously.

We'll see how this defines another generator behavior shortly in the section below.

It continues on, comes across the log statement, makes the respective log "Second!" and then moves over to line 6 - where we have another yield.

The moment yield is encountered, execution pauses, this point is, once again, saved internally within the generator function, and finally the value 3 is assigned to the value property of the object to-be-returned by the seq.next() call.

Again, since we don't know whether we really are out of yield we assume we aren't done yet and therefore set done to false.

Finally we get the object {value: 3, done: false} logged.

Calling seq.next() the third time gives us the following result:

console.log(seq.next());

Third!
{value: 5, done: false}

Execution resumes from line 6 - right from where it has paused previously.

It continues on, comes across the third log statement, makes the respective log "Third!" and then moves over to line 9 - where we have yet another yield.

As before, the moment yield is encountered, execution pauses, this point is saved internally within the generator function, and finally the value 5 is assigned to the value property of the object to-be-returned by the seq.next() call.

Since we don't even know yet whether we really are out of yield, we assume we aren't done and therefore set done to false.

Finally we get the object {value: 5, done: false} logged.

To end it all, when we call seq.next() the fourth (and last) time here's what happens:

console.log(seq.next());

Done!
{value: undefined, done: true}

Execution resumes from line 9 - right from where it has paused previously.

It continues on, comes across the log statement, makes the respective log "Done!" and then, realising that the function has ended, goes out of the function after doing the following action.

The value property of the object to-be-returned by the seq.next() call is set to undefined and, since the function sequence() has completed, i.e. we are out of yield, done is set to true.

Finally we get the object {value: undefined, done: true} logged.

A simple piece of code, but a massive amount of explanation for it!

This is the beauty of generators - they are simple in syntax, but quite intricate in operation.

Things to note

Now there are a couple of things that we should take note of and learn from the long explanation above:

  • The next() method serves to resume execution inside a generator.
  • The yield keyword serves to pause execution inside a generator.
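To watch this hand-over of control between the caller and the generator, the sketch below records each step into an array. The trace array and the message strings are made up purely for illustration.

```javascript
var trace = [];

function* sequence() {
   trace.push("generator: before yield 1");
   yield 1;
   trace.push("generator: before yield 3");
   yield 3;
   trace.push("generator: finished");
}

var seq = sequence();
trace.push("caller: created iterator");

// Each next() call resumes the generator; each yield pauses it again.
trace.push("caller: got " + seq.next().value);
trace.push("caller: got " + seq.next().value);
trace.push("caller: done is " + seq.next().done);

console.log(trace.join("\n"));
// caller: created iterator
// generator: before yield 1
// caller: got 1
// generator: before yield 3
// caller: got 3
// generator: finished
// caller: done is true
```

Notice how the entries alternate: the generator never runs ahead of the caller, and the caller only receives a value once the generator has reached a yield.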

The yield keyword

Surely, as we've seen up till now, yield gets some extraordinary treatment from the interpreter. Most importantly, it pauses execution and puts the value of the expression following it inside the value property of the object returned by the corresponding next() call.

In the section above, we dissected this keyword very precisely, showing exactly how the execution cycles are paused and resumed inside the generator.

However, the example on which we based all our discussion was a very basic one, not sufficient to clarify the many weird things about yield.

Below we demonstrate a couple more examples before finally moving over to explore another behavior of yield that's closely tied with the next() method as called with an argument.

So let's dive into the examples....

Infinite sequences

Suppose we have a generator defining the sequence of all positive integers - from 1 to infinity. One way to define the generator is by using the for loop, as shown below.

function* positiveInts() {
   for (var i = 1; true; i++) {
      yield i;
   }
}

var seq = positiveInts();

Since the sequence goes up to infinity, there isn't really any boundary at which we must stop; accordingly, the check part of the for loop is set to true. The loop must never end, and so it has a condition that always evaluates to true.

This type of loop is commonly known as an infinite loop.

Now when we call seq.next() for the first time, execution officially begins in the generator; goes up to the for loop; initializes i to 1; performs the check (which is trivially always true); and then jumps to the loop's body.

Here it encounters the yield keyword and therefore evaluates the expression following it, i.e. i, which resolves to 1.

Once evaluated, execution pauses and this number 1 is set as the value of the object returned by the seq.next() call we made above.

console.log(seq.next()); // first time

{value: 1, done: false}

Notice that although the for loop is infinite, yield pauses execution and hands control back to the caller, and so prevents the program from hanging.

The loop only iterates once (in fact, even the first iteration isn't completed) after we call seq.next() - execution runs into the loop and goes only as far as line 3 before, eventually, getting paused and brought out of it.

Regardless, this pausing behavior doesn't have any effect on the loop as a whole. Calling seq.next() on subsequent occasions will resume execution exactly from the point where it paused before, and thereby keep the iteration in normal flow.

For example, if we call seq.next() the second time,

console.log(seq.next()); // second time

execution would resume right from the previously-paused point (marked with a comment below) and go on to complete the first pending iteration.

function* positiveInts() {
   for (var i = 1; true; i++) {
      yield i; // execution resumes right from here
   }
}

Once the first iteration is complete, i++ is executed before putting the loop's body in the execution process once again.

And as before, the first encounter in the loop's body is of the yield keyword; which evaluates the expression following it, sets value of the corresponding next() object equal to it, and finally pauses execution.

Thus we get the following log:

{value: 2, done: false}

And this can go upto infinity!

Well, obviously not quite. The values stop being reliable once the maximum safe integer, represented by Number.MAX_SAFE_INTEGER, is exceeded. Beyond 2^53, i++ can no longer change i, so the sequence would get stuck on the same number long before getting anywhere near Number.MAX_VALUE or Infinity.

You could use BigInt as an alternative, but keep in mind its pitfalls!
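As a rough sketch of that alternative, the snippet below first shows a Number getting stuck just past the safe range, and then a BigInt-based variant of the generator. The name positiveBigInts and its start parameter are assumptions introduced here for illustration only.

```javascript
// Just past the safe range, adding 1 to a Number no longer changes it.
var stuck = Number.MAX_SAFE_INTEGER + 1;
console.log(stuck + 1 === stuck); // true

// positiveBigInts is a hypothetical BigInt variant of the generator above.
// It expects a BigInt to start counting from.
function* positiveBigInts(start) {
   for (var i = start; true; i++) {
      yield i;
   }
}

var bigSeq = positiveBigInts(BigInt(Number.MAX_SAFE_INTEGER));
console.log(bigSeq.next().value); // 9007199254740991n
console.log(bigSeq.next().value); // 9007199254740992n
console.log(bigSeq.next().value); // 9007199254740993n
```

One of the pitfalls to keep in mind is that BigInt and Number values can't be mixed in arithmetic; an expression such as 1n + 1 throws a TypeError.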

Now because this sequence doesn't end anywhere, using the generator object seq in a for...of loop without a break, or with the spread syntax - where an iterable is expected - would ultimately hang or crash the browser.

This is because the for...of loop keeps iterating until the iterator returns an object with done equal to true. Since in the generator example above no such point comes, the internal loop performed by for...of would never end and be the reason for the browser ultimately crashing!
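That said, an infinite generator can still be consumed safely as long as something stops asking for values. The sketch below shows two possible ways: leaving a for...of loop with break, and wrapping the sequence in a helper that ends after a given count. The helper name take is an assumption made for illustration; it relies on the fact that a return statement inside a generator function ends the sequence.

```javascript
function* positiveInts() {
   for (var i = 1; true; i++) {
      yield i;
   }
}

// Option 1: leave the loop explicitly once enough values have been seen.
var firstThree = [];
for (var num of positiveInts()) {
   if (num > 3) break;
   firstThree.push(num);
}
console.log(firstThree); // [1, 2, 3]

// Option 2: take() is a hypothetical helper that wraps any iterable
// and finishes after `count` values, giving a finite sequence.
function* take(iterable, count) {
   if (count <= 0) return;
   var taken = 0;
   for (var item of iterable) {
      yield item;
      if (++taken >= count) return; // ends this generator: done becomes true
   }
}

console.log([...take(positiveInts(), 5)]); // [1, 2, 3, 4, 5]
```

Since take() itself is lazy, the underlying infinite generator is only ever asked for as many values as are actually needed.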

This is advanced JavaScript!

Arguments to next() - yield taking values

Moving on, there is still one extremely important aspect of yield left to be discussed that you must know in order to appreciate what's happening in the following code:

function* gen() {
   yield yield 10;
}

var seq = gen();

seq.next(); // {value: 10, done: false}
seq.next(30); // {value: 30, done: false}
seq.next(); // {value: undefined, done: true}

Let's see what's the deal....

Each time an argument is provided to the next() method, it replaces the whole corresponding yield expression with that argument.

Keep note of the word 'corresponding' here - it tells us that the replacement only occurs if the next() call resumes execution inside the generator from the point a yield had paused it previously.

This means that since the first next() call doesn't resume execution from a previously-yielded point (rather it resumes execution from the start of the generator), it won't perform any replacements.
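Here is a tiny sketch of that rule. The generator echo and the strings passed to next() are made up for illustration.

```javascript
function* echo() {
   var received = yield "ready";
   yield received;
}

var seq = echo();

// No yield is paused yet, so the argument "ignored" has nowhere to go.
var first = seq.next("ignored");
console.log(first.value); // "ready"

// This call resumes from the paused yield "ready", which now evaluates to "hello".
var second = seq.next("hello");
console.log(second.value); // "hello"
```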

Verily, understanding yield means understanding very tedious details!

Consider the following code to get an idea of what we mean:

var x;

function* gen() {
   x = yield 30;
}

var seq = gen();

Before we explain this, try to deduce the sequence defined by the generator function gen() here.

It's simply just one value - 30, which means that the iterator returned by calling gen() will complete on the second next() call, right after the first one has delivered 30.

Anyways, let's now resume on the explanation part...

When we call seq.next() (for the first time), here's what happens (you should be able to tell this on your own by now!):

  1. Execution begins in the generator.
  2. An assignment expression is encountered and thus evaluation is started from the right-hand side of the = sign.
  3. A yield is met and therefore the expression following it is evaluated. (Go above and see what's the expression!)
  4. The result of this evaluation, which is trivially 30, is fed into the next()'s value property and finally execution is paused.

The first seq.next() invocation returns the object shown below:

{value: 30, done: false}

After this, calling seq.next() the second time, with an argument, is what all the fuss is about!

seq.next(10); // second time

As we can judge, this call will return the object {value: undefined, done: true} since we are out of yield; in fact at the end of the gen() function.

However, this judgement isn't the real deal here - the real deal is the value of the variable x which turns out to be 10, after this second seq.next(10) call completes.

How did it become 10? Well it's an interesting story...

The moment seq.next(10) is called, execution resumes from the yield 30 expression in the code below:

function* gen() {
   x = yield 30;
}

Since this part denotes a yield, the entire expression yield 30 is replaced by the value 10, which is consequently assigned to the global variable x.

You can think of it this way:

function* gen() {
   x = 10; // the expression yield 30 has been replaced by 10
}

After the assignment, the function gen() completes and therefore we get the object {value: undefined, done: true} returned by the second seq.next() call.

Thus, after the second seq.next() call, x is equal to 10.

This is the power of yield - it can use the argument passed to the next() method to define the next value in the sequence.
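To get a feel for where this could be useful, consider the sketch below of a running total. The generator runningTotal is a made-up example, not something from the discussion above; each value passed to next() is added to a total, and the updated total is what gets yielded next.

```javascript
// runningTotal is a hypothetical example of feeding values in through next().
function* runningTotal() {
   var total = 0;
   while (true) {
      // Hand out the current total, then wait for the next amount to add.
      var amount = yield total;
      total += amount;
   }
}

var acc = runningTotal();
acc.next();                       // runs up to the first yield; its value is 0
console.log(acc.next(5).value);   // 5
console.log(acc.next(10).value);  // 15
console.log(acc.next(-3).value);  // 12
```

The first next() call takes no argument since, as discussed, there is no paused yield yet that could receive it.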

Isn't this superb?

In legacy implementations, this idea of passing a value to next() was carried out by a method send(). It behaved much like the modern next() method does when given an argument. However, send() isn't part of standard JavaScript and shouldn't be relied upon!

Try solving the following task for a solidification of your understanding of the yield keyword.

Clearly state the objects that will be returned by the first, second, third and fourth seq.next() calls shown below.

function* gen() {
   var a = yield 10;
   var b = yield a + 5;
   yield b;
}

var seq = gen();
// what will each of these return?
seq.next(15);
seq.next(60);
seq.next(32);
seq.next(4);

You must give the answer for each call in the form {value: someValue, done: isDone}.

// {value: 10, done: false}
// {value: 65, done: false}
// {value: 32, done: false}
// {value: undefined, done: true}

The first call seq.next(15) starts execution inside the generator. Since it doesn't resume execution from a previously-yielded point, it obviously can't replace any of the yield keywords in the function's definition above and so the argument 15 is just redundant!

Execution goes up to the first yield, which is the expression yield 10; hence we get the object:

{value: 10, done: false}

After this, the second call seq.next(60) resumes execution from a previously-yielded point and so it has the ability to replace the corresponding yield expression.

The expression yield 10 is replaced by the value 60 and so the variable a ultimately gets set to 60. Going ahead, in line 3, b is declared and then its assignment expression is evaluated. Here a yield is met, which is yield a + 5.

a is 60, so a + 5 is 65 which means that yield a + 5 resolves down to yield 65; therefore we get the object:

{value: 65, done: false}

The third call seq.next(32) resumes execution right from this yielded point and thus replaces the whole expression yield a + 5 with the value 32. This leads to b being set to 32. Going ahead, a yield is met in line 4 which is yield b.

This expression resolves down to yield 32 and therefore we get the object:

{value: 32, done: false}

The last call seq.next(4) resumes execution from this yielded point, replacing the entire expression yield b with the value 4. However, because this last statement has no side-effects, it's ignored and so the generator function eventually reaches its end.

Therefore we get the object:

{value: undefined, done: true}
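If you'd like to double-check the reasoning above, one way is to simply run the calls and collect what they return, as in this short sketch (the results array is just a convenience added here):

```javascript
function* gen() {
   var a = yield 10;
   var b = yield a + 5;
   yield b;
}

var seq = gen();

// Make the four calls in order and keep each returned object.
var results = [15, 60, 32, 4].map(function(arg) {
   return seq.next(arg);
});

console.log(results);
// [
//   {value: 10, done: false},
//   {value: 65, done: false},
//   {value: 32, done: false},
//   {value: undefined, done: true}
// ]
```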