Course: JavaScript

Progress (0%)

JavaScript Arrays - Sparse Arrays

Chapter 30 31 mins

Learning outcomes:

  1. What are sparse arrays and dense arrays
  2. Five ways of creating sparse arrays
  3. The confusing length property
  4. Accessing an empty slot of a sparse array
  5. Checking for an empty slot
  6. Internals of sparse arrays

Introduction

JavaScript is full of suprises every now and then. The more of these surprises that we know, the better we can stand as a JavaScript developer, capable of debugging a given program in a short span of time.

Anyways, the surprise that we're here to discover arises when working with arrays. It relates to the idea of sparse arrays.

If we don't know of sparse arrays and when exactly do they surprise us in given programs, we can find ourselves in difficult-to-debug situtations. Sparse arrays represent an idea as well as an implementation detail of JavaScript engines.

In this chapter, we shall learn what exactly are sparse arrays and their complements, dense arrays; how do we (unintentionally) create sparse arrays; how sparse arrays work under the hood to optimize memory usage; and much more on this road.

What are sparse arrays?

In the first chapter of this unit, JavaScript Arrays — Basics, we learnt about two variants of the Array() constructor. One, where we provide a single integer argument to Array() is of particular importance for this chapter.

Array(n), where n is an integer, creates an array with n empty slots, sometimes also referred to as 'holes'.

In the following code, we demonstrate this:

var arr = new Array(10);

console.log(arr);
[empty x 10]

Notice the console's output — it clearly isn't something we're used to seeing often. Isn't that so?

empty x 10, what exactly is that?

Well, empty x <some number> is the browser's way (in the case above, Google Chrome) of indicating to the user that a given array has empty slots in it. These slots are completely non-existent; don't make the mistake of thinking that they're undefined.

They make the array what's commonly referred to as sparse:

A sparse array, also known as a holey array, is an array where elements don't exist at contiguous positions; they have empty holes between them.

Making intuition of a sparse array is a little bit difficult from new Array(n); a better way to do so is by creating an array as usual and then deleting an element from it using delete.

A quick explanation of the delete keyword

The delete keyword is used to delete a property from an object in JavaScript. We'll cover it extensively in the JavaScript Objects — Basics chapter.

Its syntax is quite basic:

delete obj[prop]

The keyword is followed by the object whose property we wish to delete, followed by the property's name prop in bracket notation.

We can also use dot notation — delete obj.prop It's just that bracket notation can't be used with array indexes, which are integers. When using dot notation, remember that property names can't begin with digits.

In the case of arrays, we can use delete to delete an element from an array.

delete arr[index]

However, using delete creates empty holes in arrays and doesn't modify its length property (as we shall discover soon in this chapter). Even though it does remove an element, it's not recommended to be used for this reason.

delete is intended for and better suited to objects that have the more conventional notion of key-value pairs, for e.g. { x: 10 }.

Below is an example of using delete to create a sparse array:

var arr = [1, 2, 3];
delete arr[1];

console.log(arr);
[1, empty, 3]

As we call delete arr[1], notice what happens to the array — the index 1 becomes empty. There's absolutely nothing at the index any longer. It's a completely empty slot.

The elements of arr no longer sit at contiguous positions. The first element is at index 0 (as is the norm), but the second one is at index 2. Ideally, the second element should've been at index 1, in that case, the array would've been termed as dense.

A dense array, also known as a packed array, is an array where elements exist at contiguous positions, without any empty holes in between.

When elements are packed together with each other in an array, we have a dense array. And as you can probably guess, dense arrays are the exact opposite of sparse arrays.

In the code above, the array arr we have in the beginning:

var arr = [1, 2, 3];

is a dense array; each subsequent element comes right after the previous one in the array.

But the moment we call delete on this dense array, and consequently introduce a hole into it, it becomes sparse.

Alright, at this point, we know what is a sparse array and how it compares with a dense array. The next logical thing to do is to see how do we get sparse arrays.

Creating sparse arrays

There are five ways of getting sparse arrays in JavaScript:

  • Omitting elements in array literals
  • Calling new Array(n), where n is an integer
  • Removing an element using delete
  • Assigning to an index beyond the array's length
  • Manually increasing the value of the array's length

Omitting elements in array literals

The easiest way of creating a sparse array is to omit elements when typing out an array literal.

This omission happens in the form of having two commas (,) next to each other without any element defined in between; the lack of an element effectively denotes an empty slot.

var arr = [1, , 2];

console.log(arr);
[1, empty, 2]

Calling new Array(n)

We've already seen this in the previous chapter and also in the section above. Let's review it once more.

When new Array(n) is called, where n represents an integer (more specifically, a non-negative integer), we get a sparse array with n holes in it.

Here's a quick example:

var arr = new Array(10);

console.log(arr);
[empty x 10]

Perhaps the reason why JavaScript treats such a call to produce a sparse array is for memory usage optimization — calling new Array() with a very large integer, e.g. new Array(5000000), shouldn't unnecessarily allocate space for that many elements.

Removing an element using delete

The reason why it's recommended to avoid using delete for removing array elements is because of its production of a sparse array.

That is, deleting an array's element using delete introduced an empty slot at the corresponding index and thus transitions the array to a sparse array.

Following is a quick example:

var arr = [1, 2, 3];
delete arr[1];

console.log(arr);
[1, empty, 3]

Note that even removing the last element using delete would end up with the same effect, making it an empty hole as well:

var arr = [1, 2, 3];
delete arr[2]; // Removing the last element

console.log(arr);
[1, 2, empty]

Assigning to an index beyond the array's length

As stated in JavaScript Arrays — Basics, JavaScript allows us to assign a value to any arbitrary index in an array.

Personally, I feel that this shouldn't be allowed in a language — it seems a very sensible decision to prevent assigning to an index beyond the last index — but at least it is in JavaScript, and so we must mould to it.

Now, when we assign a value to an index that equals the length of the array, we're good to go — there's no need to worry about transitioning to the sparse arena:

var arr = [1, 2, 3];
arr[arr.length] = 10; // No need to worry at this

However, when we assign a value beyond the length of the array, we should be wary of the fact that this would take us into the sparse arena:

var arr = [1, 2, 3];
arr[5] = 10;

console.log(arr);
[1, 2, 3, empty x 2, 10]

Frankly speaking, there isn't much utility, if at all, of such a kind of code.

Why would anyone want to assign a value to an arbitrary index in an array beyond its length? There should be, and always is, a better way to deal with the given problem at hand than assigning to an arbitrary array index.

Increasing the value of the array's length property

The length property of a JavaScript array isn't read-only; we can assign an integer to it to shrink or expand the underlying array to the given length.

When we decrease the value of length, the array is reduced to that length, with extra elements removed from the end:

var arr = [1, 2, 3];
arr.length = 1;

console.log(arr);
[1]

However, when we increase length, of course the array is expanded to that length, but most importantly, this expansion introduces empty holes into the array and consequently makes it sparse:

var arr = [1, 2, 3];
arr.length = 4;

console.log(arr);
[1, 2, 3, empty]

Once again, there isn't much utility to manually increasing the value of an array's length property. We're better to stick to using conventional approaches — push() and splice() — to increase an array's length, and that when we have data to add to the array.

The confusing length property

By this point in this course, we are quite comfortable with a thorough understanding of the length property of arrays in JavaScript. Or are we? Turns out, we aren't!

We know that the length property returns the total number of elements in an array. That's quite far from the truth the moment we try to apply it to sparse arrays.

Consider the following:

var arr = [1, 2, 3];
delete arr[2];

console.log(arr);
console.log(arr.length); // We expect the length to be 2
[1, 2, empty] 3

After calling delete arr[2], we expect the length of arr to reduce down from 3 to 2 because we've removed an element, and most importantly the last element. However, arr.length stays the same — 3.

This is because when we introduce empty holes into an array by removing stuff, the length property doesn't change.

However, when we introduce holes by increasing the length of the array, or assigning to a far-fetched index, length changes to account for all those empty slots:

var arr = [1, 2, 3];
arr[10] = 100;

console.log(arr);
console.log(arr.length);
[1, 2, empty x 8, 100] 11

In the code above, even though we've added just one element in arr[10] = 100, since it introduces empty slots into the array, the length goes up to 11 to account for all these empty spaces plus the newly added item, 100.

JavaScript might be easy but it's not at all easy. (The irony says it all.)

A characteristic common to all sparse arrays in JavaScript is that they have less elements than their length.

Honestly, there is no pitch-precise definition of the length property of arrays.

Some resources claim that it's one greater than the last index of an array but in the case of new Array(n), where there is just no 'last index', this doesn't hold.

However, it's perfectly safe for us to continue with the same old definition that the length property holds the length of an array, i.e. the total number of elements in it.

This is because sparse arrays are the only place where this definition doesn't hold and it's generally not recommended to work with sparse arrays (for reasons which we'll cover later on in this chapter).

Accessing an empty slot

It's all great to know that we can introduce holes into arrays in JavaScript, that there are five ways of doing so, and this and that.

But what exactly do we get when we try to access any of these empty slots?

Well, JavaScript returns undefined when we access an empty slot in an array.

Consider the following demonstration:

var arr = new Array(10);

console.log(arr[0]);
console.log(arr[1]);
console.log(arr[9]);
undefined undefined undefined

We access the first, second, and tenth indexes; each one being an empty slot ends up returning undefined.

It's superbly important to note at this point that although accessing an empty slot returns undefined, undefined does NOT actually exist at the given position; an empty slot is, well, empty.

Ideally, JavaScript should throw an error when accessing an empty slot. In fact, it shouldn't even allow us to introduce empty slots in an array, in the first place.

This means that if for some reason we need to determine whether a given index represents an empty slot or not, relying on the returned value of accessing that index is useless — we'll get undefined if the slot is empty or defined with an explicit undefined value:

Consider the following code:

var arr = [1, undefined, , 2];

The second index of the array is explicitly tied to undefined whereas the third index represents an empty slot. When we obtain either of these indexes, we get the same value, undefined, in return:

arr[1]
undefined
arr[2]
undefined

So does this mean that there is absolutely no way to be able to determine whether an index represents an empty slot in JavaScript or not?

No. We can very easily check whether an index corresponds to an empty slot in an array or to an actual element. This requires us to either use the in operator or the hasOwnProperty() method.

Checking for an empty slot

If a given index of an array corresponds to an actual value, be that undefined, we can easily figure that out with the help of the in operator or the hasOwnProperty() method.

Here's how to use either of these:

The in operator

The in operator, which we'll discuss in way more depth in the upcoming JavaScript Objects unit, allows us to check whether a particular property is accessible on a given object.

In this case, because an array index is merely a property mapping to a value, we can determine if an index actually exists on an array by leveraging the in operator on the index.

Let's see an example.

In the following snippet, we have the same array as above, with an explicit undefined value in it, followed by an empty slot:

var arr = [1, undefined, , 2];

The goal is to be able to distinguish between both these. Fortunately, in allows us to do so very easily:

var arr = [1, undefined, , 2];

console.log(1 in arr);
console.log(2 in arr);
true false

1 in arr returns true since the index 1 (i.e. the property 1) is accessible on the array. In contrary to this, 2 in arr returns false since the index 2 doesn't exist on arr; recall that the index 2 denotes an empty slot.

And in this way, we can easily determine whether or not a given index represents an empty slot in an array.

Array indexes are treated as strings

Keep in mind that all properties in JavaScript are treated as strings, and that array indexes are properties at the end of the day.

So in the code above, even though we've written 1 in arr, JavaScript treats it as '1' in arr. There is nothing such as integer properties in JavaScript; all properties are strings.

To emphasize on this very nature of JavaScript, we can rewrite the code above as follows:

var arr = [1, undefined, , 2];

console.log('1' in arr);
console.log('2' in arr);

The result will be the same:

true false

As we shall find out later in this course, in precisely checks for the given property on the given object and then on all of its prototypes until it finds a match.

For confirming that an array index doesn't represent an empty slot, we can well use in but it might be too much; we don't really need to go over the prototype of the array for the check. A better way is to use the hasOwnProperty() method.

The hasOwnProperty() method

As the name suggests, the hasOwnProperty() method allows us to determine whether an arbitrary object owns a given property.

Owning a given property means that it exists directly on the object itself (and not necessarily on its prototype).

For instance, the object { x: 10 } owns the property x since it is defined directly on it. However, it doesn't own the property hasOwnProperty because it gets this one from its prototype, Object.prototype.

Don't worry if you're getting confused about prototypes; we'll cover them in detail in the next unit of this course, particularly in JavaScript Objects — Prototypes.

Since hasOwnProperty() only looks for own properties, when we use it on an array index, it'll only look for the index (the property) on the array itself; no probing on the prototype of the array.

Shown below is an example similar to the one we saw earlier:

var arr = [1, undefined, , 2];

console.log(arr.hasOwnProperty(1));
console.log(arr.hasOwnProperty(2));
true false

First we get true because arr[1] exists, and then we get false because arr[2] doesn't exist.

As stated earlier, since JavaScript treats array indexes as properties, and all properties as strings, arr.hasOwnProperty(1) becomes arr.hasOwnProperty('1').

opt Internals of sparse arrays

In this section, we shall get a little deeper into the internals of sparse (and dense) arrays in JavaScript, discussing what exactly happens under the hood when an array is sparse and when it's dense.

As always, we'll try to keep our exploration generic and limited to basic ideas, keeping from delving too much into the implementation details of a particular JavaScript engine.

Arrays are typically implemented in programming languages as a sequence of contiguous blocks in memory. For example, if we have an array of three numbers, we'll have three blocks of memory literally next to each other.

The benefit of this approach is that accessing a particular index becomes a constant-time operation.

How? Very interesting: the array itself resolves to the address of its first element's memory block; thereafter, whichever index we access, it's added to this memory block's address to get to the memory block of the desired element.

This works because memory addresses in computers are merely integers.

This is a highly abstract overview of how arrays work in programming languages, for a very detailed description is out of the scope of this chapter.

When arrays are dense in JavaScript, engines often resort to such an implementation for them. That is, a dense array in JavaScript is implemented internally as an actual array in memory.

But when it becomes sparse, engines tend to shift to a completely different implementation, that of a hash table.

The high-level overview of a hash table is that it's a map of key-value pairs. Each key is converted into a hash which is then used to get to its corresponding value in memory.

A hash table doesn't have to be stored at contiguous locations in memory unlike an actual array.

Immediately by reading this, you might have thought about objects in JavaScript in that they're also maps of key-value pairs. Turns out that objects in JavaScript are also implemented internally as hash tables.

Anyways, coming back to the topic, a sparse array is made into a hash table because otherwise representing a sparse array as an actual array in memory would be quite wasteful, especially if the length of the array is huge.

But while a sparse array might save on some memory, it doesn't come without a cost.

Accessing an index on a sparse array internally boils down to accessing a key on a hash table. And unlike accessing an index on an actual array in memory, this isn't a constant-time operation; there's hashing to be done in addition to some overhead probing before we can get to the key's corresponding value.

This means that accessing elements on a sparse array is slower as compared to doing the same thing on a dense array.

All in all, this trek of the internals of JavaScript can be summarized as follows:

  • Dense arrays is the norm in JavaScript and rightly so; they are implemented as actual arrays in memory which are quite fast and efficient to work with.
  • We should almost always avoid creating sparse arrays. If we have very large arrays and find ourselves querying elements every now and then, using sparse arrays can lead to poor performance.

Further reading

Following we've gathered a couple of resources for you to learn more about sparse arrays if you're curious to get to know more about them: