Node.js Introduction

Chapter 1 21 mins

Learning outcomes:

  1. The problem that Node.js solves
  2. What exactly is Node.js
  3. The architecture of Node.js
  4. Features of Node.js

Many concurrent connections...

The inception of the World Wide Web in 1989 paved the way for computer scientists to eventually experience a whole new world full of novel computational problems to solve.

With time, languages meant exclusively for the server side of the web were created, including PHP and Perl, and so were technologies catering to the client side, including JavaScript.

Slowly and gradually, as people started to realize the immense potential stored in the web, much like the energy stored in a small uranium granule, web technology evolved and web application complexity rose.

Sooner than anyone could have imagined, web applications were paralleling the rigor and power of desktop applications. Web development quickly became a full-blown profession, just like game development. Languages were being built, frameworks were being churned out, libraries were being contributed to by people from all over the world, and hardware was improving. In short, it was an era of staggering technological advancement.

Talking specifically about server-side technologies, we had Java, Perl (with Apache) and PHP (with Apache) amongst the most popular options to get a server up and running.

Initially, because no one expected the web to be used at such an astoundingly high frequency in the years to come, these technologies were designed to meet application needs that differed from what would soon become a mainstream demand.

These technologies became good at dealing with databases. They became good at executing computationally expensive routines. They became good at performing templating for serving HTML dynamically. They became good at serving dynamic media, such as video. They became good at many things.

But of course, they weren't excelling in all departments.

We're talking about scalability here, and scalability specifically in terms of network I/O (input/output).

Scalability of a computer program refers to its ability to be shifted from being used a relatively small number of times to a huge number of times without adjusting a lot of software or hardware.

Scalability 'in terms of network I/O' refers to scalability in how many clients can connect simultaneously to an end server over the network.

As an example to better understand this argument, let's say you create a social media app today and develop it with at most a hundred people logging in simultaneously in mind.

Great. Building something as complex as a social media app is still going to be difficult, even if you keep the maximum number of concurrent connections very low.

But imagine that in the span of less than a week following its deployment, your social media app becomes viral and starts to garner a million hits every second. Just imagine that.

What would happen?

Well, in the worst case, your app would crash and you'd eventually lose a lot of your precious users as they see 503 errors or 'Connection timed out' errors on their screens.

Your existing tech stack, consisting of Java or Apache (paired with PHP or another scripting language), would only add to the problem rather than solve it. It would demand enormous hardware resources from your server setup to manage all these million users at the same time, efficiently.

In simple words, it would force you to think about upgrading your hardware.

But before upgrading the server setup, one might ask whether the current hardware resources were being used efficiently by the software in the first place. After all, you can't put your money on software to address a problem it was never designed to handle.

Isn't that so?

Coming back to the million-user nightmare of your social media app (although, it's wondrous that your app became viral!), if you used a technology that efficiently utilized hardware resources to concurrently manage a million users logging in at the same time, you might've never needed to worry that much about scaling up your server hardware.

The problem is that many server-side technologies of the past — including the behemoths Java and Apache — were NOT designed to solve such high-concurrent-connections problems from the ground up efficiently in terms of consuming computational resources.

They are reliable only to a certain extent, beyond which they start to 'eat up' computational resources.

So is there anything that was built to handle such high-concurrent-connections problems efficiently? Is there anything that thinks outside the box?

Yeah! Enter Node.js.

What is Node.js?

It's highly likely that you've already heard the name Node.js. And it's even more likely that you're confused about what exactly it is.

Is Node.js a language? Is Node.js a framework? Is Node.js a library? Is Node.js a flavor of pizza?

Well, let's accept it: we've all asked ourselves one of these questions at some point. And now, it's time to answer them. (Before anything, Node.js certainly isn't a flavor of pizza!)

We'll begin with formally defining Node.js and then explain what the definition means:

Node.js is a JavaScript runtime environment, meant to execute JavaScript outside the browser.

There are two main components to this definition: 'JavaScript' and 'runtime environment'. We already know about JavaScript so what's left to explore is a runtime environment.

The following snippet expands upon this.

What is a runtime environment?

As the name suggests, a runtime environment is a computing environment where some kind of a computer program is run, alongside some external APIs.

Node.js is a runtime environment in that it's a computer program that is capable of running another computer program in it. This computer program that could be run in the Node.js runtime is written in the scripting language JavaScript.

Thus we say that Node.js is a JavaScript runtime environment.

Browsers also have their own runtimes. For example, Mozilla Firefox runs JavaScript using its engine, SpiderMonkey, which parses and executes the code, alongside the APIs that the browser itself provides.

So Node.js is NOT a language — the language is JavaScript, the same one we use inside browsers (which, as we just learnt, have their own JavaScript runtime environments).

Similarly, Node.js is NOT a JavaScript framework. As we shall soon learn, Express.js is a JavaScript framework powered by the Node.js runtime.

And needless to say, Node.js is NOT a library either.

To restate, Node.js is a runtime environment that is meant to run JavaScript programs outside of the browser.
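To make this a bit more concrete, here's a minimal sketch of a script that is run by the Node.js runtime instead of a browser. The file name hello.js is hypothetical. The script relies on `process`, a global object provided by Node.js, and checks for `window`, a global that only exists in browsers.

```javascript
// hello.js (hypothetical file name); run it from a terminal with: node hello.js

// `process` is a global object supplied by the Node.js runtime.
const nodeVersion = process.version; // a string such as 'v20.x.x', depending on the installation

// Browser-only globals such as `window` are not defined in Node.js.
const hasWindow = typeof window !== 'undefined';

console.log(`Running on Node.js ${nodeVersion}`);
console.log(`Is there a browser window? ${hasWindow}`);
```

The language is the same JavaScript we write for the browser; only the surrounding environment (and the APIs it exposes) is different.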

Node.js was created by Ryan Dahl and first released to the public after a small conference talk by Ryan himself in 2009.

This talk, although quite old by now, is still worth checking out. In the talk, Ryan emphasizes the key motivations that eventually led to the birth of Node.js and the way it has been designed from the ground up to efficiently solve problems that technologies of the past couldn't.

Let's now see how Node.js has been designed to address the problems that we discussed at the start of this chapter. It's time to explore the architecture of Node.js.

The architecture of Node.js

Node.js is an asynchronous, event-driven JavaScript runtime environment.

Two crucial words here are 'asynchronous' and 'event-driven'. Following the very design model of JavaScript, which is asynchronous, Node.js embraces an asynchronous and event-driven execution model from the get-go.

Unlike technologies such as Java or PHP, JavaScript in the browser runs on a single thread, with I/O operations (such as XHR requests) handled asynchronously without blocking the main thread. Node.js doesn't deviate from this execution model of browser JavaScript in any way.

Single, main thread

That is, Node.js follows the very way JavaScript runs in a web browser: it has a single, main thread where all the execution happens, with I/O and other time-consuming operations handed off to be carried out in the background. Once they complete, a callback is invoked to signal their completion.

This fact, although understood by every (browser) JavaScript developer, is so important that it's worth restating.

Node.js executes JavaScript code on a single thread, offloading I/O operations and other time-consuming tasks to the operating system or to background threads, and then executing callbacks (and event handlers, which are also callbacks) when those operations reach completion.
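The following is a minimal sketch of this model, assuming nothing beyond Node's built-in `fs`, `os` and `path` modules. The file name node-intro-demo.txt and the `events` array are made up for illustration. The script starts reading a file and then carries on; the callback runs only later, once the read has finished.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Prepare a small file to read (the file name is hypothetical).
const demoFile = path.join(os.tmpdir(), 'node-intro-demo.txt');
fs.writeFileSync(demoFile, 'Hello from a file!');

// Records the order in which things happen.
const events = [];

// Start reading the file. Node.js hands the work off and moves on immediately.
fs.readFile(demoFile, 'utf8', (err, contents) => {
  // This callback runs later, on the main thread, once the read has finished.
  if (err) throw err;
  events.push('file read complete');
  console.log(`File says: ${contents}`);
});

// This runs before the callback above, because the main thread never waited.
events.push('main script finished');
console.log('Main script finished');
```

'Main script finished' is logged first, followed by the file's contents, even though the call to `fs.readFile()` appears earlier in the code.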

Now as soon as veteran software developers coming from Java and C++ backgrounds hear this, they might be taken aback for a while.

A single thread? They'd argue that this is a bad decision. What if there is a piece of code that takes a lot of time (like seriously, a lot of time) to execute?

Well, in that case, Node.js would be blocked, incapable of executing anything.

That's right — anything!
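We can see this blocking behavior in a small sketch. The 10 ms and 200 ms figures are arbitrary values chosen for illustration, and the busy loop merely stands in for a long computation. A timer is scheduled to fire after about 10 ms, but it can't fire until the loop releases the main thread.

```javascript
const start = Date.now();
let timerDelay = null;

// Schedule a callback that should ideally fire after about 10 ms.
setTimeout(() => {
  timerDelay = Date.now() - start;
  console.log(`Timer fired after ${timerDelay} ms`);
}, 10);

// A deliberately slow, synchronous loop that occupies the main thread for about 200 ms.
while (Date.now() - start < 200) {
  // Busy-wait: no callback of any kind can run during this time.
}

console.log('Long computation finished');
```

The timer's callback only gets to run once the loop has ended, so the reported delay is at least 200 ms instead of the requested 10 ms.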

But isn't that bad design? Hmm. You can't expect a piece of software to shine in every single aspect. There is a cost to everything. Node.js excels at giving us a high throughput for I/O-heavy work, and this comes at the cost of raw computational speed for CPU-heavy work.

Where Java might excel in performing a Fibonacci number calculation when compared to JavaScript, it won't be able to efficiently allocate resources to a million simultaneously connecting users of our social media app.

Coming back to our original discussion, because Node.js utilizes a single, main thread for executing functions, this reminds us of another important, related concept: the event loop.

Event loop

Internally, the Node.js runtime also includes an event loop that orchestrates all execution therein.

It manages the main thread, the call stack, and other important parts necessary for bringing the asynchronous model of JavaScript to life. More details of the event loop, albeit from the perspective of the browser, can be seen in the chapter Advanced JavaScript — Event Loop.

Node.js uses the C library libuv (commonly pronounced as 'lib-you-vee') internally for its event loop.
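As a rough illustration of the event loop at work, consider the following sketch. The `order` array is a made-up helper for recording what runs when. Synchronous code runs to completion first; only then does the event loop get to run the queued promise callback, followed by the timer callback.

```javascript
// Records the order in which the pieces of code run.
const order = [];

// A timer callback, queued to run on a later turn of the event loop.
setTimeout(() => order.push('timeout callback'), 0);

// A promise callback, queued to run right after the current synchronous code ends.
Promise.resolve().then(() => order.push('promise callback'));

// Plain synchronous code runs straight away.
order.push('synchronous code');

// Log the recorded order a little later, once everything above has run.
setTimeout(() => console.log(order), 10);
// prints [ 'synchronous code', 'promise callback', 'timeout callback' ]
```

Even with a delay of 0, the timer callback has to wait for its turn: the currently running code always finishes first.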

The V8 engine

Moving on, an extremely important thing to note regarding Node.js is that this runtime environment is powered by the blazing-fast V8 engine, developed by Google.

V8 is a JavaScript engine that parses and executes JavaScript code. It is developed by Google and is the same engine that runs behind the scenes in the world's most popular browser, Google Chrome.

In addition to this, V8 is open-source, and so is Node.js.

Node.js takes V8 out of the bounds of Google Chrome and into an open system where developers from all over the world can contribute to and build an ecosystem of JavaScript utilities. The success of Node.js can, in part, be attributed to the speed and robustness of V8.

The JIT-compiled nature of V8 makes running JavaScript on the server via Node.js quite practical. In fact, for many workloads, a web server written in JavaScript on Node.js can run noticeably faster than a comparable server written in Python (on its own runtime environment) or in PHP (on Apache), although the actual difference depends on the workload.

The speed still can't match that of C/C++!

Still, the speed of a server coded raw in C or C++ would typically be much higher than that of one spun up using Node.js.

However, the added complexity of creating a server in C/C++ usually outweighs its speed advantage, and for that reason, very little complex web app development happens in C/C++.

Numerous frameworks, languages and other technologies have been written over the years to get up and working with a server in a nice, wrapped way, abstracting away the intricate complexity of C/C++ programming.

Features of Node.js

Open-source

As with much software nowadays, Node.js is an instance of OSS (open-source software).

Node.js is an open-source JavaScript runtime environment. There is a huge community of active developers from all over the world, contributing to the Node repository on GitHub and improving it day by day.

That's the beauty of open-source software: everyone takes part in building a piece of software, in the best possible way.

Being open-source also means that Node is free to use for anyone.

The engine that is leveraged in Node.js to execute JavaScript, V8, is also open-source.

Blazingly fast

Running on the behemoth V8 engine, Node.js naturally enjoys blazingly fast execution speeds for JavaScript programs.

This means that Node.js is more than just a practical solution for building servers of almost all kinds, ranging from ones that need a high throughput (for chat applications, multiplayer games, etc.) to ones that need some considerable computational power (such as for manipulating images or running numerical algorithms).

While it's true that there are strong contenders to Node.js when it comes to execution speed for resource-intensive computations (such as cryptographic hashing), the ability to leverage C++ addons in Node.js offers a way to largely resolve performance issues where they arise.

Built-in package management

Soon after the release of Node.js, a system for managing third-party modules natively in it came into existence. It was named npm, a name that is widely read as 'Node Package Manager'.

Surprising as it may sound, npm is officially NOT an acronym for Node Package Manager. Likewise, make sure NOT to refer to it as NPM!

npm is Node's native package management system, where we can install third-party packages for common tasks, such as creating HTTP servers or bundling JavaScript code, and where we can even publish our own packages.

We'll discover npm in detail in a later chapter in this unit.

Cross-platform

There's nothing more depressing than knowing that our favorite software isn't available on another operating system that we use. Think Adobe Illustrator not available on Linux. Or maybe, Redis not available (natively) on Windows.

Fortunately, thanks to years of hard work by the active open-source community, Node.js works on almost all major operating systems. The most notable ones are Linux, Windows, and macOS.

Being cross-platform means that Node applications can easily be ported across a multitude of operating systems with little, if any, difference in behavior.

Software portability is a luxury for a developer and thankfully, Node.js allows developers to enjoy it from the very beginning.

Create a Node program for Windows, run it on Linux, update it on your Mac, and then maybe run it on your university computer running BSD. Quite amazing, isn't it?
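Here's a small sketch of what this portability can look like in code, using Node's built-in `os` and `path` modules. The directory and file names are hypothetical. The same script runs unchanged on every supported operating system, with `path.join()` picking the right path separator for us.

```javascript
const os = require('node:os');
const path = require('node:path');

// Identify the operating system, for example 'linux', 'win32' or 'darwin'.
const platform = os.platform();

// path.join() uses the separator of the current platform:
// '/' on Linux and macOS, '\' on Windows.
const filePath = path.join('data', 'users', 'profile.json');

console.log(`Platform: ${platform}`);
console.log(`File path: ${filePath}`);
```

Building paths this way, instead of hard-coding '/' or '\', is one of the habits that helps keep a Node program portable.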

"It's all JavaScript"

If we had to name one thing as, by far, the shiniest feature of Node.js, it would be that it allows developers to write JavaScript on the server.

Did you hear that? Yes, JavaScript.

Back in the day, if a frontend developer wanted to go full-stack, trying to be partly a frontend dev and partly a backend dev, this meant that there was a need to learn a complex server-side language such as Java or Perl to be able to engage in backend development.

Later on, server-side languages did become somewhat simpler but still the learning curve was much steeper than that for people working on the frontend.

Add to this the fact that there was a constant need to switch contexts (writing dynamically-typed JavaScript for the browser at one point, then shifting to writing statically-typed Java for the server at another), and becoming a full-stack developer involved a considerable amount of friction.

If you've ever tried switching between different languages, you'd be able to relate to this. While not difficult, it's not that easy either. Pair this notion with the fact that a beginner has an even harder time understanding the syntax of one language, let alone that of two.

Ryan Dahl, the creator of Node.js, thought that if every web developer has some clue of JavaScript, why not take JavaScript into the server as well, and thus keep people from having to learn another language specifically for the backend (at least to begin with).

And that's exactly what he did.

He took the open-source V8 engine, added a couple of bindings to it that made sense for a web server environment (such as reading/writing files, talking over HTTP, etc.) in contrast to a browser environment, and voilà! Node.js was born.
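As a taste of these server-side bindings, here's a minimal sketch of a web server built on Node's `http` module. The response text is made up for illustration, and port 0 is an assumption of this sketch: it simply asks the operating system for any free port, whereas a real application would typically listen on a fixed port.

```javascript
const http = require('node:http');

// Create a server; the callback runs once for every incoming request.
const server = http.createServer((req, res) => {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('Hello from Node.js\n');
});

// Port 0 lets the operating system pick a free port for this sketch.
server.listen(0, () => {
  const { port } = server.address();
  console.log(`Server listening on http://localhost:${port}`);
});
```

Each request is handled by the same callback on the single main thread, without a new thread being spun up per connection.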

What this meant was that as soon as a beginner got done with learning JavaScript, the third important technology on their way into the world of web development after HTML and CSS, they could start writing web servers in less than a day or so!

And that's jaw-droppingly amazing!

Clearly, there is still a learning curve for the different APIs in Node.js that are exclusively meant for the server, but at least the language's syntax and semantics are already in our minds, saving us some considerable amount of time in getting started.

Running on JavaScript also means that JavaScript utilities written for the browser can be re-used in Node, provided, of course, that the features used in those utilities are available on both the browser and the server (for example, it doesn't make sense to re-use DOM utilities on Node simply because there is no DOM on the server).
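To illustrate this kind of re-use, here's a sketch of a small utility that depends on neither the DOM nor any Node-specific API. The function `slugify()` is a hypothetical example, not part of Node.js or of any particular library.

```javascript
// A pure function with no dependency on the DOM or on Node-specific APIs,
// so the same source can run in a browser and in Node.js.
function slugify(title) {
  return title
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, '-') // turn each run of other characters into a single '-'
    .replace(/^-+|-+$/g, '');    // strip leading and trailing dashes
}

// Make the function importable in Node.js (CommonJS) when `module` is available.
if (typeof module !== 'undefined' && module.exports) {
  module.exports = { slugify };
}

console.log(slugify('Node.js Introduction')); // prints "node-js-introduction"
```

Since the function only uses string methods that are part of the JavaScript language itself, it behaves the same way in both environments.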

Node.js isn't a pioneer of server-side JavaScript!

Contrary to what one might assume, Node.js isn't the pioneer of taking JavaScript into the server.

This happened in 1996, almost 13 years before Node's inception, at the hands of the same company that invented JavaScript: Netscape. Netscape released LiveWire, a web server running on JavaScript, the year after launching JavaScript in their Netscape Navigator 2.0 browser.

However, the project never gained enough traction to survive to this day, and was abandoned in less than a year or so.

This was perhaps because JavaScript was still a nascent technology that many people hadn't come to appreciate, and also because there weren't any robust JavaScript engines back then to run it on such a complex system as a web server. JIT-compiled JavaScript might not even have been imagined back then!