A quick recap
Let's start by quickly reviewing all the concepts related to strings that we've learnt thus far in this course.
First of all, here's what a string is:
In a program, a string is merely a piece of textual data, typically denoted in code using a pair of single quotes ('') or a pair of double quotes ("") to distinguish it from actual code.
Here are some examples of strings assigned to variables in JavaScript:
var greeting1 = 'Hello World!';
var greeting2 = "Hello World!";
var saying = 'He said: "I love you!"';
var quote = "It's the last day to submit the project.";
The first two strings hold the exact same text, just denoted in different ways. The third string is denoted using single quotes ('') and also contains a double quote (") in the text. Similarly, the fourth string is denoted using double quotes ("") and contains a single quote (').
But what if we want both single (') and double (") quotes in the text? We'll come to this question shortly in the next section.
Each character in a string has a given position, known as its index. Indexes begin at 0 and increment by 1. Therefore, the first character of a string has index 0, the second has index 1, the third has index 2, and so on and so forth.
The total number of characters in a string is referred to as its length.
The length of a string can easily be obtained by accessing its length property. Shown below are some examples:
''.length // 0
'Hello'.length // 5
' 1 1'.length // 4
var str = 'Hello World!'
str.length // 12
The length of the second last string (' 1 1') is 4. This is because spaces are also valid textual characters, and are therefore counted in the length of a given string.
Accessing a given character is also a very common operation. It's done via bracket notation whose general form is string[index], where string is the string value and index is the index of the character that we wish to access.
As before, here are a couple of examples:
var str = 'Hello World!'
str[0] // 'H'
str[1] // 'e'
str[3] // 'l'
str[8] // 'r'
str[11] // '!'
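Since indexes begin at 0, the last character of a string always sits at index length - 1. The following is a small sketch of this relationship; the variable names first, last and outOfRange are made up for illustration only.

```javascript
// Indexes run from 0 up to length - 1.
var str = 'Hello World!';

var first = str[0];               // character at the first index
var last = str[str.length - 1];   // character at the last valid index
var outOfRange = str[str.length]; // no character exists at this index

console.log(first);      // H
console.log(last);       // !
console.log(outOfRange); // undefined
```

Accessing an index that doesn't exist doesn't throw an error; it simply gives undefined.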
To put a new line in a string, we use the special character sequence \n. For instance, if we want to make an alert whose message is divided into two lines, we can do something as follows:
alert('This is an:\nAlert!');
Simple.
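One detail worth checking in the console is that \n, although typed as two characters in the source code, counts as a single character in the string. Here's a minimal sketch, using a variable named message purely for illustration:

```javascript
var message = 'This is an:\nAlert!';

// 'This is an:' has 11 characters and 'Alert!' has 6.
// The \n sequence adds just 1 character, not 2.
console.log(message.length);        // 18
console.log(message[11] === '\n');  // true
console.log('\n'.length);           // 1
```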
So this is it for our very quick recap of strings in JavaScript. Now that all this knowledge is fresh in our minds, it's time to unravel the deeper secrets to strings in the language.
Escaping characters
Let's come back to the problem of placing both single ('
) and double ("
) quotes inside a string.
As we know, if we create such a string using either a pair of single quotes ('') or a pair of double quotes (""), the matching quote character inside the string conflicts with the one used to denote the string. The engine treats it as the end of the string, interprets the text that follows as actual code, and reports an error.
There are essentially two ways to solve this problem in JavaScript:
- Escape the conflicting quote character
- Use a template literal
In this section, we focus on the former — escaping the conflicting quote character.
So what is meant by 'escaping'?
Escaping a character means preceding it with a backslash (\) so that the parsing engine doesn't treat the character literally.
We are already familiar with escaping a character in a string using the character sequence \n, back from the JavaScript Basics chapter.
Here the backslash character is placed before the character n. The consequence is that the parsing engine doesn't treat n as a literal n but rather as a newline character.
Similarly, we can precede both the single (') and double (") quote characters with a backslash, i.e. \' and \" respectively, to signal to the engine that they aren't meant to denote the end of the string.
Consider the following example:
var s = 'He said: "It\'s impossible."';
console.log(s);
Here we denote the string using single quotes (''), and then use both single and double quotes inside the string. For the double quotes, we don't use any backslashes since they don't conflict with the single quote used to denote the string.
However, for the single quote, we precede it with a backslash (\). This escapes the following ' character so that it is not interpreted as the end of the string.
Easy?
Let's try to denote this same string using double quotes (""):
var s = "He said: \"It's impossible.\"";
console.log(s);
Obviously, now the single quote inside the string remains unescaped while the conflicting double quotes are escaped as \".
Simple, as before.
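To confirm that both notations produce the very same text, we can compare the two strings directly. The sketch below uses the variable names single and double just for illustration; note also that the backslash itself is not part of the final string.

```javascript
var single = 'He said: "It\'s impossible."';
var double = "He said: \"It's impossible.\"";

console.log(single);            // He said: "It's impossible."
console.log(single === double); // true

// The backslash only guides the parser; it isn't stored in the string.
console.log('It\'s'.length);    // 4
```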
Now as you may agree, although backslashes solve such character-conflict problems when denoting strings, they do complicate string expressions. The backslash characters inside the string's text seem like nothing more than a mess, especially when used again and again.
Another way is to use a template literal, which is more than just an easy means of adding newlines in strings. The following section expands upon it.
Template literals
The ECMAScript 2015 specification introduced something called template literals, also known as template strings, into JavaScript. They represent an extremely simple, yet powerful feature.
A template literal, denoted using a pair of backticks (``), represents a string that can span multiple lines and contain JavaScript code.
Let's break this definition down into simpler pieces.
First off, a template literal is denoted using a pair of backtick characters (``).
Shown below are a couple of template literals:
`Hello World!`
`What's up?`
`He said: "It's impossible."`
Notice the last template string in this snippet. It uses both single (') and double (") quotes in the text without any backslash (\) character. That's because the character used to denote the string is a backtick (`), which doesn't conflict with either the single or double quote characters in the string's text.
This is one application of template strings: to create strings that contain both single (') and double (") quotes, without having to manually escape them.
Secondly, a template literal can span multiple lines where each line break literally denotes a new line, unlike strings created using single ('') or double quotes ("") where we have to manually use the \n sequence to denote a new line.
Consider the code below:
var s = `This is line 1,
and this is line 2.`;
console.log(s);
This is line 1,
and this is line 2.
We create a variable s and then assign a template string to it. Notice the line break in this template string. It translates to an actual newline character in the final string.
As a rule of thumb, remember that:
If you put newlines in a template string, they would translate to actual new lines in the final string. This is the reason why sometimes template literals are also called multi-line strings in JavaScript.
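One way to convince ourselves of this rule is to compare a multi-line template string with an ordinary string that uses \n. The following is a small sketch; the names multi and escaped are chosen only for this example.

```javascript
// A line break typed inside the backticks...
var multi = `This is line 1,
and this is line 2.`;

// ...gives the same string as an explicit \n sequence.
var escaped = 'This is line 1,\nand this is line 2.';

console.log(multi === escaped); // true
```

Keep in mind that any indentation typed on the second line of a template string would also become part of the string.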
Lastly, a template literal could contain JavaScript code — or strictly speaking, valid JavaScript expressions.
There is a special notation to represent a piece of code inside a template literal, shown as follows:
${expression}
We begin with the $ character, followed by a pair of curly braces {}. Inside these, we put the desired expression.
This whole ${expression} notation is replaced by the value of expression in the final string, when the engine evaluates the template literal.
An example follows:
var lang = 'JavaScript';
var s = `You are learning ${lang}.`;
console.log(s);
Here, the template string contains ${lang}, which simply means that this notation is replaced by the value of the variable lang. Since lang holds the string 'JavaScript', the text 'You are learning JavaScript.' is what ends up in s.
Easy?
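The part inside ${} isn't limited to a variable name; any valid expression works, such as an arithmetic operation. Here's a minimal sketch with made-up variables price and quantity:

```javascript
var price = 10;
var quantity = 3;

// Each ${} is evaluated and its result is placed into the string.
var summary = `${quantity} items cost ${price * quantity} dollars.`;

console.log(summary); // 3 items cost 30 dollars.
```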
Why are they called 'template literals'?
The word 'template' comes from the fact that such a string denotes a template expression, i.e. something that is eventually filled by actual stuff when parsed.
For instance, consider the string below:
`${name} is ${age} years old.`
This string is a template string in that it defines the general form of the piece of text denoted by the string. ${name} is replaced by the name of an actual person, and ${age} is replaced by that person's age.
A template string is evaluated by the JavaScript engine at run time, and any ${} occurrences are replaced by the values of the respective expressions inside them.
As for the word 'literal', it comes from the fact that `` is a string literal, that is, an exact representation of a string in source code, just like the string literals '' and "".
Hence, the name 'template literals'.
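The 'template' idea becomes clearer when the same template string is filled with different values. Below is a rough sketch that wraps the template in a function; the function name describePerson and the sample names and ages are hypothetical, used only to make the snippet runnable.

```javascript
// The template defines the general form of the text;
// the arguments supply the actual values at run time.
function describePerson(name, age) {
  return `${name} is ${age} years old.`;
}

console.log(describePerson('Alice', 30)); // Alice is 30 years old.
console.log(describePerson('Bob', 25));   // Bob is 25 years old.
```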
String concatenation
One of the most common operations performed on strings is that of concatenation.
Concatenation simply means joining two strings together into one single string. In JavaScript, as we already know, string concatenation is done using the + operator:
'Hello' + ' World!' // 'Hello World!'
'10' + 20 // '1020'
10 + '20' // '1020'
'10' + '20' // '1020'
10 + {} // '10[object Object]'
In JavaScript, there isn't a strict necessity to have both operands of + (when used in the string concatenation context) be strings; one of them can just as well be a number or a Boolean. This is particularly evident in the second and third statements above.
Now here's one catch: In JavaScript, the same + symbol is used as the addition operator to add numbers together and as the string concatenation operator to concatenate strings together.
The question is: how does the engine decide what to do when it encounters the + operator?
Fortunately, it's not that difficult.
String concatenation or arithmetic addition
If either of the operands of + is a string, or an object that converts to a string (as most objects do by default), string concatenation is performed after converting both the operands into strings.
Otherwise, arithmetic addition is performed after converting both the operands into numbers.
Consider the following snippet:
10 + 20 // 30
'10' + 20 // '1020'
10 + '20' // '1020'
'10' + '20' // '1020'
10 + {} // '10[object Object]'
In the last statement here, {} is an object, which the engine first converts into a primitive value. That primitive turns out to be a string and, therefore, string concatenation is performed by the + after coercing both the given operands into strings.
As we shall learn later on in this course, {} gets converted to the string '[object Object]', hence 10 + {} becomes '10' + '[object Object]'.
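The results of a few such expressions can be inspected together, along with the type of each result. This is only a sketch meant for experimenting; the variable names are made up, and the Boolean cases are added here as an extra illustration.

```javascript
var sum = 10 + 20;        // both operands are numbers
var joined = '10' + 20;   // one operand is a string
var withBool = 10 + true; // no string involved: true is converted to 1
var strBool = '10' + true; // a string is involved: true becomes 'true'
var withObj = 10 + {};    // {} converts to the string '[object Object]'

console.log(sum, typeof sum);           // 30 number
console.log(joined, typeof joined);     // 1020 string
console.log(withBool, typeof withBool); // 11 number
console.log(strBool, typeof strBool);   // 10true string
console.log(withObj, typeof withObj);   // 10[object Object] string
```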
Head over to the console and start experimenting with the + operator, trying a host of different kinds of values. Exploration is interesting!
Moving on, another important thing to keep in mind when working with string concatenation is that + is left-associative in nature.
That is, when there is a contiguous sequence of + operators, the leftmost operands are grouped together and thus operated on first.
Sometimes, this could lead to unexpected behavior. For example, consider the following code:
console.log('Price: $' + 10 + 20);
Here we expect that the output produced would be Price: $30; however, that's not the case. Instead, we get the following:
Price: $1020
But why?
That's just because + is left-associative. The expression above is evaluated as follows: ('Price: $' + 10) + 20.
- The leftmost + is resolved first to give 'Price: $10'.
- Next up, the expression becomes 'Price: $10' + 20, which evaluates down to 'Price: $1020'.
Had + been right-associative, the expression would've been evaluated as 'Price: $' + (10 + 20), giving us 'Price: $30'. However, clearly, that's not the case in JavaScript.
If we want to achieve the expected output in the code above, we ought to manually group the arithmetic expression as follows:
console.log('Price: $' + (10 + 20));
As a rule of thumb, whenever in doubt regarding an operator's associativity and precedence, just group an expression containing it with () to get it evaluated first.
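Left-associativity also means that the position of the string matters. In the sketch below (variable names a, b and c are arbitrary), the numbers are added first when they appear before the string, because by then no string has been encountered yet.

```javascript
var a = 'Price: $' + 10 + 20;   // ('Price: $' + 10) + 20
var b = 'Price: $' + (10 + 20); // grouping forces the addition first
var c = 10 + 20 + ' dollars';   // (10 + 20) + ' dollars'

console.log(a); // Price: $1020
console.log(b); // Price: $30
console.log(c); // 30 dollars
```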
Immutability of strings
Recall from JavaScript Data Types that all primitives in JavaScript are immutable in nature. That is, once created, they can't be modified.
Any modification desired to be made to an immutable value is performed on a copy of the value, not on the actual value. This is an extremely crucial thing to keep in mind.
Strings, as we know, are primitives as well and are likewise immutable.
This can be witnessed in the code below:
var s = 'Good';
s[0] = 'F';
console.log(s); // Good
As can be seen, the log made in the console is exactly the same string s created at the start of this program. The statement s[0] = 'F' is simply ignored.
The reason is that strings are immutable in JavaScript; we can NOT mutate them.
Had strings been mutable, we would've gotten 'Food' in the console instead of 'Good'.
Modifying a given character in a string
The immutability of strings can sometimes become an issue (but honestly only sometimes).
For example, in the code above, we wish to replace the first character of the string (which is G) with F. However, since JavaScript doesn't entertain direct mutation of a string's character, the statement is ignored.
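One caveat worth knowing: the assignment is silently ignored only in non-strict code. In strict mode, the same statement throws a TypeError. The sketch below demonstrates this with a hypothetical helper function, tryMutate, which enables strict mode just for its own body.

```javascript
// Returns true if assigning to a string's character throws a TypeError.
function tryMutate() {
  'use strict';
  var s = 'Good';
  try {
    s[0] = 'F'; // not allowed: strings are immutable
  } catch (e) {
    return e instanceof TypeError;
  }
  return false;
}

console.log(tryMutate()); // true
```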
If we really wish to replace a given character (whose index we know) with another character in JavaScript, we can use the string slice() method to create a new string with the new character.
(We shall explore the slice() method in more detail in the next chapter, JavaScript Strings: String Methods.)
Here's an example:
var s = 'Good';
var i = 0; // The index of the character to replace
s = s.slice(0, i) + 'F' + s.slice(i + 1);
console.log(s); // Food
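If this replacement is needed in more than one place, the same slice() technique can be wrapped in a small function. The following is a minimal sketch; replaceCharAt is a hypothetical helper name, not a built-in method.

```javascript
// Builds a NEW string with the character at `index` swapped for `char`.
// The original string is left untouched, since strings are immutable.
function replaceCharAt(str, index, char) {
  return str.slice(0, index) + char + str.slice(index + 1);
}

var original = 'Good';
var changed = replaceCharAt(original, 0, 'F');

console.log(changed);  // Food
console.log(original); // Good
console.log(replaceCharAt('Hello', 4, 'a')); // Hella
```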
The String() constructor
As we saw in the JavaScript Data Types chapter, a string can also be created using the String() constructor.
We start with the new keyword, followed by the String() function's invocation. Inside the parentheses, we specify an optional value, which is coerced into a string and set as the String object's value.
The String() constructor returns a String object, NOT a string primitive.
This object has multiple properties and methods available on it, such as length, toLowerCase(), toUpperCase(), slice(), and so on.
Below we create a String object and convert it into all uppercase characters:
var strObj = new String('Hello World!');
console.log(strObj.toUpperCase());
Now, let's review the difference between a string primitive and a string object (which we shouldn't ever be using manually).
One of the most confusing things for new JavaScript developers is that most primitive values, such as numbers, strings, and Booleans, have equivalent object values in the language. So which should one use: primitives or objects?
For instance, consider the code below:
var str = 'Hello World!';
document.write(str);
Here we have a string str that is output to the document.
The same code could be rewritten as follows:
var strObj = new String('Hello World!');
document.write(strObj);
This time we have a string object strObj that is output to the document.
The question stands: Which of these to use?
Well, we discussed the answer to this in detail in the JavaScript Data Types chapter. Essentially the idea is the same for all primitive values with corresponding constructors. Let's go over it one more time for the case of strings.
String primitives vs. string objects
The purpose of these constructor objects is to box primitive values.
When we create a string primitive and access a property on it, JavaScript automatically converts the primitive into the equivalent object behind the scenes, accesses the property on that object, and finally throws away the object in the end.
Let's see how this works.
Consider the code below:
Here we create a string primitive and then access the length property on it. This length property doesn't exist on the string, since it is a primitive value:
var str = 'Hello World!';
console.log(str.length); // `length` doesn't actually exist on str
What JavaScript does before accessing the length property of this string primitive is that it boxes the primitive into an object.
Shown below is a rough depiction of what happens behind the scenes:
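var str = 'Hello';
var _str = new String(str); // box the primitive into a temporary object
console.log(_str.length); // the property is accessed on the object
_str = null; // discard the temporary object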
- JavaScript automatically creates a new variable _str and assigns it a String object that is constructed using the existing value str.
- After this, it performs all the operations performed on str in the source code on _str, and then in the end, sets _str to null.
- This last statement has the consequence of the string object stored in _str being garbage-collected.
All this works under the hood, and we feel as if we accessed a property on the string primitive stored in str.
How clever is JavaScript!
So to boil it down, the use of new String() is limited to the JavaScript engine; we don't need to use it to create strings in our own code.
In fact, if we were to do so, we would be complicating our code visually and in terms of memory.
Visually, because so many constructor expressions could unnecessarily elongate the source code. In terms of memory, because we are creating objects and objects and only objects; these objects incur a lot of overhead in addition to the actual wrapped primitive value.
This doesn't happen with primitives — they just store the value; no overheads.
In addition to this, many operations, such as typeof, would NOT work as expected on these objects wrapping up given primitives.
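A few of these surprises can be seen directly in the console. The sketch below compares a string primitive with a String object holding the same text; the variable names prim and obj are chosen just for this example.

```javascript
var prim = 'Hello';
var obj = new String('Hello');

console.log(typeof prim); // string
console.log(typeof obj);  // object

// Strict equality compares without conversion, so the two differ.
console.log(prim === obj); // false

// Two separate String objects are never identical to each other.
console.log(new String('Hello') === new String('Hello')); // false

// The wrapped primitive can be retrieved using valueOf().
console.log(obj.valueOf() === prim); // true
```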
Woah, this was some solid theory under discussion, once again!
In short, it's a complete mess to use these primitive-wrapping constructors explicitly in code to create strings (or even numbers or Booleans).
One extremely crucial thing to remember is that String(), the function called in the function context, is different from new String(), the function called in the constructor context. The former coerces a given value into a string primitive whereas the latter converts a given value into a String object.
This means that you can use String() (without new) to convert an arbitrary value in JavaScript into a string primitive.
When the new keyword comes into the game, it's a different story, as we've discussed in detail above.
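The difference can be verified using typeof. Here's a short sketch with a few arbitrary sample values; the variable names are made up for illustration.

```javascript
var fromNumber = String(123);   // called as a function
var fromBool = String(true);
var fromNull = String(null);
var wrapped = new String(123);  // called as a constructor

console.log(fromNumber, typeof fromNumber); // 123 string
console.log(fromBool, typeof fromBool);     // true string
console.log(fromNull, typeof fromNull);     // null string
console.log(typeof wrapped);                // object
```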
Moving on
Understanding these concepts is imperative before you move on to explore string methods in an upcoming chapter. If you are unable to understand one or two of them, consider reviewing the sections above, reading them twice or more if needed.
Experiment around with these concepts in the console, and once you are well off with them, only then consider moving on to the next chapter. It's important for you to have a strong foundation before attacking slightly more challenging arenas.
Keep learning and progressing!