RegExp String Methods

Chapter 11 8 mins

Learning outcomes:

  1. String methods - search(), match() and replace()

Introduction

So far we've covered a lot of the concepts and basics of regular expressions. Now we will see how to put them all to work in actually matching patterns against strings.

In this chapter we will take a look at the main string methods that accept regular expressions as arguments, and see how to work with them in solving pattern matching problems.

Search for a match

The first candidate we will explore is the method search().

search() takes a single argument: what to search for within the main string. The argument can be a string or a regular expression. For now, we are mainly concerned with the latter.

If a match is found, the method returns its index; otherwise it returns -1.
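One detail worth seeing in code is what happens when a string is passed instead of a regular expression: the string is converted to a regular expression first, so special characters in it keep their regex meaning. The snippet below is a small sketch; the sample string and the variable name price are made up for illustration and are not part of the chapter.

```javascript
// Hypothetical sample string, not from the chapter.
var price = "price: 3.50";

// A string argument is converted to a regular expression,
// so "." acts as the wildcard and matches the very first character.
console.log(price.search(".")); // 0

// Escaping the dot makes it match a literal period instead.
console.log(price.search("\\.")); // 8

// When nothing matches, -1 is returned.
console.log(price.search("euro")); // -1
```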

search() always returns the index of the first occurrence of the given pattern, even if multiple occurrences exist. This holds for regular expressions with the global g flag as well - the method will still return the position of the first match.

For instance, given the string str = "Two times two is four" and the expression exp = /two/ig, the method str.search(exp) will return 0, since the first match is found starting at index 0.

Note that the flag i here makes the search case-insensitive, and hence gets the expression to match the first word "Two" as well.

Below are a couple of examples.

Here the expression doesn't match anything in the string str, because the search is case-sensitive, so we get -1 returned.
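The example just described can be written out as follows. This is a minimal sketch: str and exp follow the names used in the text, while firstIndex and caseSensitiveIndex are names introduced here for illustration.

```javascript
var str = "Two times two is four";
var exp = /two/ig;

// Even with the g flag, only the first match's index is returned.
var firstIndex = str.search(exp);
console.log(firstIndex); // 0

// Without the i flag, "Two" is skipped and the lowercase "two" is found.
var caseSensitiveIndex = str.search(/two/);
console.log(caseSensitiveIndex); // 10
```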

var str = "Hello World!";
var patt = /hello/;
console.log(str.search(patt)); // -1

In contrast, here the expression does match the first word 'Hello', so we get 0 returned.

var str = "Hello World!";
var patt = /hello/i;
console.log(str.search(patt)); // 0

Lastly, consider a variation of the second example above, this time with the global flag. A novice developer might expect to get multiple indexes returned for multiple matches, but the method search() doesn't work this way.

It will always return the index of the first match.

Consider the code below:

var str = "Hello World! Hello once again";
var patt = /hello/ig;
console.log(str.search(patt)); // 0

Here the expression /hello/ig matches both of the substrings 'Hello' inside str, but when passed to the search() method, only the index of the first occurrence is returned.

Let's quickly test your skills and see whether you have understood how search() works.

Consider the code below:

var str = "A cute cat!";
var patt = /a/;
var index = str.search(patt);

What will be the value of the variable index?

  • -1
  • 0
  • 7
  • 8

What is the match?

The search() method will indeed do the job of confirming the presence of a given pattern in a string, but it won't tell us exactly which substring matched the pattern.

In other words, if we wanted to extract information from a string based on a regular expression, we couldn't do that using search(). To do so we need the string method match().

In terms of arguments, match() operates exactly like search() - taking one argument that is the expression to match the string against. However in terms of behavior it is slightly different. Here's how.

Firstly, if the expression, or search string, doesn't match anything, null is returned.

Secondly, if the expression does find a match and the g flag isn't set, then an array is returned containing information about the first match, such as its index, the captures for groups and so on.

Finally, if the expression does find a match and the g flag is set, then all the matches are given as individual string values in an array.

A lot to take in, right? Consider the code snippets below to understand it all thoroughly.

First let's take a string and a pattern that doesn't exist in it:

var str = "1012 3200";
var patt = /hello/;
console.log(str.match(patt)); // null

Here, since /hello/ doesn't match anything in "1012 3200", we get the null value returned.

Let's now try it with the second case.

var str = "1012 3200";
var patt = /\d+/;
console.log(str.match(patt));
// ["1012", index: 0, input: "1012 3200", groups: undefined]

In the code above the expression does find a match and the global flag isn't set, hence we get an array object returned.

The first element of the array, at index 0, is always the match of the expression, that is, the substring that the expression matched. Subsequent elements hold the captures for the groups appearing in the expression.

Apart from this, three other properties exist on the returned array to describe the regexp search. They are as follows:

  1. index holds the index of the matched substring
  2. input holds the string on which the search was performed
  3. groups holds the captures of all named groups in the expression, or undefined if there are none.
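To see these elements and properties together, here is a small sketch with one named group and one ordinary capturing group. The string, the pattern and the variable names are invented for illustration, and the named-group syntax (?<year>...) assumes an engine that supports ES2018.

```javascript
// Hypothetical sample data, not from the chapter.
var dateStr = "Released on 2015-06-17";
var datePatt = /(?<year>\d{4})-(\d{2})/; // no g flag

var result = dateStr.match(datePatt);

console.log(result[0]); // "2015-06" (the whole match)
console.log(result[1]); // "2015" (first group)
console.log(result[2]); // "06" (second group)
console.log(result.index); // 12
console.log(result.input); // "Released on 2015-06-17"
console.log(result.groups.year); // "2015"
```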

Finally, for the last case we have the following.

var str = "1012 3200";
var patt = /\d+/g;
console.log(str.match(patt)); // ["1012", "3200"]

Here the global flag is indeed present and hence we get all the matches returned as an array. Notice that this array only contains matches for the whole expression - any of the capturing groups are simply ignored in this array.
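The difference is easier to see when the pattern actually contains capturing groups. The following is an illustrative sketch; the string and variable names are hypothetical and not taken from the chapter.

```javascript
// Hypothetical sample string with three letter-digit pairs.
var pairs = "a1 b2 c3";

// Without g: the first match plus its captures.
var withoutG = pairs.match(/([a-z])(\d)/);
console.log(withoutG[0], withoutG[1], withoutG[2]); // "a1" "a" "1"

// With g: every whole match, but no captures and no index property.
var withG = pairs.match(/([a-z])(\d)/g);
console.log(withG); // ["a1", "b2", "c3"]
console.log(withG.index); // undefined
```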

And in this way the match() method works to extract matches from a test string.

Now although it does a good job of returning the actual matches, it falls short of returning detailed information for all matches when used with the g flag. For example, we can't get any of the captures for the capturing groups when we are searching for more than one match.

As we shall see in the next chapter on RegExp Methods, the method which can solve this problem is exec() on RegExp instances.
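As a brief preview only, the loop below sketches how exec() might be used to collect the captures of every match. The string, the pattern and the names found and m are assumptions made for this illustration; the details are left to the next chapter.

```javascript
// Hypothetical sample string with three letter-digit pairs.
var text = "a1 b2 c3";
var pairPatt = /([a-z])(\d)/g; // the g flag lets exec() advance through the string

var found = [];
var m;
// Each call returns the next match (with captures), or null when done.
while ((m = pairPatt.exec(text)) !== null) {
  found.push({ match: m[0], letter: m[1], digit: m[2], index: m.index });
}

console.log(found.length); // 3
console.log(found[1].letter, found[1].digit, found[1].index); // "b" "2" 3
```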

Replace stuff

Though we've been using the replace() method for quite a while now, let's consider it once more to complete our tour of string methods.

The first argument of the method is the thing to search for in the main string, whereas the second argument is the replacement string that replaces each match.

Consider the two simple examples below:

var str = "I love cats. Do you love cats?";
var patt = /cats/;
str = str.replace(patt, "parrots");
console.log(str); // "I love parrots. Do you love cats?"

Here the expression only matches its first occurrence (as there is no global flag set), so only the first match is replaced.

var str = "I love cats. Do you love cats?";
var patt = /cats/g;
str = str.replace(patt, "parrots");
console.log(str); // "I love parrots. Do you love parrots?"

In contrast, here we have the global flag set on the expression, and hence it will match all of the occurrences and consequently replace them all.

And that is all there is to these string methods.
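Since the first argument may also be a plain string, it can help to compare the two forms side by side. The sketch below uses hypothetical variable names; it relies on the fact that a string pattern replaces only the first occurrence, and that replace() returns a new string rather than changing the original.

```javascript
var sentence = "I love cats. Do you love cats?";

// A plain string pattern behaves like a regex without the g flag.
var viaString = sentence.replace("cats", "parrots");
console.log(viaString); // "I love parrots. Do you love cats?"

// A global regex replaces every occurrence.
var viaRegex = sentence.replace(/cats/g, "parrots");
console.log(viaRegex); // "I love parrots. Do you love parrots?"

// The original string is left untouched.
console.log(sentence); // "I love cats. Do you love cats?"
```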