Python List Comprehensions

Chapter 22 18 mins

Learning outcomes:

  1. What is a list comprehension
  2. Creating simple comprehensions
  3. Using conditional expressions
  4. Using function calls
  5. Using if in comprehensions
  6. Nesting comprehensions and creating 2D lists

Introduction

In the previous chapter, we worked with all the list methods available in Python. Often, a list needs to be initialised with a given element or constructed from scratch.

For instance, suppose that an application requires a 3 x 4 matrix filled with 1's. The most basic way to create such a list is to use for loops, as shown below:

matrix = []

for i in range(3):
    matrix.append([])  # start a new, empty row
    for j in range(4):
        matrix[i].append(1)

As you would agree, this is a lot of code for a very simple use case. Luckily, Python provides a much more concise way: a list comprehension.

Simple list comprehensions

To start with, a list comprehension is essentially syntactic sugar over the loop code shown above. It allows us to define lists in a very compact manner.

Conceptually, Python evaluates a list comprehension in much the same way as the for loop we saw above.

The general form of a simple list comprehension is as follows:

[expression for var in iterable]

First comes an expression and then a for clause to iterate over an iterable. For each iteration of this for clause, expression is evaluated and put into the list being created.

The identifier var is available in expression.

If the general form shown above is assigned to a variable l, it's effectively equivalent to the following code:

# suppose l = [expression for var in iterable]
# then it would be equal to the following

l = []

for var in iterable:
    l.append(expression)

Let's take a quick example:

Say we want to create a list of 10 numbers, all 0. Using a list comprehension, we would write this:

nums = [0 for i in range(10)]

As we have said before, whatever expression is provided is evaluated and then put inside the new list. In this case, the expression 0 evaluates to 0, and so what gets added to the list in each iteration is a 0.

Let's take another example.

Say we want to create a list of the first 10 square numbers, starting at 0. Using a list comprehension, we would define it as follows:

[i ** 2 for i in range(10)]

i ** 2 is evaluated at each iteration of the for loop that runs from i = 0 to i = 9, and then appended to the list being created.

For example, in the first iteration i is 0, likewise i ** 2 is 0, and therefore what gets saved is also 0. In the second iteration, i is 1, thus i ** 2 is 1, and so what gets added to the list is 1. In the third iteration, i is 2, thus i ** 2 is 4, and consequently 4 is put at the end of the list.

This goes on up to i = 9, at which point the value added to the list is 81. Altogether, this comprehension returns the following list:

[i ** 2 for i in range(10)]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
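
As a quick check of the loop equivalence described earlier, the sketch below builds the same list of squares twice, once with an explicit loop and once with a comprehension, and compares the results. The names squares_loop and squares_comp are illustrative and not part of the original example.

```python
# Build the first 10 squares with an explicit loop...
squares_loop = []
for i in range(10):
    squares_loop.append(i ** 2)

# ...and with a list comprehension.
squares_comp = [i ** 2 for i in range(10)]

print(squares_loop == squares_comp)  # True
print(squares_comp)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```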

Any valid expression can go inside a list comprehension. We can use function calls, operators, even conditional expressions.

Conditional expressions

A conditional expression evaluates a condition and returns a value if it's True, and another value if it's False.

We can use conditional expressions in a list comprehension to make it even more interesting.

Say we want to create a list of 10 numbers that starts at 0 and alternates between 0 and 1. For example, in the case of 4 numbers the list would be [0, 1, 0, 1].

To solve this problem we can use a conditional expression to our advantage:

[0 if i % 2 == 0 else 1 for i in range(10)]
[0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

For each iteration, the conditional expression 0 if i % 2 == 0 else 1 gets evaluated and the final value added to the list.

In the first iteration, i is 0, which is divisible by 2, so the condition holds and the conditional expression returns 0. In the second iteration, i is 1, which is not divisible by 2, so the condition fails and the else clause returns 1.

In the third iteration, i is 2, which is again divisible by 2, so the expression once more returns 0. This goes on until the list reaches a length of 10.
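
As an aside, for this particular pattern the remainder i % 2 is already the value we want, so the conditional expression could arguably be dropped. The sketch below compares both spellings; the variable names are chosen here purely for illustration.

```python
# Conditional-expression version from the text.
with_conditional = [0 if i % 2 == 0 else 1 for i in range(10)]

# i % 2 is 0 for even i and 1 for odd i, so it yields the same values.
with_remainder = [i % 2 for i in range(10)]

print(with_conditional == with_remainder)  # True
```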

In another instance, suppose we want to create a list of the first 10 integers, starting at 0, in which every integer divisible by 3 is replaced with -1.

The final result should be [-1, 1, 2, -1, 4, 5, -1, 7, 8, -1], with all elements divisible by 3 replaced with -1.

[-1 if i % 3 == 0 else i for i in range(10)]
[-1, 1, 2, -1, 4, 5, -1, 7, 8, -1]

If, for a given i, i % 3 == 0 evaluates to True, i.e. the number is divisible by 3, -1 is put into the list. Otherwise, the value of i itself is put into the list.
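
It may help to see the same comprehension written out as a loop with an ordinary if/else statement. This is only a sketch of the equivalent logic, with replaced as an illustrative name:

```python
replaced = []
for i in range(10):
    if i % 3 == 0:
        replaced.append(-1)  # multiples of 3 (including 0) become -1
    else:
        replaced.append(i)   # every other integer is kept as-is

print(replaced)  # [-1, 1, 2, -1, 4, 5, -1, 7, 8, -1]
```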

A list is to be created of the first 15 integers, starting at 0, whose nth element is the square of the integer if it is divisible by 4, or otherwise 0.

Construct a list comprehension to create this list.

The condition to be checked is i % 4 == 0. If it evaluates to True, i.e. if i is divisible by 4, we put i ** 2 into the list. Otherwise we put a 0. Moreover, this time the for clause iterates over range(15).

Altogether, this leads to the following comprehension:

[i ** 2 if i % 4 == 0 else 0 for i in range(15)]
[0, 0, 0, 0, 16, 0, 0, 0, 64, 0, 0, 0, 144, 0, 0]

Function calls

Function calls are also expressions and can therefore be used in a list comprehension. We can pass the loop variable of the comprehension's for clause to the function, which can then process it accordingly.

Below we demonstrate a straightforward example.

Say we want to create a list from a string whose nth element is the nth character of the string, uppercased and with a single space on each side. For instance, the string 'Bye' should translate to the list [' B ', ' Y ', ' E '].

Using the string upper() method this can be accomplished as follows:

s = 'Bye'
l = [' ' + char.upper() + ' ' for char in s]

print(l)

for char in s iterates over each character in s and assigns it to char. In the comprehension's expression, char is uppercased and surrounded by a single space on both sides.
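
The expression can equally call a function of our own. As a small sketch, the hypothetical helper pad_upper() below (not part of the original example) wraps the same logic, and the comprehension simply calls it for each character:

```python
def pad_upper(char):
    """Return char uppercased with one space on each side (illustrative helper)."""
    return ' ' + char.upper() + ' '

s = 'Bye'
l = [pad_upper(char) for char in s]

print(l)  # [' B ', ' Y ', ' E ']
```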

Let's take another example.

Suppose we want to create a list of the first 10 factorials, starting at the factorial of 0. Calculating the factorial of a positive integer n requires multiplying it by all the positive integers less than it. The factorial of 0 is defined to be 1.

On these grounds, the factorial of 1 is 1, that of 2 is 2 (2 x 1), that of 3 is 6 (3 x 2 x 1), that of 4 is 24 (4 x 3 x 2 x 1), and so on.

We'll create a function factorial() to evaluate and return the factorial of a given number n, and then use this function in our list comprehension to create the aforementioned list.

def factorial(n):
    num = 1
    for i in range(1, n + 1):
        num *= i
    return num

factorial_nums = [factorial(i) for i in range(10)]

print(factorial_nums)
[1, 1, 2, 6, 24, 120, 720, 5040, 40320, 362880]

Note that the math module offers its own factorial() function, which could have been used here instead of our custom function. Nonetheless, we wanted to demonstrate a case where a function invocation is required in a comprehension, and the factorial example worked well enough.
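
For comparison, here is a sketch of the same list built with math.factorial() from the standard library and checked against the custom function (repeated here so the snippet runs on its own). The names custom and builtin are illustrative.

```python
import math

def factorial(n):
    num = 1
    for i in range(1, n + 1):
        num *= i
    return num

custom = [factorial(i) for i in range(10)]
builtin = [math.factorial(i) for i in range(10)]

print(custom == builtin)  # True
```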

More filtering

A conditional expression, as we saw above, can be used to choose what to put in a list from two possible values: one given by the if clause and the other given by the else clause.

It works well in a good number of cases, but not all of them. Sometimes we want to put an element into a list only if some condition is met; if it's not met, nothing should be added.

As we've seen above, with a conditional expression, it's guaranteed that something will be added to the list in each iteration. What we want is to proceed with an iteration only if a given condition is met, otherwise move to the next one.

How do we solve this problem?

Well, we can use an if clause following the for clause.

The general form is as follows:

[expression for var in iterable if condition]

expression is evaluated and put into the list only if condition evaluates to True.

As before, if we assign this list to a variable l, like this:

l = [expression for var in iterable if condition]

this general form would be equivalent to:

l = []

for var in iterable:
    if condition:
        l.append(expression)

See how the append() call comes inside the if statement. This confirms that the list is mutated only if the given condition is fulfilled; otherwise the loop moves on to the next iteration.
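
The difference from a conditional expression is easiest to see by comparing list lengths. In the sketch below (with illustrative names), the conditional expression always contributes an element, whereas the trailing if clause skips the iterations that fail the condition:

```python
# A conditional expression always yields a value, so the list has 6 items.
with_else = [i if i % 2 == 0 else None for i in range(6)]

# A trailing if clause filters, so only the 3 even values remain.
filtered = [i for i in range(6) if i % 2 == 0]

print(with_else)  # [0, None, 2, None, 4, None]
print(filtered)   # [0, 2, 4]
```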

Let's work with this type of comprehension.

Suppose we want to create a list of the first 10 non-negative even integers. Using a comprehension, we can do so as follows:

evens = [i for i in range(20) if i % 2 == 0]
print(evens)
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

The if clause is evaluated as many times as the number of iterations. In contrast, the expression on the left-most side is evaluated only when this if clause gets fulfilled.
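
One way to observe this is to count the calls. The sketch below routes the condition and the expression through two hypothetical helper functions, is_even() and identity(), which exist only to increment counters; they are not part of the original example.

```python
calls = {"condition": 0, "expression": 0}

def is_even(n):
    calls["condition"] += 1   # runs once per iteration
    return n % 2 == 0

def identity(n):
    calls["expression"] += 1  # runs only when the condition held
    return n

evens = [identity(i) for i in range(20) if is_even(i)]

print(calls)  # {'condition': 20, 'expression': 10}
```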

Note that this particular list can also be created, and is in fact better created, by calling the range() function with a step of 2 and then converting the resulting sequence into a list by passing it to list().
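
A minimal sketch of that alternative, assuming a step of 2 passed as the third argument of range(), with illustrative variable names:

```python
evens_comp = [i for i in range(20) if i % 2 == 0]

# range(start, stop, step) already produces every second integer.
evens_range = list(range(0, 20, 2))

print(evens_comp == evens_range)  # True
```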

Another example using an if clause follows.

We need to create a list of the cubes of the first 10 positive even integers. Note that this is not possible solely using range().

If the element under inspection is an even integer, we proceed and compute its cube. Otherwise, we continue on with the next iteration.

Translating this into a list comprehension, we have the following:

even_cubes = [i ** 3 for i in range(2, 22) if i % 2 == 0]
print(even_cubes)
[8, 64, 216, 512, 1000, 1728, 2744, 4096, 5832, 8000]

Nested loops

It's not always desired to create one-dimensional lists. Often, two- or three-dimensional lists are the point of concern.

We've already seen how to initialise simple lists using comprehensions above. Python's list comprehensions enable one to define lists with more than one dimension. Let's see how.

Just like we can nest for loops within one another as shown below:

for var1 in iterable1:
    for var2 in iterable2:
        pass

we can do so inside a comprehension as well, by giving the nested for clauses one after another.

For instance, a single-nested loop would be given as follows:

[expression for var1 in iterable1 for var2 in iterable2]

Effectively this is equivalent to the following:

l = []

for var1 in iterable1:
    for var2 in iterable2:
        l.append(expression)

Let's use this syntax to define a couple of lists.

Below we create a list of co-ordinates where the x co-ordinate ranges from 0 to 5, and y ranges from 0 to 4:

coordinates = [(i, j) for i in range(6) for j in range(5)]
print(coordinates)
[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (3, 0), (3, 1), (3, 2), (3, 3), (3, 4), (4, 0), (4, 1), (4, 2), (4, 3), (4, 4), (5, 0), (5, 1), (5, 2), (5, 3), (5, 4)]

See how both i and j are available in the expression (i, j). This is because the expression is evaluated inside the innermost for clause, at which point every for clause before it has already assigned a value to its variable.

Moving on, it's also possible to double-nest a comprehension by using a total of 3 for clauses in it, as illustrated below.

We create a list of co-ordinates where the x co-ordinate ranges from 0 to 2, y ranges from 0 to 2, and z ranges from 0 to 1:

coordinates = [(i, j, k) for i in range(3) for j in range(3) for k in range(2)]
print(coordinates)
[(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1), (0, 2, 0), (0, 2, 1), (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1), (1, 2, 0), (1, 2, 1), (2, 0, 0), (2, 0, 1), (2, 1, 0), (2, 1, 1), (2, 2, 0), (2, 2, 1)]

In general, we can nest as many for clauses as we like inside the main for, by putting them one after another in the list comprehension. The later a for clause appears, the deeper it is nested.
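
To make the nesting order concrete, the sketch below rebuilds the three-clause example with explicit nested loops and, as a point of comparison only, with itertools.product() from the standard library, whose rightmost argument varies fastest. The names nested, looped and via_product are illustrative.

```python
from itertools import product

nested = [(i, j, k) for i in range(3) for j in range(3) for k in range(2)]

# The same list built with explicit nested loops, in the same clause order.
looped = []
for i in range(3):
    for j in range(3):
        for k in range(2):
            looped.append((i, j, k))

# product() yields tuples with the rightmost range varying fastest.
via_product = list(product(range(3), range(3), range(2)))

print(nested == looped == via_product)  # True
print(len(nested))  # 18
```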

2D Lists

List comprehensions are really flexible in that one can create numerous kinds of lists using them. One such case is that of 2D lists, i.e. lists within lists.

The idea is that in every iteration of the main for loop, we put a list into the main list. This list can be defined manually, or using another comprehension.

Let's get working.

Below we create a 3 x 4 matrix, initialised with all 0's:

[[0, 0, 0, 0] for i in range(3)]
[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

Here's what's happening. For each iteration of the for clause, the expression [0, 0, 0, 0] is evaluated and put into the main list. Since the for clause iterates three times, a [0, 0, 0, 0] list is put into the main list a total of three times.

What we get in the end is a list containing three [0, 0, 0, 0] lists.
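
One consequence of the expression being evaluated afresh in every iteration is that each row is a separate list object. The sketch below illustrates this and contrasts it with list repetition via the * operator, which is mentioned here only as a point of comparison and is not something the text above uses:

```python
matrix = [[0, 0, 0, 0] for i in range(3)]
matrix[0][0] = 9  # only the first row changes
print(matrix)  # [[9, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

# Repeating the outer list copies references to ONE inner list.
shared = [[0, 0, 0, 0]] * 3
shared[0][0] = 9  # the change shows up in every "row"
print(shared)  # [[9, 0, 0, 0], [9, 0, 0, 0], [9, 0, 0, 0]]
```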

Since this was a simple case, we wrote out the list [0, 0, 0, 0] directly. However, writing the inner list by hand is sometimes tedious, or outright impractical. An example follows.

We want to create a 5 x 30 matrix initialised with all 0's.

Writing out a list of 30 0's by hand is tedious and error-prone. Rather, it is better done using another comprehension.

First try to come up with a solution to this yourself and then move on to our explanation.

We need 5 rows in the matrix, so the for clause in the main comprehension iterates over range(5), i.e. from 0 up to 4. Then, for each row, we need 30 columns, so we use an inner comprehension that creates a list of 30 items.

To sum it all up, this is what we get:

[[0 for j in range(30)] for i in range(5)]
[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]

In each iteration, the expression [0 for j in range(30)] is evaluated and put into the main list. As we know, this expression returns a list of 30 elements, all 0, so in each iteration such a list is added to the main list.

In the end, we get a list of 5 elements, each of which is itself a list containing 30 elements, all 0. This is a 5 x 30 matrix.
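
Returning to the 3 x 4 matrix of 1's from the introduction, the same nested pattern applies. The following sketch builds it once with loops and once with a nested comprehension and checks that both agree; the names matrix_loop and matrix_comp are illustrative.

```python
# Loop version: append an empty row, then fill it with 1's.
matrix_loop = []
for i in range(3):
    matrix_loop.append([])
    for j in range(4):
        matrix_loop[i].append(1)

# Comprehension version: an inner comprehension builds each row.
matrix_comp = [[1 for j in range(4)] for i in range(3)]

print(matrix_comp)  # [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]
print(matrix_loop == matrix_comp)  # True
```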