Python Function Closures

Chapter 33 21 mins

Learning outcomes:

  1. What are closures
  2. A simple closure in Python
  3. Cell and free variables
  4. The __closure__ attribute
  5. More examples of closures
  6. Closure ≠ returned function

Introduction

As we learnt in the Python Function Basics chapter, when a function returns, all of its local variables are deleted. However, things work differently when the return value is itself a function that uses the environment of its enclosing function(s).

The returned function here together with its enclosing environment is referred to as a closure. In this chapter, we'll learn function closures in detail and see how to construct functions that can generate tailored functions.

What is a closure?

The concept of a closure is very basic:

A function, together with the environment of its enclosing function, is referred to as a closure.

Although this definition correctly describes a closure, it doesn't explain what its different parts actually mean. For example, what is meant by the 'environment of the enclosing function'?

To truly grasp the essence of a closure, we need to start from the very basics...

As we already know, it's possible to nest functions within functions in Python. Moreover, the inner function might additionally use data defined inside the outer function.

Now if this is the case and the outer function returns the inner function, a problem arises.

The outer function deletes all of its local variables when it returns, i.e. when it exits. So if the returned function wants to access something in the outer function, how could it do so when that function's environment has been deleted?

This is where the concept of a closure was born.

The returned function holds onto the environment of its enclosing function(s), so that when it refers to any variable in the outer function(s), the variable still exists, even though the outer function(s) have exited.

Let's recap all this with the help of an example.

A simple closure

We'll create a function B() nested in a function A() and then return it. Finally, we'll call A() and see how the returned value operates.

Consider the code below:

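def A():
    x = 10          # local variable of A()
    def B():
        print(x)    # x is resolved in the enclosing scope of A()
    return B        # return the inner function itself

b = A()
b()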

Notice the definition of B() — it refers to a name defined in the local scope of the function A().

When we call A(), here's what happens:

  1. A local variable x is created and assigned the value 10.
  2. The inner function B() is created. It saves its enclosing environment, which is the local environment of A().
  3. The function B() is returned, and so A() exits.
  4. The returned function is stored in the variable b, which means that b is now a function and likewise callable.
  5. b() is invoked.
  6. It prints x. This x is first searched for in the local lexical context of B(). Nothing is found there.
  7. Then the enclosing lexical context is searched for it. This context is stored by the function itself, not by the enclosing function. That is, the function b() is a closure and so has its enclosing environment saved in itself.
  8. In this example, x exists there and is therefore used to resolve the name x (in print(x)).

The most beautiful thing here is that although A() has returned, the function B() (which is saved in b) still has access to A()'s local variable x.

This is because references to the local variables of A() that B() uses are stored in the function B(), and so they are not garbage collected by Python.

Technically speaking, when a function exits, its local names are removed, NOT necessarily the values those names refer to. A value is reclaimed only once nothing refers to it any longer (in CPython, as soon as its reference count drops to zero).

In the case above, when A() returns, its local name x goes away, but the value is still kept in memory because the function object B() holds a reference to it, and B() can still reach it under the same name x.
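A small sketch can make this concrete. It assumes a version of A() that returns B, and it peeks at the __closure__ attribute, which is covered later in this chapter, to show that the value 10 is still reachable after A() has exited. The name survivor is only illustrative.

```python
def A():
    x = 10
    def B():
        return x    # return instead of print so the value can be inspected
    return B

b = A()             # A() has exited at this point

# The value bound to x is still alive because b holds a reference to it.
survivor = b.__closure__[0].cell_contents
print(survivor)     # 10
print(b())          # 10
```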

To boil it all down, a closure remembers its lexical environment.

Although the concept itself is quite elementary, it's extremely powerful. Using closures, we can define functions that generate tailored functions.

Using closures more

Let's consider a more practical use of a closure in Python.

Below we define a function makeDivider() that takes in an argument n and returns a function that computes one nth of a given number x:

def makeDivider(n):
    def divider(x):
        return x * 1 / n
    return divider

Next, we create two functions using makeDivider() and save each of them inside half and quarter respectively.

def makeDivider(n):
    def divider(x):
        return x * 1 / n
    return divider

half = makeDivider(2)
quarter = makeDivider(4)

As the names suggest, half() halves its given argument and returns the result, while quarter() returns a quarter of its argument.

Let's call both of these functions on a couple of number arguments and see what we get:

half(5)
2.5
half(100)
50.0
quarter(16)
4.0
quarter(100)
25.0

As expected, the functions operate just as they are named.

We can create any kind of divider by providing the denominator while calling makeDivider().

Below, we create a function that multiplies its argument by 1 / 3:

def makeDivider(n):
    def divider(x):
        return x * 1 / n
    return divider

thirds = makeDivider(3)

And here we use it to get one-third of the given numbers.

thirds(30)
10.0
thirds(150)
50.0
thirds(10)
3.3333333333333335

This is the power of closures.

All three functions here, half(), quarter() and thirds(), are closures. They essentially save the argument n sent to their enclosing function, i.e. makeDivider(), and then retrieve it from storage when they are invoked.

Now, 'storage' is quite a loose word to use here. In the next section, we'll learn two more formal terms used when discussing closures.
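As a rough illustration of that 'storage', the sketch below reads back the n saved by each closure. It relies on the __closure__ attribute, which is explained later in this chapter, and the name saved is purely illustrative.

```python
def makeDivider(n):
    def divider(x):
        return x * 1 / n
    return divider

half = makeDivider(2)
quarter = makeDivider(4)
thirds = makeDivider(3)

# Each call to makeDivider() creates a fresh environment, so every
# closure remembers its own n.
saved = [f.__closure__[0].cell_contents for f in (half, quarter, thirds)]
print(saved)                                # [2, 4, 3]

# All three share the same code but are distinct function objects.
print(half.__code__ is quarter.__code__)    # True
print(half is quarter)                      # False
```

The functions differ only in the environment they captured, which is exactly what makes makeDivider() a generator of tailored functions.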

Cell and free variables

For a given function:

A local variable is referred to as a cell variable of the function if it's used by an inner (nested) function.

If we head over to the Python docs on cell objects, we see that they say:

“Cell” objects are used to implement variables referenced by multiple scopes...

A cell variable, therefore, is simply a variable used in multiple scopes.

In the case of a function, a cell variable is a local variable used in the local scope of the function and also in the local scope of a nested function.

Another similar concept to cell variables is that of free variables.

For a given function:

A free variable is a variable used by the function that is neither a local variable, nor a parameter, nor a global.

In simpler words, a free variable is a non-local variable of a function, defined in an outer function.

If a local variable x of a function A() is a cell variable used by an inner function B(), then for B(), that variable x is a free variable.

Note that if B() defines its own local variable x, then x would no longer be considered a free variable. This is because a free variable is not local to a function!

Let's now explore cell variables and free variables in real code snippets.

Consider the following code:

def foo():
    a = 10
    def bar():
        print(a)
    return bar

We have a function foo() that defines a local variable a, a function bar(), and then returns bar. The function bar() prints the variable a.

Now before we go any further, let's answer a couple of questions.

Does the function foo() have a cell variable, a free variable, both of them, or none?

  • A cell variable
  • A free variable
  • Both of them
  • None of them

Does the function bar() have a cell variable, a free variable, both of them, or none?

  • A cell variable
  • A free variable
  • Both of them
  • None of them

foo() has a local variable a that's used by the nested function bar(). Hence, a is a cell variable of foo(). bar() uses a non-local variable a; hence, a is a free variable for bar().

Why not confirm this using some Python code?

As we saw in the previous Python Functions — Code Objects chapter, for any given function, we can inspect its cell and free variables using the co_cellvars and co_freevars attributes on its __code__ object, respectively.

Both the attributes hold a tuple containing the names of the respective variables.

Let's investigate them on our functions foo() and bar().

First on foo():

foo.__code__.co_cellvars
('a',)
foo.__code__.co_freevars
()

And now on bar(). To inspect bar(), we'll need to invoke the function foo() and then perform the inspection on the returned value (which is the function bar()):

bar = foo()
bar.__code__.co_cellvars
()
bar.__code__.co_freevars
('a',)

The output matches the reasoning we gave above. Perfect!

Why not try another example?

def foo():
    a = 10
    b = 20

    def bar():
        a = 20
        print(a, b)

    return bar

Which of the following are cell variables of foo()?

  • a
  • b
  • Both of these
  • None of these

foo() defines two locals a and b, then a function bar(), and finally returns it. bar() creates a local variable a, and then prints that along with the outer variable b.

Hence, for foo(), b is a cell variable since it's used by the inner function bar(). And for bar(), b is a free variable, while a is just a normal local variable.

Shown below is the inspection of cell and free variables of foo() and bar():

foo.__code__.co_cellvars
('b',)
foo.__code__.co_freevars
()
bar = foo()
bar.__code__.co_cellvars
()
bar.__code__.co_freevars
('b',)

Just as we reasoned, we get the expected output.

Time to take it a notch up, and consider a more challenging example.

def foo():
    x = 10

    def bar():
        y = 20
        def baz():
            print(x, y)
        return baz

    return bar

Now you should give it a try. Reason as to why you think that a given variable is a free variable, or a cell variable, or both, or none.

Is x a cell variable for foo()?

  • Yes
  • No

Which of the following statements is correct for the function bar():

  • It has a cell variable x and a free variable y.
  • It has a cell variable y and a free variable x.
  • It has two cell variables x and y.
  • It has two free variables x and y.

Which of the following statements is correct for the function baz():

  • It has two cell variables x and y.
  • It has two free variables x and y.

Let's reason through the code.

Here's a look at it once again:

def foo():
    x = 10

    def bar():
        y = 20
        def baz():
            print(x, y)
        return baz

    return bar

foo() defines a local variable x, then a function bar() and finally returns it. This local variable x is used by an inner function (specifically by baz()), hence x is a cell variable for foo().

Moving on, bar() defines a local variable y, a function baz() and then returns the function. This local variable y is used by an inner function (specifically by baz()); hence, y is a cell variable for bar().

Apart from this, bar() refers to the variable x of foo(). It's important to note that it doesn't do this directly. Rather, it's the inner function of bar(), namely baz(), that refers to the variable x. So technically, bar() requires x, and so x is a free variable for bar().

Lastly, the function baz() prints the variables x and y. These variables are not local to the function, nor are they global; rather, they are defined in the enclosing functions. Hence, x and y are both free variables for the function baz().
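The reasoning above can be checked with the co_cellvars and co_freevars attributes used earlier. The following is a minimal sketch; bar and baz are obtained by calling the outer functions in turn.

```python
def foo():
    x = 10

    def bar():
        y = 20
        def baz():
            print(x, y)
        return baz

    return bar

bar = foo()
baz = bar()

print(foo.__code__.co_cellvars, foo.__code__.co_freevars)  # ('x',) ()
print(bar.__code__.co_cellvars, bar.__code__.co_freevars)  # ('y',) ('x',)
print(baz.__code__.co_cellvars, baz.__code__.co_freevars)  # () ('x', 'y')
```

Note how x shows up as a free variable of bar() even though bar() never uses it directly: it has to be passed through so that baz() can reach it.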

The __closure__ attribute

For a given function, its code object contains a co_freevars attribute that contains the names of all the free variables of the function.

So using this we could get the names of the free variables. But what about their values? How do we get them?

Well, Python does provide a way to retrieve the values of the free variables: the __closure__ attribute.

For a given function:

Its __closure__ attribute contains a tuple holding one entry for each of its free variables, or else the value None if it has no free variables.

Each element of the tuple is a cell object. It's a cell object because it essentially represents a cell variable of an enclosing function.

The actual value of the variable can be retrieved from a cell object using its cell_contents attribute (which is the only attribute of a cell object).

So coming back to our question, the co_freevars attribute of the function's code object holds the names of all free variables, while the __closure__ attribute holds their values.

The order of elements in both these locations is the same: the first element of co_freevars contains the name of the first free variable (CPython lists them alphabetically), while the first element of __closure__ contains the cell holding its corresponding value.

The following example demonstrates __closure__ in use:

def foo():
    a = 10
    b = 20

    def bar():
        print(a, b)
    
    return bar

First let's retrieve the value of __closure__ on the function bar().

bar = foo()
bar.__closure__
(<cell at 0x7f9d64c68460: int object at 0x954f40>, <cell at 0x7f9d643a4eb0: int object at 0x955080>)

We get a tuple of two elements — precisely a tuple of two cell objects — each corresponding to a free variable.

Now, let's specifically get the content inside each element of the tuple:

bar.__closure__[0].cell_contents
10
bar.__closure__[1].cell_contents
20
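Since co_freevars and __closure__ share the same order, the two can be paired up to map each free variable's name to its value. The sketch below shows one way of doing so; the names names, values and captured are just illustrative.

```python
def foo():
    a = 10
    b = 20

    def bar():
        print(a, b)

    return bar

bar = foo()

# Pair each free-variable name with the contents of the matching cell.
names = bar.__code__.co_freevars
values = [cell.cell_contents for cell in bar.__closure__]
captured = dict(zip(names, values))
print(captured)   # {'a': 10, 'b': 20}
```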

As stated before, if a function doesn't refer to any cell variable of any of its enclosing functions, i.e. it has no free variables, then its __closure__ attribute is equal to None.

This can be seen as follows:

def foo():
    a = 10
    def bar():
        a = 20
        print(a)

    print(bar.__closure__)

foo()

Here, the function bar() creates a local variable a and prints it. It doesn't use any variable from the enclosing function, which is foo(); hence, bar() doesn't have any free variables.

This is confirmed by the output of line 7 — we get back the value None signifying that bar() hasn't stored any cell variable from an enclosing function.

None

Closure ≠ returned function

Developers new to the concept of closures generally memorize one thing to define the concept: a closure is any function returned from another function. This is a misinterpretation.

Although such a function is a closure when it uses its enclosing environment, closures are not limited to functions returned from other functions. Essentially, any function that captures its enclosing environment inside its __closure__ attribute is a closure.

For instance, in the code below, bar() isn't returned by the function foo(), yet it is a closure.

def foo():
    x = 10
    def bar():
        print(x)
    
foo()

This is because it has a free variable x which is stored in its __closure__ attribute.
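This can be verified from inside foo() itself. In the sketch below, foo() hands back the results of inspecting bar() rather than bar() itself; this return value and the name info are only there for illustration.

```python
def foo():
    x = 10
    def bar():
        print(x)
    # bar is never returned, yet it already carries its environment.
    return bar.__code__.co_freevars, bar.__closure__[0].cell_contents

info = foo()
print(info)   # (('x',), 10)
```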

So defining a closure as a function returned from another function is, strictly speaking, wrong. A closure is the collective name for a function along with its lexical environment.

But if someone asks you to give an example of a closure, you could say 'Well, a function that is returned from another function and uses one of that function's local variables is a closure'.

This would be appropriate.