Course: Python

# Python Set Methods

Chapter 26 14 mins

Learning outcomes:

1. Adding elements using `add()`
2. Removing elements using `remove()`, `discard()` and `pop()`
3. Checking for subsets via `issubset()`
4. Checking for supersets via `issuperset()`
5. Performing set operations using `union()`, `intersection()`, `difference()` and `symmetric_difference()`
6. Updating a set through `update()`, `intersection_update()`, `difference_update()` and `symmetric_difference_update()`
7. Clearing a set via `clear()`
8. Copying a set using `copy()`

As stated in the previous chapter, a single element is added to a set using the `add()` method.

It takes in a single argument and adds it to the given set.

``set.add(element)``

An example follows:

``````evens = {0, 2, 4, 6, 8}
evens.add(10)

print(evens)``````
{0, 2, 4, 6, 8, 10}
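
Because a set holds each value at most once, calling `add()` with an element that is already present simply leaves the set unchanged. A minimal sketch of this behaviour (the variable names are only for illustration):

```python
evens = {0, 2, 4}

evens.add(6)   # 6 is new, so the set grows by one element
evens.add(2)   # 2 is already present, so nothing changes

size = len(evens)
print(size)    # 4
```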

## Removing given elements

The `remove()` method can be used to remove stuff from a set.

As with `add()`, just provide it with the element you wish not to see anymore in the given set, and then wait for the method to do the magic.

``set.remove(element)``

Below we remove the number `4` from our `evens` set:

``````evens = {0, 2, 4, 6, 8}
evens.remove(4)

print(evens)``````
{0, 2, 6, 8}

Keep in mind that if the element to be removed doesn't exist in the set, `remove()` would throw a `KeyError` exception.

This can be seen as follows:

``````evens = {0, 2, 4, 6, 8}
evens.remove(10)``````
Traceback (most recent call last):
    File "<stdin>", line 2, in <module>
        evens.remove(10)
KeyError: 10

The `discard()` method is similar to `remove()` in that it also removes an element from a set.

``set.discard(element)``

Below we discard `4` from our `evens` set:

``````evens = {0, 2, 4, 6, 8}
evens.discard(4)

print(evens)``````
{0, 2, 6, 8}

The only difference is that if the set doesn't contain the element to be removed, `discard()` does not throw an exception, unlike `remove()` which does throw one.

Below we try to discard a non-existent element from `evens`, and no error is thrown:

``````evens = {0, 2, 4, 6, 8}
evens.discard(10)

print(evens)``````
{0, 2, 4, 6, 8}

A simple way to remember which of the two methods `remove()` and `discard()` throws an error is detailed as follows:

The word 'error' starts with an 'e', and of the two names 'remove' and 'discard', only 'remove' contains the 'e', implying that `remove()` throws an error.
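
To see the two methods side by side, here is a small sketch. The `removed` flag is a name made up for this example; it just records whether `remove()` succeeded.

```python
evens = {0, 2, 4, 6, 8}

# remove() raises KeyError for a missing element, so guard it when unsure
try:
    evens.remove(10)
    removed = True
except KeyError:
    removed = False

# discard() quietly does nothing when the element is missing
evens.discard(10)

print(removed)                   # False
print(evens == {0, 2, 4, 6, 8})  # True: the set is untouched
```

As a rule of thumb, `remove()` fits when a missing element signals a bug, and `discard()` fits when a missing element is perfectly fine.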

## Pop elements

To remove an arbitrary element from a set, use the `pop()` method.

Since it doesn't remove any specific element, it doesn't require any argument.

Shown below is an elementary example:

``````evens = {0, 2, 4, 6, 8}
evens.pop()

print(evens)``````
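
Two details are worth knowing here: `pop()` returns the element it removed, and calling it on an empty set raises a `KeyError`. The sketch below illustrates both; which element gets popped is not specified, so the code only checks that it is gone afterwards.

```python
evens = {0, 2, 4, 6, 8}

popped = evens.pop()      # which element comes out is not specified
print(popped in evens)    # False: the popped element is gone
print(len(evens))         # 4

empty = set()
try:
    empty.pop()
    pop_failed = False
except KeyError:
    pop_failed = True     # popping from an empty set raises KeyError

print(pop_failed)         # True
```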

## Check for subsets

A set `A` is said to be a subset of another set `B` if all of its elements exist in `B`.

For instance, the set of even integers is a subset of the set of integers. Similarly, the set of integers is a subset of the set of real numbers and so on.

To check if a set is a subset of another set, we have at our disposal the `issubset()` method.

``a.issubset(b)``

It returns `True` if `a` is a subset of `b`; or else `False`.

Let's inspect the method on two sets: one holding the first two non-negative evens and the other one holding the first five non-negative evens.

``````first_two_evens = {0, 2}
first_five_evens = {0, 2, 4, 6, 8}``````
``first_two_evens.issubset(first_five_evens)``
True
``first_five_evens.issubset(first_two_evens)``
False

Reading out the first statement: '`first_two_evens` is a subset of `first_five_evens`.' Since this is true, what we get returned is indeed `True`.

The second statement reads as follows: '`first_five_evens` is a subset of `first_two_evens`.' Since this is wrong, what we get returned is `False`.
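
Python also offers operator forms of this check: `<=` behaves like `issubset()` for two sets, and `<` tests for a proper subset (a subset that is not equal to the other set). A short sketch using the same two sets:

```python
first_two_evens = {0, 2}
first_five_evens = {0, 2, 4, 6, 8}

same_as_method = first_two_evens <= first_five_evens            # operator form of issubset()
subset_of_itself = first_five_evens.issubset(first_five_evens)  # every set is a subset of itself
proper_of_itself = first_five_evens < first_five_evens          # but never a proper subset of itself

print(same_as_method)    # True
print(subset_of_itself)  # True
print(proper_of_itself)  # False
```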

## Check for supersets

A superset is the opposite of a subset. If `A` is a superset of `B`, then everything in `B` exists in `A`. To define it another way, if `A` is a subset of `B`, then `B` is a superset of `A`.

In Python, we can use the `issuperset()` method to check for supersets.

``a.issuperset(b)``

It returns `True` if `a` is a superset of `b`; or otherwise `False`.

Let's take the same example above:

``````first_two_evens = {0, 2}
first_five_evens = {0, 2, 4, 6, 8}``````
``first_two_evens.issuperset(first_five_evens)``
False
``first_five_evens.issuperset(first_two_evens)``
True

In the first statement we're saying: '`first_two_evens` is a superset of `first_five_evens`.' Clearly, this is incorrect, and so we get `False` returned.

The second statement goes like: '`first_five_evens` is a superset of `first_two_evens`.' As this is correct, we get `True` returned.
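
The relationship between the two methods can be checked directly: whenever `a.issuperset(b)` holds, `b.issubset(a)` holds as well. The sketch below also shows the `>=` operator form and passing a list instead of a set to the method.

```python
first_two_evens = {0, 2}
first_five_evens = {0, 2, 4, 6, 8}

# The two checks are mirror images of each other
is_superset = first_five_evens.issuperset(first_two_evens)
is_subset = first_two_evens.issubset(first_five_evens)
print(is_superset, is_subset)   # True True

# Operator form of issuperset(), valid when both operands are sets
via_operator = first_five_evens >= first_two_evens
print(via_operator)             # True

# The method form also accepts other iterables, such as a list
from_list = first_five_evens.issuperset([0, 2])
print(from_list)                # True
```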

## Set operations

In the previous chapter, we came across set operations in Python powered by operators: `|` for union, `&` for intersection, `-` for difference, and `^` for symmetric difference.

The `set` class also provides these operations via method calls. The methods `union()`, `intersection()`, `difference()` and `symmetric_difference()` all take in a given set, perform the respective operation, and return the resulting set.

As with the operators, none of these methods mutates the original set.

Below we demonstrate all four of these methods:

``````a = {0, 2, 4}
b = {2, 3, 5}

print('Union:', a.union(b))
print('Intersection:', a.intersection(b))
print('Difference:', a.difference(b))
print('Symmetric difference:', a.symmetric_difference(b))``````
Union: {0, 2, 3, 4, 5}
Intersection: {2}
Difference: {0, 4}
Symmetric difference: {0, 3, 4, 5}

A common question arising in the minds of developers at this stage is what's the purpose of these four methods, if the same operations can be done using operators.

#### Purpose of the methods for set operations

In terms of the result, there is no difference: the methods and the operators perform the same operation on the given sets.

The difference lies in what they accept. The methods can take any iterable as an argument, whereas the operators only work when both operands are sets.

The reason for not allowing iterables to be used alongside the operators is to prevent ambiguous expressions such as the one shown below:

``set('123') & '345' # Python doesn't allow this!``
TypeError: unsupported operand type(s) for &: 'set' and 'str'

This expression looks fairly ambiguous — a set is being intersected with a string. One might think that the set `{'1', '2', '3'}` is being intersected with the set `{'345'}`.

Compare this to the expression:

``set('123').intersection('345')``

Here we can clearly see that both the strings are wrapped up in function calls, implying that they aren't directly being used in the intersection operation, but first being coerced into a set.

Visually, this expression looks much better than the previous one.

Preventing the operators from operating on any iterable basically prevents confusing results, such as the one we just saw above, and the one shown below:

``[1, 2] | '23' | ('4', '2') # Python doesn't allow this!``
TypeError: unsupported operand type(s) for |: 'list' and 'str'

Here we're trying to compute the union of a list, a string and a tuple — which sounds really weird!
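
The contrast can be verified with a short sketch. The methods accept a string, a list or a tuple (and more than one argument at a time), while the operator form raises a `TypeError` when the other operand is not a set. The variable names are only for illustration.

```python
digits = set('123')

# The string is treated as an iterable of its characters
common = digits.intersection('345')
print(common)                      # {'3'}

# Several iterables of different types can be passed at once
combined = {1, 2}.union([2, 3], (4, 2))
print(combined == {1, 2, 3, 4})    # True

# The operator refuses a non-set operand
try:
    digits & '345'
    operator_allowed = True
except TypeError:
    operator_allowed = False

print(operator_allowed)            # False
```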

## Perform operation and update

Sometimes, when a given set operation is performed on a set `s`, it's further desired that we update it with the result of the operation.

For instance, consider the example below:

``````a = {0, 2, 4}
b = {1, 2, 3}

a = a & b
print(a)``````
{2}

We compute the intersection of `a` with the set `b` and then assign the resulting set back to `a`. In other words, we update `a` with the intersection set.

This can be done on any given set very easily using the assignment syntax, shown above. However, Python provides methods out of the box to do so.

The methods `update()`, `intersection_update()`, `difference_update()` and `symmetric_difference_update()` all take in a set as argument, perform the respective operation on the calling set and the argument, and finally update the calling set to the result of the operation.

It's also possible to provide more than one set as argument to these methods.

Here's what each method does:

1. `a.update(b)` computes the union of `a` and `b` and updates `a` to the result.
2. `a.intersection_update(b)` computes the intersection of `a` and `b` and updates `a` to the result.
3. `a.difference_update(b)` computes the difference of `a` and `b` and updates `a` to the result.
4. `a.symmetric_difference_update(b)` computes the symmetric difference of `a` and `b` and updates `a` to the result.

Consider the following code:

``````evens = {0, 2, 4, 6, 8}
evens.update({2, 4, 8, 12, 16})

print(evens)``````
{0, 2, 4, 6, 8, 12, 16}

The set `evens` is updated using the set `{2, 4, 8, 12, 16}`. Obviously, since `evens` already contains `2`, `4` and `8`, these won't (and technically can't) be added again to the set, as duplicates. It's only `12` and `16` that get added to the set.

Shown below is another example, using `intersection_update()`:

``````a = {0, 2, 4}
b = {1, 2, 3}

a.intersection_update(b)
print(a)``````
{2}

The intersection of the sets `{0, 2, 4}` and `{1, 2, 3}` is computed and the set `a` is updated in place to hold the result.

All these methods return `None` — they perform the set operation and then update the calling set as a side effect.
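
Because these methods return `None`, assigning their return value back to the set is a common slip. The sketch below shows the return value, and also the augmented-assignment operators (`|=`, `&=`, `-=`, `^=`), which do the same in-place work when both operands are sets.

```python
a = {0, 2, 4}
b = {1, 2, 3}

result = a.intersection_update(b)
print(result)    # None: the method only has a side effect
print(a)         # {2}

c = {0, 2, 4}
c |= b           # operator form of c.update(b)
print(c == {0, 1, 2, 3, 4})   # True

d = {0, 2, 4}
d -= b           # operator form of d.difference_update(b)
print(d == {0, 4})            # True
```

Writing `a = a.update(b)` would therefore replace the set with `None`, which is almost never what is intended.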

## Clearing everything

The `clear()` method serves the same purpose on sets, as it does on lists — clearing everything from them.

It can be handy when we want to erase all the contents of an existing set, without deleting the set itself.

Shown below is an example:

``````evens = {0, 2, 4, 6, 8}
print(evens)

evens.clear()
print(evens)``````
{0, 2, 4, 6, 8}
set()
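
One way to see what "without deleting the set itself" means is to compare `clear()` with assigning a fresh empty set to the variable. In the sketch below, `alias` is a second name (made up for this example) that refers to the same set object.

```python
evens = {0, 2, 4}
alias = evens

evens.clear()              # empties the one set object that both names share
print(alias)               # set()
cleared_is_same = alias is evens

evens = {0, 2, 4}
alias = evens
evens = set()              # rebinds the name only; the old set is untouched
print(alias == {0, 2, 4})  # True
rebound_is_same = alias is evens

print(cleared_is_same, rebound_is_same)  # True False
```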

## Copying a set

Since a set is a mutable data type, assigning a variable that holds a set to another variable creates two variables pointing to the same set in memory.

This means that we can't work on the two variables independently: a change made through one shows up through the other. This issue exists for all other mutable types as well, most commonly for the `list` and `dict` classes.
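
The problem can be reproduced in a few lines. Here `same_set` is a hypothetical second variable that is assigned the set directly, without copying:

```python
evens = {0, 2, 4}
same_set = evens          # no copy is made; both names refer to one set

same_set.add(6)           # change made through the second name...

shared = same_set is evens
print(shared)                   # True
print(evens == {0, 2, 4, 6})    # True: ...is visible through the first name
```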

Fortunately, it can be easily avoided by copying the set. And to copy a set, we can use the `copy()` method.

Consider the code below:

``````evens = {0, 2, 4, 6, 8}
evens_copy = evens.copy()

evens.update({10}) # change evens

print(evens)
print(evens_copy)``````
{0, 2, 4, 6, 8, 10}
{0, 2, 4, 6, 8}

We first create a set `evens` and then make its copy and put that in `evens_copy`. Then we update the set `evens` to see whether or not the changes show up in `evens_copy`.

Since `evens_copy` is a copy of `evens` (not pointing to it), it remains unchanged, as can be confirmed by the second line of output.

Note that the `copy()` method returns a shallow copy of a set.

The `set` class has no method of its own for making a deep copy. One is rarely needed: set elements must be hashable, which in practice usually means immutable values such as numbers, strings and tuples, so it makes no difference whether a copied set holds the original elements or copies of them.
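
For the rare case where a set holds objects that carry mutable state, the general-purpose `copy.deepcopy()` function from the standard library can be used instead of `copy()`. The sketch below is only an illustration: `Tag` is a hypothetical class invented for this example, relying on the fact that instances of a plain class are hashable by identity.

```python
import copy


class Tag:
    """Hypothetical element type: hashable by identity, with a mutable attribute."""

    def __init__(self, name):
        self.name = name


tag = Tag('old')
tags = {tag}

shallow = tags.copy()         # new set, same Tag object inside
deep = copy.deepcopy(tags)    # new set holding a copy of the Tag object

tag.name = 'new'              # mutate the original element

shallow_name = next(iter(shallow)).name
deep_name = next(iter(deep)).name

print(shallow_name)   # new: the shallow copy shares the element
print(deep_name)      # old: the deep copy has its own element
```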

"I created Codeguage to save you from falling into the same learning conundrums that I fell into."

— Bilal Adnan, Founder of Codeguage