Introduction

Every programming language on this planet enables us to deal with data. That data can be one of several types, such as an integer, a string, an object and so on.

Different languages segment data differently - some have many categories of data, whereas others offer only a few, with further subcategories within them.

Python has its own data type system, as we shall see in detail in this chapter.

Primitives vs. objects

Before we start the discussion on Python's data types, it's worthwhile to understand two commonly used terms in programming when discussing data types of a language: primitives and objects.

Any data type that is implemented in a language without any sort of information bound to it is known as a primitive.

An object is the exact opposite of this - it has information bound to it. Let's understand this using a very simple example.

In Java, we can create an integer using the int keyword followed by the same assignment pattern used in Python, as shown below:

int x = 10;

The variable x here is considered a primitive. It has no properties or methods available on it - it is just a pure number present in memory. The integer data type in Java is a primitive data type.

Compare this to Java's string data type which is not primitive:

String s = "Hello World!";

In the snippet above, the variable s has many properties and methods available on it.

For example, s.length() returns the number of characters in s, s.toLowerCase() converts s into all lowercase characters, and so on. length() and toLowerCase() here are part of the information bound to strings in Java.

The variable s does not hold a string directly - rather it holds an object which has some attribute pointing to the string data "Hello World!" in memory and some attribute pointing to information and functionality for that string data, like the length() and toLowerCase() methods.

If you don't understand any of these details now, don't worry - as you learn programming in general, the concept of primitives and objects will come naturally to you.

If you are really curious to understand this quickly, then head over to our JavaScript course - in the first six chapters you'll learn not only what primitives and objects are, but also one of the most popular languages out there - JavaScript!

Coming back to the topic, we now know that a primitive data type has no information attached to it, whereas an object data type does - for instance, the string data type in Java has methods attached to it, such as length(), toLowerCase() and so on.

Talking about Python, it has no primitive data type:

Everything in Python is an object.

Let's explore this in detail...

Everything is an object

Python is an object-oriented language where everything is an object. Let's first understand what exactly an object is.

Think of the real world objects around you, such as a computer - it has characteristics like color, size, weight, price and so on, and some behaviour as well - it can be powered on, shut down and so on.

Let's take another example: a toaster. It also has properties - color, size, weight, wattage, brand name and even behaviour like toasting bread.

This concept of an object is exactly what the term 'object' in Python and in all OOP languages refers to:

An object is an entity with properties and/or some behaviour.

This simply means that everything in Python has properties and/or behaviour attached to it.

But how do we confirm this fact?

There's a simple, yet clever way to do this.

In Python, passing a given value to the dir() function returns all the information bound to the value, in the form of a list.

Although it's too early to completely understand the concept of a function or a list, it won't take long to grasp the basics of these concepts.

A function is a block of code that can be executed by calling the function. A function is called by writing the name of the function followed by a pair of () parentheses.

Here's how we would call the dir() function on an integer 10 in Python:

dir(10)

First comes the name dir followed by a pair of () parentheses. Inside these parentheses goes the integer 10. The integer 10 here is called an argument to the function dir().

An argument is data that we provide to a function to let it do its work.

Let's see what dir(10) returns:

dir(10)
['__abs__', '__add__', '__and__', '__bool__', '__ceil__', '__class__', '__delattr__', '__dir__', '__divmod__', '__doc__', '__eq__', '__float__', '__floor__', '__floordiv__', '__format__', '__ge__', '__getattribute__', '__getnewargs__', '__gt__', '__hash__', '__index__', '__init__', '__init_subclass__', '__int__', '__invert__', '__le__', '__lshift__', '__lt__', '__mod__', '__mul__', '__ne__', '__neg__', '__new__', '__or__', '__pos__', '__pow__', '__radd__', '__rand__', '__rdivmod__', '__reduce__', '__reduce_ex__', '__repr__', '__rfloordiv__', '__rlshift__', '__rmod__', '__rmul__', '__ror__', '__round__', '__rpow__', '__rrshift__', '__rshift__', '__rsub__', '__rtruediv__', '__rxor__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__truediv__', '__trunc__', '__xor__', 'bit_length', 'conjugate', 'denominator', 'from_bytes', 'imag', 'numerator', 'real', 'to_bytes']

As you can see, it returns a huge list of information. All this information is attached to every integer in Python.

Let's use one of these pieces of information:

10 .__add__(5)
15

__add__() is referred to as a method of the integer 10. Its purpose is apparent in its name - it adds the main integer to a provided value. In this case it adds 10 to 5 to yield 15.

In this way, every name shown above can be accessed on the integer 10, and on all integers in Python.

This confirms that integers are indeed objects, since they have information attached to them.
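As a small sketch of the idea above, the snippet below calls __add__() directly and compares it with the usual + operator, and then tries one of the plainly named entries from the dir(10) list, bit_length(). The variable names are only illustrative.

```python
# Calling the method directly and using the + operator give the same result.
via_method = (10).__add__(5)   # parentheses are an alternative to writing "10 ."
via_operator = 10 + 5

print(via_method)    # 15
print(via_operator)  # 15

# Another entry from the dir(10) list: the number of bits needed to store 10.
bits = (10).bit_length()  # 10 is 1010 in binary
print(bits)          # 4
```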

Everything in Python can be inspected using dir() and what we get in return is always a list of some information. This confirms the bigger picture - everything is, in effect, an object.
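The following is a rough sketch of that check, run over a handful of different values. The list named samples is just a hypothetical collection chosen for illustration, and isinstance() is used here as one possible way to ask whether a value is an object.

```python
# A few values of different kinds, including a built-in function and None.
samples = [10, 4.0, "Hello", True, [1, 3, 5], None, len]

for value in samples:
    names = dir(value)
    # Print the type's name, how many names are attached, and the object check.
    print(type(value).__name__, len(names), isinstance(value, object))

# True only if every value in the list is an object.
all_objects = all(isinstance(value, object) for value in samples)
print(all_objects)  # True
```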

Note that this type model is not used in every programming language.

For instance, in Java, some data types such as integers, floats and Booleans are primitives, i.e. they are not objects, and so have no methods or properties available on them.

In terms of memory, this object type model adds overhead, since extra information has to be carried around with every value. Despite this, it allows for quick and flexible programming, which outweighs the weakness in many applications.

Programming day-to-day applications using a language that treats every data type as an object, such as Python, won't cause any noticeable performance problems! It's only in memory-intensive applications, such as 3D games, that working with such languages becomes a concern.

If you couldn't understand all this, it's no problem. All this will become clear with time as you learn Python and programming in general.

So what we've learnt so far is that everything in Python is of type object.

However, it's paramount to realise that not everything is the same type of object.

Integers are a different type of object as compared to floats. Strings are another type of object and so are Booleans. Everything is definitely an object, but a different kind of an object.

Take the example of your house - everything within it can be thought of as an object, but not everything is the same kind of object. You have chairs, tables, lights, fans, and so on!

From this point onwards, we'll be referring to these individual types simply as data types, and not as object types, provided that you keep in mind that every data type in Python is, in effect, an object type!

Integers

Integers are whole numbers, without a decimal point.

Examples include -2, -1, 0, 1, 2 and so on.

Even if a number is technically a whole number but has a decimal point in it, it is not classified as an integer. Rather, it's classified as a float, as we shall see in the next section.

For instance, 4.0 is technically a whole number as its fractional part is equal to zero. Nonetheless, Python recognises this as a float, not as an integer!

But how do we know which value is considered an integer and which one is considered a float?

Well one way is to use the type() function.

It works as follows: we provide it the value whose type we want to know as an argument, similar to passing a value to the dir() or print() functions. The function returns the object type of the value, in a special notation.

As we shall see later on in this course, what type() actually returns is technically the class of the given value.

Let's inspect the type of the numbers 4 and 4.0 in the shell:

type(4)
<class 'int'>
type(4.0)
<class 'float'>

As can be confirmed from the snippet above, 4 is an integer since type(4) returns <class 'int'>. Here int refers to an integer.

Along the same lines, 4.0 is not an integer, since type(4.0) returns <class 'float'>.
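Here is a minimal sketch contrasting the value of a number with its type. It uses the == equality operator, which returns True when two values are equal. The variable names are hypothetical.

```python
whole = 4
with_point = 4.0

print(type(whole))       # <class 'int'>
print(type(with_point))  # <class 'float'>

# The two values compare as equal, even though their types differ.
same_value = whole == with_point
same_type = type(whole) == type(with_point)
print(same_value)  # True
print(same_type)   # False
```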

Moving on, unlike many languages, Python sets no specific limit to the size of integers - they can be as large as one desires, but obviously within the limits of the machine being used.

You can't store something like 100 ** 100 ** 100 (100 raised to the power of 100 raised to the power of 100) on a machine with a limited amount of memory!

Below we multiply two large numbers together to obtain an even larger number, yet capable of being processed by Python:

x = 1984548948495055640
y = 400379004593964645645405

z = x * y
print(z) # 794571732566449589026609381571299185334200

If you think this is big enough, consider the following code, where we generate a number spanning close to 5 lines!

z = 5 ** 500
print(z)
30549363634996046820519793932136176997894027405723266638936139092812916265247204577018572351080152282568751526935904671553178534278042839697351331142009178896307244205337728522220355888195318837008165086679301794879136633899370525163649789227021200352450820912190874482021196014946372110934030798550767828365183620409339937395998276770114898681640625

Having no sort of limit on the size of integers is one of the many reasons developers prefer Python in coding competitions (where numbers can easily go out of control!) and some number-intensive applications.
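As a quick sketch of how such a large integer can be inspected, the code below counts the digits of 5 ** 500 by converting it to a string first. It also checks that arithmetic on the number stays exact. The name digit_count is only illustrative.

```python
z = 5 ** 500

# str() turns the integer into text so that len() can count its digits.
digit_count = len(str(z))
print(digit_count)  # 350

# Integer arithmetic stays exact: dividing by 5 gives back 5 ** 499.
print(z // 5 == 5 ** 499)  # True
```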

Floats

The second classification of numbers in Python is that of floats.

Floats, or floating-point numbers, are numbers with a decimal point.

Examples include -5.1, -0.7, 0.0, 3.89, 10.001.

x = 0.5

Floats in Python are based on the IEEE-754 double-precision floating-point format, the same format used in JavaScript for all numbers, and in Java for the double data type.

In this format, each floating-point number is represented using 8 bytes of memory.

Python floats aren't 8 bytes large!

Remember that in Python, a floating-point number won't be 8 bytes large if you inspect it. Rather, it will be larger than that. Why?

Simply because of Python's everything-is-an-object type system. Floats are also objects with attached information, and storing this information requires memory. This memory, along with the 8 bytes for storing the actual floating-point number (in the IEEE-754 format), adds up to something definitely greater than 8 bytes!
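One way to observe this, assuming the standard CPython interpreter, is the getsizeof() function from the sys module. It reports the size of the whole object in bytes. The exact number it prints depends on the implementation and platform, so treat the output as indicative only.

```python
import sys

x = 0.5

# getsizeof() measures the whole float object, not just the
# 8 bytes of IEEE-754 data stored inside it.
size_in_bytes = sys.getsizeof(x)
print(size_in_bytes)      # exact value varies by implementation
print(size_in_bytes > 8)  # True
```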

Let's inspect the type of floating-point numbers in the shell:

type(0.3)
<class 'float'>

As we saw before, type() called on floats returns <class 'float'>, which is the class representing all the floats in Python. We'll see more details of this class in the Python Number Basics chapter.

Strings

In programming, a string is a sequence of textual characters. The number of characters in a string is known as its length.

In Python, a string can be created in one of the following ways:

  1. Using a pair of ' single quotes
  2. Using a pair of " double quotes
  3. Using a pair of ''' triple single quotes
  4. Using a pair of """ triple double quotes.
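The four options listed above can be sketched as follows. All four produce the same string value. The triple-quoted forms can additionally span several lines, which is shown with the hypothetical variable multi.

```python
a = 'Hello World!'
b = "Hello World!"
c = '''Hello World!'''
d = """Hello World!"""

# All four quoting styles produce the same string value.
print(a == b == c == d)  # True

# Triple-quoted strings may also span several lines.
multi = """first line
second line"""
print(multi)
```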

Consider the example below:

s = "Hello World!"

Here, s is a string of length 12, since it has a total of 12 characters.

Spaces are also characters and are, therefore, counted in the length of a string.

The len() function is used to figure out the length of a given string. It takes a string and returns its length, as an integer.

Below we call len() on the string s:

len(s)
12

Python strings are sequences of Unicode characters. When a string is encoded as UTF-8, which is Python's default encoding, each character takes a minimum of 8 bits (1 byte).
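To see the difference between the number of characters in a string and the number of bytes it takes up when encoded as UTF-8, consider this small sketch. The encode() method converts a string into bytes. The string accented is a hypothetical example containing one non-ASCII character.

```python
s = "Hello World!"
print(len(s))                  # 12 characters, including the space and "!"
print(len(s.encode("utf-8")))  # 12 bytes, as each of these characters needs 1 byte

accented = "caf\u00e9"  # the word "café", with é written as an escape
print(len(accented))                  # 4 characters
print(len(accented.encode("utf-8")))  # 5 bytes, as é needs 2 bytes in UTF-8
```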

Moving on, let's finally explore what's returned when type() is called with a string:

type(s)
<class 'str'>

<class 'str'> is returned, since all strings in Python belong to the class str.

Booleans

One of the most useful concepts in computer programming is that of conditional execution. Conditional execution is when a piece of code is executed only if a given condition is met.

At the heart of this concept sit Booleans, which are simply true or false values.

In Python, the two Boolean values are True and False.

Let's create two Boolean variables:

is_raining = True
user_authorised = False

Both True and False are reserved keywords in the language!

In some languages, like JavaScript and PHP, Boolean values are written as true and false. In Python, these values are capitalised, and it's necessary to capitalise them if you wish to use them.

At least for now, you won't find Booleans very useful. It's only once we get the hang of control-flow structures like while, for, if etc. that the significance of Booleans will become apparent.
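As a brief preview, and only as a sketch, the code below shows where Booleans come from in practice (comparisons) and how an if statement uses one to decide which code runs. The names is_adult and advice are hypothetical.

```python
is_raining = True
user_authorised = False

print(type(is_raining))  # <class 'bool'>

# Comparisons produce Boolean values.
is_adult = 21 >= 18
print(is_adult)  # True

# A Boolean decides which indented block of an if statement runs.
if is_raining:
    advice = "Take an umbrella"
else:
    advice = "No umbrella needed"
print(advice)  # Take an umbrella
```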

Lists

Lists in Python are an extremely useful data type. They represent a sequence of values that can be of any type.

Consider how lists work in real life - we have items one after another in an ordered manner. This is just how lists work in Python.

To create a list, we start by writing a pair of square brackets []. Inside this pair we put the items of the list, also known as the elements of the list. Each new item is separated from the previous one using a , comma.

Below is a simple example:

odds = [1, 3, 5]

The variable odds is a list of three elements, all integers (and odd numbers).

Each item in a list is at a specific position. This position is formally referred to as an index.

The first element is at index 0, the second is at index 1, the third is at index 2 and so on.

To access a given element of a list, we use its index.

First comes the name of the list, followed by a pair of [] square brackets and then, within these brackets, the index of the element we wish to retrieve.

Let's access the first and third elements of the list odds:

odds[0]
1
odds[2]
5

The first element is at index 0 and so we write odds[0] to access it. The same goes for the third element.

List indexes can only be integers, nothing else - not even floats!
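The rule about indexes can be checked with a short sketch. The try and except keywords used below catch the error that Python raises, so that the program can carry on. They are covered properly much later and are used here only for illustration.

```python
odds = [1, 3, 5]

print(len(odds))  # 3
print(odds[1])    # 3, the second element

# A float index is rejected, even when it is a whole number like 1.0.
try:
    odds[1.0]
except TypeError as error:
    float_index_failed = True
    print("TypeError:", error)
else:
    float_index_failed = False
```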

We'll learn more about lists including the syntax of creating a list, the concept of list comprehensions, dimensions of a list, how to loop over a given list, sorting lists, and much much more in the Python Lists unit.

Tuples

In mathematics, a tuple is simply an ordered collection of numbers denoted using a pair of () parentheses. The following are examples of tuples.

(1, 2), (0, 1, 2), (1.2, 3.7)

Lists aren't the only way to store sequences of data in Python - it provides another data type to serve this purpose and that is tuples.

Tuples behave much like lists, except for the fact that they are immutable, i.e. we can't change a tuple's contents once it has been defined.

Creating a tuple in Python follows the same syntax as creating a tuple in mathematics - write a pair of () parentheses and then within these parentheses, put the individual items of the tuple, separated by a , comma.

Below we create a tuple holding the first 3 odd numbers:

odds_tuple = (1, 3, 5)

To access items in a tuple, we use the same index logic as we did in the case of lists, since tuples are also sequences.

odds_tuple[0]
1
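The difference between lists and tuples when it comes to changing their items can be sketched as follows. The try and except keywords are used here only to catch the error that Python raises, and odds_list is a hypothetical list added for comparison.

```python
odds_list = [1, 3, 5]
odds_tuple = (1, 3, 5)

# A list element can be replaced in place.
odds_list[0] = 7
print(odds_list)  # [7, 3, 5]

# The same assignment on a tuple raises a TypeError.
try:
    odds_tuple[0] = 7
except TypeError as error:
    tuple_changed = False
    print("TypeError:", error)
else:
    tuple_changed = True

print(odds_tuple)  # (1, 3, 5)
```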

Sets

A great deal of mathematics utilises the concept of set theory. Sets are unordered collections of data that usually meet a given property (although this is not necessary).

In Python, the set data type is based directly on sets in mathematics. It's denoted in the same way and works in the same way!

To create a set, we start with a pair of {} curly braces. Inside these, we put the elements of the set, separated from one another using the same old , comma character.

Below we create a set s holding the first 5 non-negative even numbers:

s = {0, 2, 4, 6, 8}

Remember that a set is unordered in nature, which means that we can't just access any of its elements using an index. There is no concept of indexes in sets!

Being unordered in nature also means that the two sets {0, 1} and {1, 0} are equal to one another. Let's compare them in the shell:

{0, 1} == {1, 0}
True

The == double equals sign here denotes the equality operator.

The equality operator compares two values and returns True if they are equal to one another, or False otherwise.

In the snippet above, True was returned by the given equality operation, which confirms the fact that Python considers {0, 1} and {1, 0} as identical sets.

In fact, any two sets that hold the same elements, in whatever order, are considered equal to one another.
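A few of these properties can be tried out in a short sketch. It goes slightly beyond the text above in two places: it shows that repeated elements are stored only once, and it uses the in operator to test membership, as an alternative to the indexing that sets don't support.

```python
s = {0, 2, 4, 6, 8}

# Repeated elements are stored only once, so this set has 3 elements.
with_repeats = {0, 2, 2, 4, 4}
print(len(with_repeats))  # 3

# The order in which elements are written does not matter for equality.
print({0, 2, 4} == {4, 2, 0})  # True

# Membership is tested with the in operator rather than an index.
print(4 in s)  # True
print(5 in s)  # False

# Indexing a set raises a TypeError.
try:
    s[0]
except TypeError:
    set_is_indexable = False
else:
    set_is_indexable = True
print(set_is_indexable)  # False
```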

In the Python Sets unit, we'll explore how to perform set operations on Python sets. These include intersection, union, difference, symmetric difference; checking whether a set is a subset or superset of another set; and much more.

Dictionaries

If you want to store labeled information of a given object in one place, then a dictionary is your way to go.

A dictionary is a collection of key-value pairs (since Python 3.7, it preserves the order in which the pairs were inserted). A key is usually a characteristic of the object the dictionary represents and a value is its corresponding value.

Creating a dictionary is superbly easy...

Start with a pair of {} curly braces and then inside these, put the key-value pairs separated by a , comma. A key-value pair is formed as follows: write the key, followed by a : colon, and finally write the value that belongs to this key.

Dictionary keys can be any hashable value, such as strings, integers, or tuples. However, in most cases they are strings.

The general syntax of a dictionary can be represented as:

{key1: value1, key2: value2, ....}
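As a sketch of this syntax with keys of different types, consider the hypothetical dictionary mixed below. The last part shows that a list, which is mutable, is not accepted as a key. The try and except keywords are used only to catch the resulting error.

```python
# Keys of three different types in one dictionary.
mixed = {
    "name": "Eggs",       # string key
    12: "a dozen",        # integer key
    (0, 1): "grid cell",  # tuple key
}
print(mixed[(0, 1)])  # grid cell

# A list is mutable and therefore cannot be used as a key.
try:
    bad = {[0, 1]: "grid cell"}
except TypeError as error:
    list_key_allowed = False
    print("TypeError:", error)
else:
    list_key_allowed = True
```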

Consider the code below:

item = {'category': 'Dairy', 'name': 'Eggs', 'price': 1.2}

Notice how the dictionary item here models a real world item in a grocery store, i.e. a box of eggs. The keys represent properties of the item, such as its category and its price, whereas the values represent their corresponding values.

Dictionaries in Python are made for this purpose - they can encapsulate labeled data of a given item.
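Once such a dictionary exists, its values can be read and changed using their keys. The following is a minimal sketch. The key 'in_stock' is a hypothetical addition made for illustration.

```python
item = {'category': 'Dairy', 'name': 'Eggs', 'price': 1.2}

# A value is looked up by writing its key inside square brackets.
print(item['name'])   # Eggs
print(item['price'])  # 1.2

# Assigning to a key updates its value, or adds the key if it is new.
item['price'] = 1.5
item['in_stock'] = True
print(len(item))  # 4
```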

However, you don't have to use a dictionary for only this purpose - you can use it for other cases as well.

One is highlighted below:

students = {'maths': 60, 'chemistry': 56, 'physics': 31}

The dictionary students here shows how many students are enrolled in each subject offered at an institute.

Notice that the dictionary does not denote a real world item here whose properties are 'maths', 'physics' or 'chemistry'. Rather, it's just a convenient name for us to denote how many students are enrolled in a particular subject.
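A dictionary used in this way can be queried just like any other. As a small sketch, the code below looks up one subject and then adds up all the enrolment numbers using the values() method and the sum() function. The name total_enrolments is only illustrative.

```python
students = {'maths': 60, 'chemistry': 56, 'physics': 31}

# Look up the number of students enrolled in one subject.
print(students['physics'])  # 31

# values() gives the enrolment numbers; sum() adds them up.
total_enrolments = sum(students.values())
print(total_enrolments)  # 147
```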

We'll learn more about dictionaries in the Python Dictionaries unit.

More types

The list of data types in Python doesn't end here. All the ones that we've mentioned above are pretty basic, which is why they are covered in this chapter.

There are a decent number of other data types, such as classes, modules, functions, bytearrays etc., left to be discovered in the later part of this course.

For now, getting the hang of these elementary data types is important so that you become more fluent in working with Python and, as a result, more confident with the concepts you'll learn in the coming chapters.