PHP Numbers Basics

Chapter 13 35 mins

Learning outcomes:

  1. A quick recap of numbers in PHP
  2. Internal representation of integers
  3. Internal representation of floats
  4. Using underscores (_) in number literals
  5. The E notation
  6. Special numbers — INF and NAN
  7. Integer division and the intdiv() function
  8. Type conversion functions intval() and floatval()
  9. Checking if a float is an integer

Introduction

In the PHP Foundation unit, we covered a lot about PHP including some of the very basic aspects of working with numbers. We saw the two types of numbers in the language i.e. integers and floats, in addition to ideas related to each of these types.

In this chapter, we aim to take all that knowledge one step further and digest even more ideas in the world of numbers in PHP. In particular, we'll learn how integers and floats are represented internally in memory, how to work with the special float values INF and NAN, and how to use more number-related functions such as intval(), floatval(), intdiv() and so on.

We'll also explore the scientific notation used to denote a float in PHP, in addition to the underscore separator (added in PHP 7.4) that improves the readability of long numeric literals.

In short, this chapter will take us way ahead in fluency while working with numbers in PHP, which is a must-have skill for every computer programmer.

So let's begin.

A quick recap

Let's start with a quick recap of what we already know about numbers in PHP.

There are two kinds of numbers in PHP — integers and floats. The distinction between these is fairly easy to understand — one doesn't have a fractional part whereas the other does.

A numeric literal that doesn't have a decimal point (.) in it is considered to be an integer. However, a literal with a decimal point is considered a float.

Some examples of integers are -1000, -29, 0, 2 and 50000 while some examples of floats are -102.3, -2.323423, 0.0, 2.0000023 and -234234.99090.

To convert a given value to an integer, we can use the (int) typecast as shown below:

<?php

$a = '10';

var_dump($a);
var_dump((int) $a);
string(2) "10"
int(10)

Similarly, to convert a given value to a float, we can use the (float) typecast, as shown below:

<?php

$a = '10.1';

var_dump($a);
var_dump((float) $a);
string(4) "10.1"
float(10.1)

Both integers and floats support all of the most common arithmetic operations such as addition, subtraction, multiplication, division, exponentiation, and modulo.
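
As a brief refresher, the sketch below applies each of these operators to a pair of integers. The variable names are only illustrative. Note that / returns a float whenever the division isn't exact, whereas the other operators return an integer for these operands.

```php
<?php

$a = 7;
$b = 2;

$sum        = $a + $b;   // int(9)
$difference = $a - $b;   // int(5)
$product    = $a * $b;   // int(14)
$quotient   = $a / $b;   // float(3.5)
$power      = $a ** $b;  // int(49)
$remainder  = $a % $b;   // int(1)

var_dump($sum, $difference, $product, $quotient, $power, $remainder);
```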

Finally, to check whether a given value is an integer, we use the is_int() function and similarly to check if the value is a float, we use the is_float() function.

In the snippet below, we perform the is_int() check on a handful of values:

<?php

var_dump(is_int(10));
var_dump(is_int(10.0));
var_dump(is_int('10'));
var_dump(is_int(true));
bool(true)
bool(false)
bool(false)
bool(false)

And in the snippet below, we perform the is_float() check on the same values:

<?php

var_dump(is_float(10));
var_dump(is_float(10.0));
var_dump(is_float('10'));
var_dump(is_float(true));
bool(false)
bool(true)
bool(false)
bool(false)

And this is it for the recap.

Internal representation of integers

PHP is a sufficiently high-level language that it provides only a single integer type, unlike languages such as C and C++ where we have a multitude of integer types, each with a different range.

An integer in PHP consumes 4 bytes of memory on a 32-bit machine and 8 bytes on a 64-bit machine. Moreover, all integers in PHP are signed. The format used internally is the typical format used across modern languages to implement signed integers, i.e. 2's complement.

Here's how an integer in PHP would look in memory:

Internal representation of 64-bit integers in 2's complement

The leftmost bit has a negative weight (::-2^{63}:: on a 64-bit machine), which is why it effectively denotes the sign of the integer; there is no bias involved. Every other bit has a positive power of 2 associated with it, as shown above. The final integer is determined by adding the weights of all the bits that are 1.

Hence, 37 is represented as follows:

Representation of 37 in memory, in 2's complement

Similarly, -128 is represented as follows. Keep in mind that the leftmost 1 bit doesn't just represent the - sign; it carries the value -9223372036854775808, to which a positive value (here 9223372036854775680) is added to make the end result -128.

Representation of -128 in memory, in 2's complement

Consequently, under this format, the minimum and maximum integers are -2147483648 and 2147483647, respectively, on a 32-bit machine, and -9223372036854775808 and 9223372036854775807, respectively, on a 64-bit machine.
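
The weighting scheme described above can be checked with a small sketch. The helper bits_to_int() is hypothetical (it isn't part of PHP), and the code assumes a 64-bit build where PHP_INT_MIN equals -2^63. It rebuilds an integer from its bit string by giving the leftmost bit a negative weight and every other bit a positive power of 2.

```php
<?php

// Hypothetical helper: rebuilds an integer from its 64-character bit string.
// Assumes a 64-bit build of PHP (PHP_INT_SIZE === 8).
function bits_to_int(string $bits): int
{
    $total = 0;
    $width = strlen($bits);

    for ($i = 0; $i < $width; $i++) {
        if ($bits[$i] !== '1') {
            continue;
        }
        $position = $width - 1 - $i; // 0 is the rightmost bit

        // The leftmost bit (position 63) weighs -2^63, i.e. PHP_INT_MIN;
        // every other bit weighs +2^position.
        $total += ($position === 63) ? PHP_INT_MIN : (1 << $position);
    }

    return $total;
}

// decbin() returns the 2's complement bits; pad positives to 64 characters.
$bits37       = str_pad(decbin(37), 64, '0', STR_PAD_LEFT);
$bitsMinus128 = decbin(-128);

echo $bits37, ' => ', bits_to_int($bits37), "\n";
echo $bitsMinus128, ' => ', bits_to_int($bitsMinus128), "\n";
```

The second line printed shows a long run of 1s ending in 10000000, and the helper maps it back to -128.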

The constants PHP_INT_MIN and PHP_INT_MAX hold the minimum and maximum integers on the current machine:

<?php

echo 'Min: ', PHP_INT_MIN, "\n";
echo 'Max: ', PHP_INT_MAX, "\n";

Our machine is 64-bit, hence we get the following:

Min: -9223372036854775808
Max: 9223372036854775807

So this is a little glimpse into the internals of integers in PHP.

Frankly speaking, we don't need to know about this representation as far as working in the language is concerned. It's just an internal detail that's helpful and useful as side knowledge.

Internal representation of floats

Floats in PHP are based on the IEEE-754 double-precision floating-point format. This is a fairly standard format used across almost all the mainstream languages to denote floating-point numbers.

In this format, each float consumes 8 bytes of memory. These 8 bytes, or 64 bits, are segmented into three groups, each denoting a certain aspect of the float. Before we can make sense of this segmentation, we ought to understand scientific notation.

Scientific notation, also known as standard form or exponential notation, is a standard way to represent very small or very large numbers neatly in mathematics, physics, chemistry, biology and other scientific disciplines.

It is comprised of three parts — a sign, followed by a significand, followed by a power of 10.

The significand is typically written with one digit before the decimal point (if there is a decimal point). The base of the power is usually ::10:: in our calculations.

So for example, ::105.6:: can be expressed as ::1.056 \times 10^2::. Similarly, ::-30.37:: would be expressed as ::-3.037 \times 10^1:: in scientific notation.

This is scientific notation in the decimal number system. In the realms of computers, however, we need a binary system. Fortunately, it's also very easy to devise a binary system for scientific notation. Instead of raising ::10:: to the given exponent, we raise the integer ::2::. Moreover, each digit in the significand represents a power of ::2::, not a multiple of a power of ::10::.

Hence, ::2.5:: would be represented as ::1.01 \times 2^1:: (i.e. ::10.1:: in binary, which is ::2 + 0.5::) in binary scientific notation. Similarly, ::-50.25:: would be represented as ::-1.1001001 \times 2^5:: (i.e. the binary number ::-110010.01::, which is what we get after multiplying by ::2^5::).
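
To make the arithmetic concrete, here is a small sketch that evaluates two binary significands digit by digit: 1.01 multiplied by 2^1 (the normalized form of 2.5) and 1.1001001 multiplied by 2^5. The variable names are purely illustrative.

```php
<?php

// 1.01 in binary: 1 + 0 * 2^-1 + 1 * 2^-2
$significandA = 1 + 2 ** -2;
$a = $significandA * 2 ** 1;

// 1.1001001 in binary: 1 + 2^-1 + 2^-4 + 2^-7
$significandB = 1 + 2 ** -1 + 2 ** -4 + 2 ** -7;
$b = -($significandB * 2 ** 5);

var_dump($a); // float(2.5)
var_dump($b); // float(-50.25)
```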

IEEE-754 uses this scientific notation to represent floats in memory. Each number is broken down into three segments — a sign, an exponent of ::2::, and a significand.

The sign is allotted 1 bit (the leftmost bit), the exponent is allotted the next 11 bits, and the last 52 bits are allotted to the significand. Now these are just part of the details of the format. If we dig a little deeper, there are many other things to consider, such as the representation of the special numbers INF and NAN, the exponent bias, the implicit leading digit of the significand, and much more.
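
For the curious, the 1/11/52 split can be inspected from within PHP. The sketch below is only an illustration: float_fields() is a hypothetical helper, it relies on the 'E' format of pack() (a big-endian double, available in recent PHP 7 and PHP 8 releases), and it uses the IEEE-754 exponent bias of 1023, a detail this chapter doesn't otherwise cover.

```php
<?php

// Hypothetical helper: splits a float into the three IEEE-754 bit fields.
function float_fields(float $f): array
{
    // 'E' packs a double in big-endian byte order, so the sign bit comes first.
    $bytes = pack('E', $f);

    $bits = '';
    foreach (str_split($bytes) as $byte) {
        $bits .= str_pad(decbin(ord($byte)), 8, '0', STR_PAD_LEFT);
    }

    return [
        'sign'        => substr($bits, 0, 1),  // 1 bit
        'exponent'    => substr($bits, 1, 11), // 11 bits
        'significand' => substr($bits, 12),    // 52 bits
    ];
}

$fields = float_fields(-50.25);
print_r($fields);

// The stored exponent includes a bias of 1023; subtracting it gives 5.
echo bindec($fields['exponent']) - 1023, "\n";
```

For -50.25, the sign bit is 1, the unbiased exponent is 5, and the significand field starts with 1001001, matching the binary form ::-1.1001001 \times 2^5::. The leading 1 before the point isn't stored.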

At least at this stage, we don't need all this in-depth information. Just a little knowledge would be more than enough in appreciating the level of mathematics that goes into building numeric systems on computing machines.

Anyways, moving on, the maximum number possible in this format is approximately ::1.8 \times 10^{308}:: while the minimum is ::-1.8 \times 10^{308}::.

The maximum float can be retrieved via the constant PHP_FLOAT_MAX, whereas the minimum can be retrieved by simply negating PHP_FLOAT_MAX, as shown below:

<?php

echo 'Min: ', -PHP_FLOAT_MAX, "\n";
echo 'Max: ', PHP_FLOAT_MAX;
Min: -1.7976931348623E+308
Max: 1.7976931348623E+308

Simple.

In addition to this, the smallest positive number that can be represented is somewhere close to ::4.9 \times 10^{-324}::.

And this is it for the internals of floating-point numbers in PHP.

Using underscores (_) in literals

As of PHP 7.4.0, underscores (_) can be used in a numeric literal to separate groups of digits from one another.

When parsing code, these underscores are removed before the next stage by the underlying engine, so they are just a syntactic sugar in the language.

Let's see a quick example. Suppose we want to denote the number ::1\,000\,560\,356:: in PHP.

In the code below, we represent this number in two ways — one without underscores and one with them:

<?php

$num = 1000560356;
$readable_num = 1_000_560_356;

Which one seems more readable? Clearly the second one.

As stated before, these underscores are removed when parsing the code. This can be confirmed by outputting the integers:

<?php

$num = 1000560356;
$readable_num = 1_000_560_356;

echo $num, "\n";
echo $readable_num;
1000560356
1000560356

As can be seen, both the output numbers are exactly equal to one another.
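
The separator isn't limited to decimal integers. As a hedged sketch based on the numeric literal separator feature of PHP 7.4, the snippet below uses it in float, hexadecimal and binary literals (the variable names are illustrative). It also shows that the separator belongs to literal syntax only: a numeric string containing an underscore is not parsed the same way.

```php
<?php

$float  = 1_000.000_5;  // same as 1000.0005
$hex    = 0xFF_FF;      // same as 0xFFFF, i.e. 65535
$binary = 0b1010_1010;  // same as 0b10101010, i.e. 170

var_dump($float === 1000.0005); // bool(true)
var_dump($hex === 65535);       // bool(true)
var_dump($binary === 170);      // bool(true)

// Strings are not literals: conversion stops at the first underscore.
$fromString = (int) '1_000';
var_dump($fromString); // int(1)
```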

Moving on, note that when using underscores, make sure not to leave an underscore at either end of the numeric literal. Both can lead to an error. Placing two underscores next to each other is a syntax error as well.

For instance, in the code below, we get a runtime error thrown because of the _ at the start of _1_000:

<?php

$num = _1_000;
echo $num;
PHP Fatal error: Uncaught Error: Undefined constant "_1_000" in <path>:3
Stack trace:
#0 {main}
  thrown in <path> on line 3

PHP interprets _1_000 as a constant as it begins with an underscore (_). And since it can't find such a constant, it throws an error.

The error above is not a syntax error.

Similarly, in the code below, adding two underscores after one another leads to a syntax error:

<?php

$num = 1__000;
echo $num;
PHP Parse error: syntax error, unexpected identifier "__000" in <path> on line 3

The E notation

In the discussion above regarding the internal representation of floats in PHP, we learnt about the scientific notation.

Directly representing floats in this way in PHP, and nearly all popular languages, is possible via the E notation.

The E symbol denotes the power of 10 by which a given number is multiplied.

Note that E could also be written as e i.e. it's case-insensitive.

The general syntax of E is as follows:

<number>E<exponent>

<number> is the number to multiply with a power of 10, whereas <exponent> is the number to which 10 is raised. The <exponent> could have a sign (+ or -) as well.

In other words, this is equivalent to <number> × 10^<exponent>.

Let's see how to use E to represent 156.2 in PHP:

<?php

// Represent 156.2 in E notation
$num = 1.562E+2;

echo $num;
156.2

The significand here is 1.562, while the exponent is +2.

Using the + sign in the exponent improves the readability of the literal. You can verify this yourself: which of the following is more readable, 1.562E2 or 1.562E+2?

Let's try using a negative exponent:

<?php

// Represent 0.0356 in E notation
$num = 3.56E-2;

echo $num;
0.0356

Amazing!
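
A few further properties of the E notation can be verified directly. The sketch below (variable names are illustrative) shows that a literal written with E is always a float, even when its value is integral, and that the case of E and the optional + sign make no difference to the value.

```php
<?php

$thousand = 1E3;

var_dump($thousand);          // float(1000)
var_dump(is_int($thousand));  // bool(false)
var_dump($thousand == 1000);  // bool(true), equal in value
var_dump($thousand === 1000); // bool(false), float vs. integer

// E and e are interchangeable, and the + sign is optional.
$sameValue = (1.562E+2 === 1.562e2);
var_dump($sameValue);         // bool(true)
```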

As we can see in both the output snippets above, PHP expands the exponential notation by multiplying the given significand with the given power and then printing the resulting number.

However, this only holds up to a limit, beyond which PHP simply outputs the number in E notation.

An example follows:

<?php

echo 18.3E+300;

Here the number 18.3E+300 is too large for PHP to expand when printing. Hence, it falls back to printing the number in exponential notation.

1.83E+301

Also notice how PHP automatically normalizes the significand by shifting the decimal point by one position to the left (i.e. 1.83E+301 instead of 18.3E+300) to resemble the typical scientific notation in mathematics.

Special numbers

Following from the IEEE-754 format that PHP uses to represent floats internally, there are two special floating-point numbers in the language — INF and NAN.

Both these numbers are available as global constants.

Let's explore them one-by-one...

INF

We'll start with INF.

The special number INF in PHP is used to represent infinity i.e. something beyond calculation.

Creating a float that's larger than the maximum value (≈ ::1.8 \times 10^{308}::) or less than the minimum value (≈ ::-1.8 \times 10^{308}::) results in INF and -INF, respectively.

Shown below are two examples for INF:

<?php

$num = 2E+500;
echo $num, "\n";

$num = 1.8E+308;
echo $num;

Both the numbers 2E+500 and 1.8E+308 are above the maximum value capable of being stored in a float in PHP, and so boil down to INF.

INF
INF

Similarly, below we have two examples for -INF:

<?php

$num = -2E+500;
echo $num, "\n";

$num = -1.8E+308;
echo $num;

-2E+500 and -1.8E+308 are below the minimum value capable of being stored in PHP, and so boil down to -INF.

-INF
-INF

Even adding two INF values leads to INF.

<?php

echo INF + INF;
INF

This is a sensible result since adding two infinite values can never give a finite result!
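
Infinite values can also arise from ordinary arithmetic rather than from literals, so it helps to be able to test for them. The following sketch uses the built-in is_infinite() and is_finite() functions; the variable name $overflow is just illustrative.

```php
<?php

// Doubling the largest float overflows to infinity.
$overflow = PHP_FLOAT_MAX * 2;

var_dump($overflow);                // float(INF)
var_dump($overflow === INF);        // bool(true)
var_dump(is_infinite($overflow));   // bool(true)
var_dump(is_infinite(-$overflow));  // bool(true), -INF is infinite too
var_dump(is_finite($overflow));     // bool(false)
var_dump(is_finite(PHP_FLOAT_MAX)); // bool(true)
```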

NAN

Apart from INF, NAN is another special kind of a number.

The special number NAN stands for 'not a number', and is used to represent a numeric result that can't really be defined.

The most typical operation giving NAN is the subtraction of INF from INF. As can be reasoned, this operation can't really be defined numerically, hence NAN is used to represent the result:

<?php

echo INF - INF;
NAN

One of the most surprising things regarding NAN is that two NAN values aren't considered equal to one another:

<?php

var_dump(NAN === NAN);
bool(false)

This isn't a rule specific to PHP — the IEEE-754 specification defines this behavior itself.

Now there are multiple reasons for choosing this behavior as defined in the spec, but at least for us, we don't need to worry a lot about the exact reasons. We should, at most, know about this behavior and how to actually test whether a given number is NAN.

The function is_nan() can be used to determine whether a given value is NAN.

is_nan($value)

$value is the value to test against NAN. If it's NAN, the function returns true, or else false.

Consider the code below:

<?php

var_dump(is_nan(NAN));
var_dump(is_nan(INF - INF));
var_dump(is_nan(0));
var_dump(is_nan(INF + INF));
bool(true)
bool(true)
bool(false)
bool(false)

NAN and INF - INF are both NAN values, hence the first two is_nan() calls yield true. However, 0 and INF + INF (which gives INF) are both non-NAN values, so is_nan() yields false in the last two calls.
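
NAN can also come out of math functions, for example when taking the square root of a negative number. The sketch below, which uses made-up sample data in $readings, shows one possible way to drop NAN values from an array with is_nan() before summing it.

```php
<?php

// The square root of a negative number isn't a real number.
$undefined = sqrt(-1);

var_dump($undefined);               // float(NAN)
var_dump(is_nan($undefined));       // bool(true)
var_dump($undefined == $undefined); // bool(false), NAN never equals itself

// Hypothetical sample data containing two NAN values.
$readings = [1.5, NAN, 2.5, INF - INF];

// Keep only the values that are not NAN.
$valid = array_filter($readings, fn ($x) => !is_nan($x));

var_dump(array_sum($valid)); // float(4)
```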

Integer division

Most statically-typed languages provide different semantics for the division operation when performed on two integers. That is, the result is also an integer rather than a float which can otherwise hold on to the fraction in the result.

Such a kind of division is generally known as integer division.

In PHP, as we know, the division operator (/) does NOT work this way. That is, if the result of a division of two integers is a float, then the operation returns a float.

An illustration follows:

<?php

$num = 3 / 2;
echo $num;
1.5

We know that 3 divided by 2 gives 1.5, and that's exactly what's stored in $num here.

Thus, to restate it, division in PHP via / on two integers is not integer division.

However, if we want such a kind of division, it's very simple: just cast the resulting number to an integer.

Consider the following:

<?php

$num = (int) (3 / 2);
echo $num;
1

As can be seen, (int) (3 / 2) yields 1. The 1.5 returned by (3 / 2) is cast to an integer by (int), resulting in the fraction .5 being thrown away from 1.5.

The parentheses wrapped around (3 / 2) here are important. Without them, the cast won't work as expected. That is (int) 3 / 2 would first cast 3 to an integer (which it already is) and then divide it by 2 to give 1.5. With the parentheses, first 3 / 2 is performed to give 1.5 and then this value is cast to an integer to give 1.
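
The difference the parentheses make can be seen side by side in the sketch below (variable names are illustrative). It also shows that the cast truncates toward zero, which matters for negative results.

```php
<?php

// The cast binds more tightly than /, so this is really ((int) 3) / 2.
$withoutParens = (int) 3 / 2;

// Here the division happens first and its result is then cast.
$withParens = (int) (3 / 2);

var_dump($withoutParens); // float(1.5)
var_dump($withParens);    // int(1)

// Casting truncates toward zero: -1.5 becomes -1, not -2.
$negative = (int) (-3 / 2);
var_dump($negative);      // int(-1)
```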

Another way to perform this same operation is to use the intdiv() function.

intdiv($a, $b)

As with the normal division via /, intdiv() takes two values in the same order as in a normal division, and then returns the result of performing integer division over the two numbers.

Below shown is the same example as before, this time using intdiv():

<?php

$num = intdiv(3, 2);
echo $num;
1

Great!
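
Two details of intdiv() are worth a quick sketch: it truncates toward zero for negative operands, and it throws instead of returning a value when the division can't be performed. The wrapper safe_intdiv() below is hypothetical and simply returns null in those cases; whether null is the right fallback depends on the application.

```php
<?php

var_dump(intdiv(7, 2));  // int(3)
var_dump(intdiv(-7, 2)); // int(-3), truncated toward zero

// Hypothetical wrapper that returns null when the division is impossible.
function safe_intdiv(int $a, int $b): ?int
{
    try {
        return intdiv($a, $b);
    } catch (DivisionByZeroError $e) {
        return null; // $b was 0
    } catch (ArithmeticError $e) {
        return null; // PHP_INT_MIN / -1 doesn't fit in an integer
    }
}

var_dump(safe_intdiv(9, 3));            // int(3)
var_dump(safe_intdiv(1, 0));            // NULL
var_dump(safe_intdiv(PHP_INT_MIN, -1)); // NULL
```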

Type conversion functions

We've already seen the (int) and (float) casts from the previous part of this course — they both don't need an introduction.

What's interesting to know is that PHP also provides two functions that do the exact same thing with exactly the same semantics i.e. intval() and floatval(), respectively.

Both of the functions require just a single argument which is the value to coerce.

Here's an example:

<?php

$value = '50';
var_dump($value);
var_dump(intval($value));

$value = '50.1';
var_dump($value);
var_dump(floatval($value));
string(2) "50"
int(50)
string(4) "50.1"
float(50.1)

Now since both these functions operate exactly like the typecasts (int) and (float), you might have one question in your mind right now...

Type conversion functions vs. typecasts in PHP

So what really is the difference between (int) and intval() and similarly between (float) and floatval() in PHP?

What exactly is the point of providing two ways to perform the exact same operation?

Well, this is a really good question with a really simple answer.

Both intval() and floatval() are functions and hence could be passed to another function to convert given values to integers or floats, respectively. This doesn't hold for the casts (int) and (float), i.e. they can NOT be passed in to functions.

A very good example of a case where we might want to pass intval() or floatval() to another function is when using array_map().

We'll cover the details of array_map() later in the PHP Arrays unit, but to discuss it briefly, it allows us to map an array to another array based on a mapping function.

For instance, we could map the array [1, 2, 3, 4] to [1, 4, 9, 16] using a function that returns the square of a given number. Similarly, we could map ['1', '2', '3', '4'] to [1, 2, 3, 4] using the intval() function.

This isn't possible using (int) directly, since a cast is not a function.
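
As a minimal sketch of the mapping just described, the code below passes the function names 'intval' and 'floatval' to array_map(). For comparison, it also wraps the (int) cast in an arrow function (PHP 7.4+), which is the extra step a cast needs before it can be handed to another function. The variable names are illustrative.

```php
<?php

$strings = ['1', '2', '3', '4'];

// Pass the function by name; array_map() calls it on every element.
$ints = array_map('intval', $strings);
var_dump($ints === [1, 2, 3, 4]);   // bool(true)

$floats = array_map('floatval', ['1.5', '2']);
var_dump($floats === [1.5, 2.0]);   // bool(true)

// A cast can't be passed on its own; it has to be wrapped in a function.
$viaCast = array_map(fn ($v) => (int) $v, $strings);
var_dump($viaCast === $ints);       // bool(true)
```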

Checking if a float is an integer

Although it's not a highly common operation, at some point while working with floats, we might want to determine whether a given float has a fractional part of zero.

For instance, 10.2 fails this check since it has a non-zero fractional part of .2. However, 10.0 meets it since it has a zero fractional part i.e .0.

The question is how to perform this check?

Well PHP doesn't provide a native function or construct to do so. However, it's completely possible to accomplish this operation using the existing features we know from PHP.

Let's see if you can come up with it.

Time to tackle this problem...

To determine if a float has a fractional part of zero, we can perform the following steps:

  1. First, cast the float to an integer.
  2. Then, cast the integer back to a float.
  3. Finally, compare the result to the original float using ===. If the comparison passes, the float is an integral number.

The reason why this approach works is that when a float is an integral number, i.e. has a fractional part of zero, casting it to an integer (which throws away the fractional part) causes no loss of information, and casting this integer back to a float restores the original value.

Thus, both the initial and final values would be identical to one another.

Let's try and test this:

<?php

$f = 10.5;
var_dump($f === (float) (int) $f);

$f = 10.0;
var_dump($f === (float) (int) $f);
bool(false)
bool(true)

Note that instead of casting the resulting integer with (float) and then comparing $f with the cast value using the identity operator (===), we can use the equality operator (==).

Behind the scenes, == does the same thing as we're manually trying to do over here, i.e. when one operand is a float and the other is an integer, the integer is automatically cast to a float before performing the comparison.

The following code is the same as before, just this time we use == instead of the identity operator (===) and skip the explicit (float) cast:

<?php

$f = 10.5;
var_dump($f == (int) $f);

$f = 10.0;
var_dump($f == (int) $f);
bool(false)
bool(true)

The result is, and will always be, exactly the same.

Simple, wasn't it?

Limitations of this approach

When using this or possibly other approaches to check whether a given float actually represents an integral number, we should be aware of certain limitations.

For instance, consider the number 5E+300. We know that it's an integral number with many many zeroes. However, the same approach as before doesn't recognize it as such:

<?php

$num = 5E+300;

var_dump($num === (float) (int) $num);
bool(false)

But why?

The thing is that when a floating-point number is cast to an integer in PHP and the float is beyond the range of an integer (i.e. larger than PHP_INT_MAX or smaller than PHP_INT_MIN), the result of the cast is undefined, as the PHP manual notes. Whatever integer we end up with, it no longer corresponds to the original float, so casting it back to a float can't restore the original value.

In the end, the original float is not at all equal to the double-casted float and hence PHP doesn't recognize it as an integral number.

Long story short, this idea of determining whether a float represents an integral number only works up to a certain limit.

Beyond that limit, the result is undefined, because casting extremely large or extremely small numbers back and forth between the integer and float types loses precision and, sometimes, the original values altogether.
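
One way to sidestep this limit is to avoid the integer type altogether. The sketch below is an alternative to the double-cast technique, not the approach described above: is_integral() is a hypothetical helper that compares a float with its own floor() (which stays a float) and rules out INF and NAN with is_finite().

```php
<?php

// Hypothetical helper: true when $f is finite and has no fractional part.
function is_integral(float $f): bool
{
    // floor() returns a float, so no float-to-integer cast is involved.
    return is_finite($f) && floor($f) === $f;
}

var_dump(is_integral(10.0));   // bool(true)
var_dump(is_integral(10.5));   // bool(false)
var_dump(is_integral(5E+300)); // bool(true)
var_dump(is_integral(INF));    // bool(false)
var_dump(is_integral(NAN));    // bool(false)
```

Whether INF should count as integral is a design choice; here it is treated as non-integral.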