Introduction
In the PHP Foundation unit, we covered a lot about PHP including some of the very basic aspects of working with numbers. We saw the two types of numbers in the language i.e. integers and floats, in addition to ideas related to each of these types.
In this chapter, we aim to take all that knowledge one step further and digest even more ideas in the world of numbers in PHP. In particular, here we'll learn how integers and floats are represented internally in memory, how to work with the special float values INF and NAN, and about more number-related functions such as intval(), floatval(), intdiv() and so on.
We'll also explore the scientific notation used to denote a float in PHP in addition to the recently-added underscore symbol to improve the readability of long integers.
In short, this chapter will take us way ahead in fluency while working with numbers in PHP, which is a must-have skill for every computer programmer.
So let's begin.
A quick recap
Let's start with a quick recap of what we already know about numbers in PHP.
There are two kinds of numbers in PHP — integers and floats. The distinction between these is fairly easy to understand — one doesn't have a fractional part whereas the other does.
A numeric literal that doesn't have a decimal point (.) in it is considered to be an integer. However, a literal with a decimal point is considered a float.
Some examples of integers are -1000, -29, 0, 2 and 50000, while some examples of floats are -102.3, -2.323423, 0.0, 2.0000023 and -234234.99090.
To convert a given value to an integer, we can use the (int) typecast as shown below:
<?php
$a = '10';
var_dump($a);
var_dump((int) $a);
Similarly, to convert a given value to a float, we can use the (float) typecast, as shown below:
<?php
$a = '10.1';
var_dump($a);
var_dump((float) $a);
Both integers and floats support all of the most common arithmetic operations such as addition, subtraction, multiplication, division, exponentiation, and modulo.
Finally, to check whether a given value is an integer, we use the is_int() function, and similarly, to check whether the value is a float, we use the is_float() function.
In the snippet below, we perform the is_int() check on a handful of values:
<?php
var_dump(is_int(10));
var_dump(is_int(10.0));
var_dump(is_int('10'));
var_dump(is_int(true));
And in the snippet below, we perform the is_float() check on the same values:
<?php
var_dump(is_float(10));
var_dump(is_float(10.0));
var_dump(is_float('10'));
var_dump(is_float(true));
And this is it for the recap.
Internal representation of integers
PHP is a sufficiently high-level language that it provides just one type for integers, unlike languages such as C and C++, where a multitude of integer types exist, each with a different range.
An integer in PHP consumes 4 bytes of memory on a 32-bit machine and 8 bytes on a 64-bit machine. Moreover, all integers in PHP are signed. The format used internally is two's complement, the typical format used to implement integers across modern languages.
Here's how an integer in PHP would look in memory:
The leftmost bit denotes the sign of the integer: it carries a negative weight (::-2^{63}:: on a 64-bit machine), while every other bit has a positive power of 2 associated with it. The final integer is determined by adding all these weights wherever the bit is 1.
Hence, 37 is represented as follows:
Similarly, -128 is represented as follows, keeping in mind that the leftmost 1 bit doesn't just represent the - sign but rather a value of -9223372036854775808 (on a 64-bit machine), to which the positive value of the remaining bits is added to make the end result -128.
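To see these bit patterns for ourselves, here is a small sketch that prints 37 and -128 in binary. It relies on the %b specifier of sprintf(), which prints the raw two's complement pattern of an integer; the variable names are just illustrative.

```php
<?php
// Width of an integer in bits on the current machine (64 on a 64-bit build).
$width = PHP_INT_SIZE * 8;

// %b prints the raw two's complement bit pattern, even for negative integers.
$bits_37 = sprintf("%0{$width}b", 37);
$bits_neg128 = sprintf("%0{$width}b", -128);

echo $bits_37, "\n";
echo $bits_neg128, "\n";
```

The first line ends in 100101 (i.e. 32 + 4 + 1), while the second is a run of 1s followed by seven 0s.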
Consequently, under this format, the minimum and maximum integers on a 32-bit machine are -2147483648 and 2147483647, respectively, while on a 64-bit machine they are -9223372036854775808 and 9223372036854775807, respectively.
The constants PHP_INT_MIN and PHP_INT_MAX hold the minimum and maximum integers on the current machine:
<?php
echo 'Min: ', PHP_INT_MIN, "\n";
echo 'Max: ', PHP_INT_MAX, "\n";
Our machine is 64-bit, hence we get the following:
So this is a little glimpse into the internals of integers in PHP.
Frankly speaking, we don't need to know about this representation as far as working in the language is concerned. It's just an internal detail that's helpful and useful as side knowledge.
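One place where these limits do show up in everyday code is integer overflow. As a quick sketch (the variable names are just illustrative), going past PHP_INT_MAX doesn't wrap around; PHP silently switches to a float instead:

```php
<?php
$max = PHP_INT_MAX;
$overflowed = $max + 1; // no longer fits in an integer

var_dump(is_int($max));          // bool(true)
var_dump(is_float($overflowed)); // bool(true)
```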
Internal representation of floats
Floats in PHP are based on the IEEE-754 double-precision floating-point format. This is a fairly standard format used across almost all the mainstream languages to denote floating-point numbers.
In this format, each float consumes 8 bytes of memory. These 8 bytes, or 64 bits, are segmented into three groups, each denoting a certain aspect of the float. Before we can make sense of this segmentation, we ought to understand scientific notation.
Scientific notation, also known as standard form or exponential notation, is a standard way to represent very small or very large numbers in mathematics, physics, chemistry, biology and other scientific disciplines.
It consists of three parts: a sign, followed by a significand, followed by a power of 10.
The significand is typically written with one digit before the decimal point (if there is a decimal point). The base of the power is usually ::10:: in our calculations.
So, for example, ::105.6:: can be expressed as ::1.056 \times 10^2::. Similarly, ::-30.37:: would be expressed as ::-3.037 \times 10^1:: in scientific notation.
This is scientific notation in the decimal number system. In the realm of computers, however, we need a binary system. Fortunately, it's also very easy to devise a binary system for scientific notation. Instead of raising ::10:: to the given exponent, we raise the integer ::2::. Moreover, each digit in the significand represents a power of ::2::, not a multiple of a power of ::10::.
Hence, ::2.5::, which is ::10.1:: in binary (i.e. ::2 + 0.5::), would be represented as ::1.01 \times 2^1:: in binary scientific notation. Similarly, ::-50.25::, which is ::-110010.01:: in binary, would be represented as ::-1.1001001 \times 2^5::.
IEEE-754 uses this scientific notation to represent floats in memory. Each number is broken down into three segments — a sign, an exponent of ::2::, and a significand.
The sign is allotted 1 bit (the leftmost bit), the exponent is allotted the next 11 bits, and the last 52 bits are allotted to the significand. Now these are just some of the details of the format. If we dig a little deeper, there are many other things to consider, such as the representation of the special numbers INF and NAN, the exponent bias, the implicit leading digit of the significand, and much more.
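For the curious, the three segments can be pulled apart in PHP itself. The following is a minimal sketch, assuming a 64-bit build of PHP; float_bits() is a hypothetical helper written for this illustration, built on the real pack() and unpack() functions. It splits -50.25 into its sign, exponent and fraction bits.

```php
<?php
// Hypothetical helper: the 64 raw bits of a float as a string of 0s and 1s.
// Assumes a 64-bit build of PHP.
function float_bits(float $f): string
{
    // 'E' packs a double in big-endian order; 'J' reads it back as a 64-bit integer.
    $int = unpack('J', pack('E', $f))[1];
    return sprintf('%064b', $int);
}

$bits = float_bits(-50.25);
$sign = substr($bits, 0, 1);      // 1 bit
$exponent = substr($bits, 1, 11); // next 11 bits, stored with a bias of 1023
$fraction = substr($bits, 12);    // last 52 bits

echo "Sign:     $sign\n";
echo "Exponent: $exponent (", bindec($exponent) - 1023, " after removing the bias)\n";
echo "Fraction: $fraction\n";
```

The sign bit is 1 (negative), the exponent works out to 5 once the bias is removed, and the fraction starts with 1001001, matching the digits after the point in 1.1001001.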
At least at this stage, we don't need all this in-depth information. Just a little knowledge would be more than enough in appreciating the level of mathematics that goes into building numeric systems on computing machines.
Anyways, moving on, the maximum number possible in this format is approximately ::1.8 \times 10^{308}:: while the minimum is ::-1.8 \times 10^{308}::.
The maximum float can be retrieved via the constant PHP_FLOAT_MAX, whereas the minimum can be retrieved by simply negating PHP_FLOAT_MAX, as shown below:
<?php
echo 'Min: ', -PHP_FLOAT_MAX, "\n";
echo 'Max: ', PHP_FLOAT_MAX;
Simple.
In addition to this, the smallest positive number that can be represented is somewhere close to ::4.9 \times 10^{-324}::.
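The lower end of the float range, i.e. the tiniest positive value, can be probed with a short sketch like the one below. The variable names are illustrative, and the PHP_FLOAT_MIN constant assumes PHP 7.2 or later.

```php
<?php
$tiny = 4.9E-324;    // roughly the smallest positive float
$halved = $tiny / 2; // too small to represent, so it collapses to zero

var_dump($tiny > 0);             // bool(true)
var_dump($halved === 0.0);       // bool(true)
var_dump(PHP_FLOAT_MIN > $tiny); // bool(true)
```

Note that PHP_FLOAT_MIN holds the smallest positive normalized float (around 2.2E-308), which is why it is larger than the value above.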
And this is it for the internals of floating-point numbers in PHP.
Using underscores (_) in literals
Since PHP 7.4.0, underscores (_) can be used to separate digits from one another in a numeric literal.
When parsing code, the underlying engine removes these underscores before the next stage, so they are just syntactic sugar in the language.
Let's see a quick example. Suppose we want to denote the number ::1\,000\,560\,356:: in PHP.
In the code below, we represent this number in two ways — one without underscores and one with them:
<?php
$num = 1000560356;
$readable_num = 1_000_560_356;
Which one seems more readable? Clearly the second one.
As stated before, these underscores are removed when parsing the code. This can be confirmed by outputting the integers:
<?php
$num = 1000560356;
$readable_num = 1_000_560_356;
echo $num, "\n";
echo $readable_num;
As can be seen, both the output numbers are exactly equal to one another.
Moving on, note that when using underscores, make sure not to leave any underscores at either end of the numeric literal. Both can lead to an error. Even adding two underscores next to each other is a syntax error.
For instance, in the code below, we get an error because of the _ at the start of _1_000:
<?php
$num = _1_000;
echo $num;
PHP interprets _1_000 as a constant as it begins with an underscore (_). And since it can't find such a constant, it throws an error (as of PHP 8.0; PHP 7.4 raises a warning instead).
Similarly, in the code below, adding two underscores after one another leads to a syntax error:
<?php
$num = 1__000;
echo $num;
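Underscores aren't limited to decimal integers. As a brief sketch (assuming PHP 7.4 or later, with illustrative variable names), they work the same way in floats and in hexadecimal and binary literals:

```php
<?php
$million = 1_000_000;   // decimal integer
$price   = 1_299.99;    // float
$mask    = 0xFF_FF;     // hexadecimal integer
$flags   = 0b1010_1010; // binary integer

var_dump($million === 1000000); // bool(true)
var_dump($price === 1299.99);   // bool(true)
var_dump($mask === 65535);      // bool(true)
var_dump($flags === 170);       // bool(true)
```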
The E notation
In the discussion above regarding the internal representation of floats in PHP, we learnt about scientific notation.
Directly representing floats in this way in PHP, and nearly all popular languages, is possible via the E notation.
The E symbol denotes a power of 10 by which a given number is multiplied. Note that E could also be written as e, i.e. it's case-insensitive.
The general syntax of E is as follows:
<number>E<exponent>
<number> is the number to multiply with a power of 10, whereas <exponent> is the number to which 10 is raised. The <exponent> could have a sign (+ or -) as well.
In other words, this is equivalent to <number> × 10^<exponent>.
Let's see how to use E to represent 156.2 in PHP:
<?php
// Represent 156.2 in E notation
$num = 1.562E+2;
echo $num;
The significand here is 1.562, while the exponent is +2.
The + sign in the exponent improves the readability of the literal. You can verify this yourself: which of the following is more readable, 1.562E2 or 1.562E+2?
Let's try using a negative exponent:
<?php
// Represent 0.0356 in E notation
$num = 3.56E-2;
echo $num;
Amazing!
As we can see in both the output snippets above, PHP expands the exponential notation by multiplying the given significand with the given power and then printing the resulting number.
However, this only holds up to a limit, beyond which PHP simply outputs the number in E notation.
An example follows:
<?php
echo 18.3E+300;
Here the number 18.3E+300 is too large for PHP to expand when printing. Hence, it falls back to printing the number in exponential notation.
Also notice how PHP automatically normalizes the significand by shifting the decimal point by one position to the left (i.e. 1.83E+301 instead of 18.3E+300) to resemble the typical scientific notation in mathematics.
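One detail worth checking by hand is that a literal written in E notation is always a float, even when it has no decimal point. A small sketch with illustrative variable names:

```php
<?php
$a = 1E3;   // no decimal point, yet still a float
$b = 1.5e3; // lowercase e works as well
$c = 25E-1; // negative exponent

var_dump(is_float($a)); // bool(true)
var_dump($b === 1500.0); // bool(true)
var_dump($c === 2.5);    // bool(true)
```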
Special numbers
Following from the IEEE-754 format that PHP uses to represent floats internally, there are two special floating-point numbers in the language: INF and NAN.
Both these numbers are available as global constants.
Let's explore them one-by-one...
INF
We'll start with INF.
INF in PHP is used to represent infinity, i.e. something beyond calculation.
Creating a float that's larger than the maximum value (≈ ::1.8 \times 10^{308}::) or less than the minimum value (≈ ::-1.8 \times 10^{308}::) results in INF and -INF, respectively.
Shown below are two examples for INF:
<?php
$num = 2E+500;
echo $num, "\n";
$num = 1.8E+308;
echo $num;
Both the numbers 2E+500 and 1.8E+308 are above the maximum value capable of being stored in PHP, and so boil down to INF.
Similarly, below we have two examples for -INF:
<?php
$num = -2E+500;
echo $num, "\n";
$num = -1.8E+308;
echo $num;
-2E+500 and -1.8E+308 are below the minimum value capable of being stored in PHP, and so boil down to -INF.
Even adding two INF values leads to INF.
<?php
echo INF + INF;
This is a sensible result since adding two infinite values can never give a finite result!
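To test for infinity in code, PHP offers the is_infinite() and is_finite() functions. A short sketch, where $big is just an illustrative name:

```php
<?php
$big = PHP_FLOAT_MAX * 2; // overflows to INF

var_dump(is_infinite($big)); // bool(true)
var_dump(is_finite($big));   // bool(false)
var_dump($big === INF);      // bool(true)
var_dump(-$big === -INF);    // bool(true)
```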
NAN
Apart from INF, NAN is another special kind of number.
NAN stands for 'not a number', and is used to represent a numeric result that can't really be defined.
The most typical operation giving NAN in the end is the subtraction of INF from INF. As can be reasoned, this operation can't really be defined numerically, hence NAN is used to represent the result:
<?php
echo INF - INF;
One of the most surprising things regarding NAN is that two NAN values aren't considered equal to one another:
<?php
var_dump(NAN === NAN);
This isn't a rule specific to PHP — the IEEE-754 specification defines this behavior itself.
Now there are multiple reasons for choosing this behavior as defined in the spec, but at least for us, we don't need to worry a lot about the exact reasons. We should, at most, know about this behavior and how to actually test whether a given number is NAN.
The function is_nan() can be used to determine whether a given value is NAN.
is_nan($value)
$value is the value to test against NAN. If it's NAN, the function returns true, or else false.
Consider the code below:
<?php
var_dump(is_nan(NAN));
var_dump(is_nan(INF - INF));
var_dump(is_nan(0));
var_dump(is_nan(INF + INF));
NAN and INF - INF both are NAN values, hence the first two is_nan() calls yield true. However, 0 and INF + INF (which gives INF) are both non-NAN values, and so is_nan() yields false in the last two calls.
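Since no comparison involving NAN ever succeeds, is_nan() is the only dependable test. The sketch below illustrates this using sqrt(-1), another operation that yields NAN; the variable name is illustrative.

```php
<?php
$nan = sqrt(-1); // the square root of a negative number is not defined

var_dump(is_nan($nan));                      // bool(true)
var_dump($nan == $nan);                      // bool(false)
var_dump($nan != $nan);                      // bool(true)
var_dump($nan < 0 || $nan > 0 || $nan == 0); // bool(false)
```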
Integer division
Most statically-typed languages provide different semantics for the division operation when performed on two integers. That is, the result is also an integer rather than a float which can otherwise hold on to the fraction in the result.
Such a kind of division is generally known as integer division.
In PHP, as we know, the division operator (/) does NOT work this way. That is, if the result of a division of two integers is a float, then the operation returns a float.
An illustration follows:
<?php
$num = 3 / 2;
echo $num;
We know that 3 divided by 2 gives 1.5, and that's exactly what's stored in $num here.
Thus, to restate it, division in PHP via / on two integers is not integer division.
However, if we want this kind of division, it's very simple: just cast the resulting number to an integer.
Consider the following:
<?php
$num = (int) (3 / 2);
echo $num;
As can be seen, (int) (3 / 2) yields 1. The 1.5 returned by (3 / 2) is cast to an integer by (int), resulting in the fraction .5 being thrown away from 1.5.
The parentheses around 3 / 2 here are important. Without them, the cast won't work as expected. That is, (int) 3 / 2 would first cast 3 to an integer (which it already is) and then divide it by 2 to give 1.5. With the parentheses, first 3 / 2 is performed to give 1.5 and then this value is cast to an integer to give 1.
Another way to perform this same operation is to use the intdiv() function.
intdiv($a, $b)
As with the normal division via /, intdiv() takes two values in the same order as in a normal division, and then returns the result of performing integer division over the two numbers.
Shown below is the same example as before, this time using intdiv():
<?php
$num = intdiv(3, 2);
echo $num;
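The two approaches agree for everyday values but aren't perfectly interchangeable. The sketch below (illustrative variable names, and a 64-bit build assumed for the middle part) shows that both truncate toward zero, that intdiv() stays exact for very large integers because it never goes through a float, and that it throws a DivisionByZeroError when the divisor is zero.

```php
<?php
// Both approaches truncate toward zero.
$via_intdiv = intdiv(-7, 2);
$via_cast = (int) (-7 / 2);
var_dump($via_intdiv, $via_cast); // int(-3) int(-3)

// intdiv() never goes through a float, so it stays exact for large integers.
// The comparison below is expected to be false on a 64-bit build.
$exact = intdiv(PHP_INT_MAX, 2);
$rounded = (int) (PHP_INT_MAX / 2);
var_dump($exact === $rounded);

// Dividing by zero with intdiv() throws a DivisionByZeroError.
try {
    intdiv(1, 0);
    $caught = false;
} catch (DivisionByZeroError $e) {
    $caught = true;
}
var_dump($caught); // bool(true)
```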
Type conversion functions
We've already seen the (int) and (float) casts in the previous part of this course, so they need no introduction.
What's interesting to know is that PHP also provides two functions that do the exact same thing with the same semantics, i.e. intval() and floatval(), respectively.
Both of the functions require just a single argument which is the value to coerce.
Here's an example:
<?php
$value = '50';
var_dump($value);
var_dump(intval($value));
$value = '50.1';
var_dump($value);
var_dump(floatval($value));
Now since both these functions operate exactly like the typecasts (int) and (float), you might have one question in your mind right now...
Type conversion functions vs. typecasts in PHP
So what really is the difference between (int) and intval() and similarly between (float) and floatval() in PHP?
What exactly is the point of providing two ways to perform the exact same operation?
Well, this is a really good question with a really simple answer.
Both intval() and floatval() are functions and hence could be passed to another function to convert given values to integers or floats, respectively. This doesn't hold for the casts (int) and (float), i.e. they can NOT be passed in to functions.
A very good example of a case where we might want to pass intval() or floatval() to another function is when using array_map().
We'll cover the details of array_map() later in the PHP Arrays unit, but to discuss it briefly, it allows us to map an array to another array based on a mapping function.
For instance, we could map the array [1, 2, 3, 4] to [1, 4, 9, 16] using a function that returns the square of a given number. Similarly, we could map ['1', '2', '3', '4'] to [1, 2, 3, 4] using the intval() function.
This isn't possible using (int) directly.
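Here's a minimal sketch of the mapping just described, with illustrative variable names. It also shows that a cast can still be used with array_map(), but only by wrapping it inside a function of our own.

```php
<?php
$strings = ['1', '2', '3', '4'];

// Pass intval() by name as the mapping function.
$ints = array_map('intval', $strings);
var_dump($ints === [1, 2, 3, 4]); // bool(true)

// A cast can't be passed around by itself, but it can be wrapped in a function.
$also_ints = array_map(function ($value) {
    return (int) $value;
}, $strings);
var_dump($also_ints === [1, 2, 3, 4]); // bool(true)
```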
Checking if a float is an integer
Although it's not a highly common operation, at some point while working with floats, we might want to determine whether a given float has a fractional part of zero.
For instance, 10.2 fails this check since it has a non-zero fractional part of .2. However, 10.0 meets it since it has a zero fractional part, i.e. .0.
The question is: how do we perform this check?
Well, PHP doesn't provide a native function or construct dedicated to this. However, it's completely possible to accomplish this operation using the existing features we know from PHP.
Let's see if you can come up with it.
Time to tackle this problem...
To determine if a float has a fractional part of zero, we can perform the following steps:
- First, cast the float to an integer.
- Then, cast the integer back to a float.
- Finally, compare the result to the original float using ===. If the comparison passes, the float is an integral number.
The reason why this approach works is that when a float is an integral number, i.e. has a fractional part of zero, casting it to an integer (which throws away the fractional part) causes no precision loss, and then casting this integer back to a float restores the original value.
Thus, both the initial and final values would be identical to one another.
Let's try and test this:
<?php
$f = 10.5;
var_dump($f === (float) (int) $f);
$f = 10.0;
var_dump($f === (float) (int) $f);
Note that instead of casting the resulting integer with (float) and then comparing $f with the cast value using the identity operator (===), we can use the equality operator (==).
Behind the scenes, == does the same thing as we're manually trying to do over here, i.e. when one operand is a float and the other is an integer, the integer is automatically cast to a float before performing the comparison.
The following code is the same as before, just this time we use == instead of the identity operator (===) and skip the explicit (float) cast:
<?php
$f = 10.5;
var_dump($f == (int) $f);
$f = 10.0;
var_dump($f == (int) $f);
The result is, and will always be, exactly the same.
Simple, wasn't it?
Limitations of this approach
When using this or possibly other approaches to check whether a given float actually represents an integral number, we should be aware of certain limitations.
For instance, consider the number 5E+300. We know that it's an integral number with many, many zeroes. However, the same approach as before doesn't recognize it as such:
<?php
$num = 5E+300;
var_dump($num === (float) (int) $num);
But why?
The thing is that when a floating-point number is cast to an integer in PHP and the float lies outside the range of integers (larger than PHP_INT_MAX or smaller than PHP_INT_MIN), the result of the cast is undefined. Whatever integer comes out, it can't be 5E+300, so casting it back to a float gives a value different from the original.
In the end, the original float is not at all equal to the double-cast float and hence PHP doesn't recognize it as an integral number.
Long story short, this idea of determining whether a float represents an integral number is only defined up to a certain limit.
Beyond that limit, the back-and-forth casting between the integer and float types is undefined for extremely large or extremely small (i.e. hugely negative) numbers, resulting in a loss of precision and, sometimes, of the original values.
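If values beyond the integer range matter, one way around the limitation is to avoid the integer cast altogether. The following is only a sketch of an alternative technique, not the approach described above: float_is_integral() is a hypothetical helper, built on the floor() and is_finite() functions, that compares the float against its own rounded-down value.

```php
<?php
// Hypothetical helper, not a built-in: true when $f is finite and has no fractional part.
function float_is_integral(float $f): bool
{
    // floor() returns a float, so no float-to-integer cast is involved.
    return is_finite($f) && floor($f) === $f;
}

var_dump(float_is_integral(10.5));   // bool(false)
var_dump(float_is_integral(10.0));   // bool(true)
var_dump(float_is_integral(5E+300)); // bool(true)
var_dump(float_is_integral(INF));    // bool(false)
```

The is_finite() check is there because floor(INF) is INF, which would otherwise pass the comparison.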