Thus far in the PHP Strings unit, we've covered a lot of solid ground related to strings in the language. Now, let's expand upon that knowledge and learn about string formatting, another highly important topic, and one that'll make you an aficionado when working with strings.
In this chapter, we shall look at a collection of PHP string functions that allow us to work with format strings, i.e. strings with special values inside them, referred to as format specifiers, that are parsed for extracting data out or melding data in.
If you're coming from a C background, this chapter will be peaches and cream for you, owing to the fact that C programmers routinely use a similar concept when retrieving input or producing output (the printf() and scanf() families of functions).
Why do we need format strings?
Often, while creating complex apps, we need to output a complex piece of text that's made up of many individual pieces of data.
For instance, consider the following code:
<?php $id = '48xp-89aa-93ks-00s4'; $name = 'Wooden table'; $price = 39.99; echo '(' . $id . ') ' . $name . ' ----- $' . $price;
We are trying to output a single line of text, yet the
echo statement looks quite intimidating, thanks to the numerous concatenation operations.
If we think about it for a moment, such pieces of text can be expressed much more clearly and neatly if we work with placeholders for the actual data. For instance, the output desired above could be expressed as follows:
(<id>) <name> ----- $<price>
Here, <id>, <name>, and <price> are all placeholders (that have to be filled with actual data).
Software specifications relating to text-based input/output are typically laid out in this manner, i.e. as templates with placeholders.
No one would ever go like: "start off with the id, enclosed in parentheses, followed by a space, followed by the name of the product, and then ..." — this seems totally senseless.
In PHP, as we already know, by virtue of interpolation of double-quoted strings, we can express variables in strings. So the same code above can be expressed as follows:
<?php $id = '48xp-89aa-93ks-00s4'; $name = 'Wooden table'; $price = 39.99; echo "($id) $name ----- $$price";
Clearly, this is much better.
But wait. If interpolated strings do the job, then why are we even here? There must be some reason to learn about format strings, mustn't there?
Well, there is.
The interpolated string above only allows us to express what piece of data we want in a given position in the string — it doesn't allow us to go beyond that, expressing the format of each individual piece of data.
Let's take the example of the variable
$price in the example above. In the string, we use it as is; there's no rounding to a given number of decimal places, as is typically done when representing money. More than likely, the specification of the output for this case would also have told us the desired precision of the price (typically 2 d.p.).
Now if we continue to stick to the interpolated string example above, we won't be able to express this requirement directly in the string.
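To make this limitation concrete, here's a minimal sketch. The price of 39.9 is a hypothetical value chosen purely for illustration; with plain interpolation, the trailing zero that we'd expect in a monetary amount is simply not there.

```php
<?php
// Hypothetical price, chosen to expose the limitation of interpolation.
$price = 39.9;

// The \$ is a literal dollar sign; $price is interpolated as is.
$line = "Price: \$$price";

echo $line, PHP_EOL; // Price: $39.9
```

There's no way to ask for 39.90 from within the interpolated string itself.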
This is where format strings enter the game.
What are format strings?
Format strings essentially allow us to express the general format of input or output text.
They do so by using placeholders, also referred to as format specifiers. A format specifier is denoted using the
% character, followed by a range of other things as we shall learn below.
Since a format string is just a string, it can be denoted in any way that a normal string could be denoted in PHP. That is, we can use single quotes (
'), double quotes (
"), or even the heredoc (or nowdoc) syntax. The only special character in format strings is the
% character — the exact syntax of denoting the string doesn't matter.
A format specifier ends with a letter denoting the type of the value that it represents. This letter is referred to as the specifier (not format specifier).
The table below lists possible specifiers:
|Specifier|Meaning|
|%|Represents the literal % character.|
|d|Represents a decimal integer.|
|b|Represents a binary integer.|
|o|Represents an octal integer.|
|x|Represents a hexadecimal integer, with lowercase letters.|
|X|Represents a hexadecimal integer, with uppercase letters.|
|f|Represents a float.|
|g|Represents a float, whose precision is dealt in terms of significant figures.|
|c|Represents a character.|
|s|Represents a string.|
So, for instance, the placeholder
%d represents a decimal integer,
%f denotes a float,
%s denotes a string, and so on.
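As a quick illustration of these specifiers, here's a small sketch that runs a few sample values through sprintf() (a function covered later in this chapter that returns the formatted text as a string). The sample values and the $examples array are our own picks for demonstration.

```php
<?php
// Each entry formats a sample value with a single specifier.
$examples = [
    'decimal' => sprintf('%d', 255),   // "255"
    'binary'  => sprintf('%b', 255),   // "11111111"
    'octal'   => sprintf('%o', 255),   // "377"
    'hex'     => sprintf('%x', 255),   // "ff"
    'HEX'     => sprintf('%X', 255),   // "FF"
    'float'   => sprintf('%f', 1.5),   // "1.500000"
    'char'    => sprintf('%c', 65),    // "A" (65 is the ASCII code of A)
    'string'  => sprintf('%s', 'PHP'), // "PHP"
    'percent' => sprintf('100%%'),     // "100%" (%% yields a literal %)
];

foreach ($examples as $label => $text) {
    echo $label, ': ', $text, PHP_EOL;
}
```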
Besides the type, there's a host of other things that we could include in a format specifier. The general format of a format specifier could be expressed as follows:
%[argnum$][flags][width][.precision]specifier
- argnum specifies the number of the argument that the placeholder refers to. This will only make sense once we consider the string functions where format strings are used.
- flags describes the padding to apply to the placeholder, including the pad character. For numbers, it also specifies whether or not to precede them with the + sign in case they are positive.
- width specifies the width of the placeholder (in number of characters).
- precision specifies the precision for a number or the cut-off length for a string.
- specifier (as we've seen above) indicates the type of the placeholder. Based on the type, the corresponding data might get automatically coerced to fit the desired type.
The square brackets ([]) here are used to imply that the underlying token is optional.
By default, when a given datum is formatted, it's right-aligned, and in the case of numbers, it's only preceded by its sign if it's a negative number (which is just how we represent numbers normally in maths).
To override this behavior, we can use certain flags, denoted above as
flags. The table below lists the possible flags:
|Flag|Meaning|
|-|Left-align the actual data to fit the given width.|
|+|Precede the number with its sign, even for positive numbers.|
|0|Pad with zeros.|
|'char|Pad with the given character char.|
|(space)|Pad with the space character.|
Keep in mind that flags are optional (as indicated by the square brackets surrounding them in the syntax above).
width is an integer that specifies the minimum number of characters to use for the placeholder.
precision follows a
. character and specifies the precision of the corresponding value. In the case of an
f specifier, it represents the precision in terms of decimal places; in the case of a
g specifier, it represents the precision in terms of significant figures. In the case of an
s specifier, it represents the cut-off length of the string.
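The following sketch shows how argnum, flags, width, and precision play out in practice. It uses sprintf() (covered later in this chapter) so that each result can be inspected as a string; the square brackets in the format strings are only there to make the padding visible, and the sample values are arbitrary.

```php
<?php
// Width only: right-aligned by default, padded with spaces.
$padded = sprintf('[%8s]', 'abc');       // "[     abc]"

// The - flag: left-aligned, padding goes on the right.
$left = sprintf('[%-8s]', 'abc');        // "[abc     ]"

// The 0 flag with a width and a precision.
$zeros = sprintf('[%08.2f]', 3.14159);   // "[00003.14]"

// A custom pad character, given after a single quote.
$custom = sprintf("[%'*8s]", 'abc');     // "[*****abc]"

// The + flag: show the sign even for positive numbers.
$signed = sprintf('[%+d]', 42);          // "[+42]"

// Precision on a string: cut-off length.
$cut = sprintf('[%.3s]', 'abcdef');      // "[abc]"

// argnum: pick arguments by position (single quotes avoid interpolation).
$swapped = sprintf('%2$s %1$s', 'world', 'hello'); // "hello world"

echo implode(PHP_EOL, [$padded, $left, $zeros, $custom, $signed, $cut, $swapped]), PHP_EOL;
```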
PHP provides us with a handful of string functions to work with these format strings.
Now, these functions can be divided into two categories:
- One where we go from data to text.
- One where we go from text to data.
In the former category, we have the following functions: printf(), sprintf(), and fprintf().
The common phrase 'printf' in each of these functions' names hints to us that they work more or less the same way, printing data to a given entity (which might be the standard output stream, a given string, or a given file).
In the latter category, we have the following functions: sscanf() and fscanf().
Once again, the common phrase 'scanf' in each of the functions' names hints to us that they work more or less the same way, scanning an entity containing some text and then extracting data out of it.
Let's start by exploring the former...
From data to text
printf(), sprintf(), and fprintf() all allow us to work with format strings in order to go from a given set of data to a given piece of text.
That piece of text, depending on the function used, might be printed to standard output (as is the case with
echo), to a string variable, or to a file.
Let's commence the exploration with printf().
The printf() function is used to output some formatted text to the standard output stream (more on streams later in this course).
In simpler words, it's similar to our old friend echo, just with some spectacular formatting capabilities.
Note that echo is a language construct while printf() is a function.
Here's the syntax of printf():
printf($format, $arg_1, $arg_2, ..., $arg_n)
The first argument is the format string which specifies the general format of the output. Each subsequent argument thereafter is data to be plugged into this format string.
Now, it's time to consider some real examples of working with format strings and
printf(), in particular.
Let's first see how to accomplish the task that we accomplished above using an interpolated string.
To recall the task, given an id, a product name, and its price, we have to output all of these in the following format:
(<id>) <name> ----- $<price>.
With a format string, this could be accomplished as follows:
<?php $id = '48xp-89aa-93ks-00s4'; $name = 'Wooden table'; $price = 39.99; printf('(%s) %s ----- $%f', $id, $name, $price);
Here's what the format string says:
- $id is a string, so it's represented as %s.
- $name is also a string, so it's also represented as %s.
- $price is a float (not an integer), so it's represented as %f.
Great. But notice the formatting of the product's price; it's a little bit more than what we want.
By default, the
f specifier produces a precision of 6 decimal places. In order to reduce this down to 2 d.p, we ought to use the
precision parameter in the corresponding format specifier.
Let's do this now:
<?php $id = '48xp-89aa-93ks-00s4'; $name = 'Wooden table'; $price = 39.99; printf('(%s) %s ----- $%.2f', $id, $name, $price);
Here's how the format specifier
%.2f works. The
.2 specifies the precision of the given value (because we are working with a float here,
.2 refers to a precision of 2 d.p). The
f tells us that the placeholder represents a float value.
Put another way, the f gives meaning to the .2.
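To see how the precision depends on the specifier that follows it, consider this small sketch. It reuses the $price from above and, as an assumption for illustration, formats it with a few different precisions via sprintf() (covered later in this chapter); the last line applies the same .2 to a string instead.

```php
<?php
$price = 39.99;

$default = sprintf('%f', $price);    // "39.990000" (6 d.p. by default)
$twoDp   = sprintf('%.2f', $price);  // "39.99"
$oneDp   = sprintf('%.1f', $price);  // "40.0" (rounded to 1 d.p.)

// With the s specifier, .2 is a cut-off length instead of decimal places.
$cutOff = sprintf('%.2s', 'Wooden table'); // "Wo"

echo implode(PHP_EOL, [$default, $twoDp, $oneDp, $cutOff]), PHP_EOL;
```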
Let's try another example.
Suppose we want to output each of these three pieces of information in tabular format (in the terminal). First, obviously, to give meaning to each row of the table, we need a header row. So let's create that first.
But before that, let's make some assumptions:
- The product's ID won't be any longer than 20 characters.
- The product's name won't be any longer than 30 characters.
- The product's price won't be any longer than 10 characters.
We'll use these maximum lengths to pad each column to the respective length. Also note that in order to keep the overall example simple, we'll refrain from drawing horizontal and vertical lines in the table.
As stated before, let's start off with the header:
<?php $id = '48xp-89aa-93ks-00s4'; $name = 'Wooden table'; $price = 39.99; printf('%-20s %-30s %-10s', 'ID', 'Name', 'Price ($)');
Let's make sense of the format specifier
%-20s; a similar reasoning can be applied to the rest of the format specifiers.
%-20s has the
- flag, a width of
20, and the
s specifier. The
- flag serves to align the text to the left (and apply the padding to the right). The width of
20, obtained using the assumptions above, is necessary to reserve a large area for the ID in each row. Finally, the
s specifier is used because the argument
'ID' is a string.
Now, let's head over to the first (and only) data row of the table:
<?php $id = '48xp-89aa-93ks-00s4'; $name = 'Wooden table'; $price = 39.99; printf('%-20s %-30s %-10s', 'ID', 'Name', 'Price ($)'); echo "\n"; printf('%-20s %-30s %-10.2f', $id, $name, $price);
In both the
printf() calls here, the first and second pieces of data are strings, so the first and second format specifiers are the same in both format strings.
This, however, isn't the case with the third format specifier.
In the first
printf() call, we have a string (
'Price ($)') and, likewise, a format specifier meant for a string (i.e.
%-10s). In the second
printf() call, we have a float, and thus modify the format specifier slightly — changing the
s specifier to an
f, and also configuring the precision of the float (via .2).
So what do you say? Is this simple or not?
Let's experiment a little more, and with this same example.
In the following code, we drop the
- flag from each and every format specifier, just to see the difference it makes:
<?php $id = '48xp-89aa-93ks-00s4'; $name = 'Wooden table'; $price = 39.99; printf('%20s %30s %10s', 'ID', 'Name', 'Price ($)'); echo "\n"; printf('%20s %30s %10.2f', $id, $name, $price);
And here's that difference:
See how the padding is applied on the left and the text is aligned to the right? This is the default formatting behavior, which we modified above using the - flag.
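The effect of the - flag is easiest to see when the padded field is wrapped in visible delimiters. The sketch below does exactly that, using sprintf() (covered next) and a width of 10; the square brackets are just markers that we've added for illustration.

```php
<?php
// Without the - flag: text is pushed to the right.
$rightText   = sprintf('[%10s]', 'Name');     // "[      Name]"
$rightNumber = sprintf('[%10.2f]', 39.99);    // "[     39.99]"

// With the - flag: text is pushed to the left.
$leftText   = sprintf('[%-10s]', 'Name');     // "[Name      ]"
$leftNumber = sprintf('[%-10.2f]', 39.99);    // "[39.99     ]"

echo $rightText, PHP_EOL, $rightNumber, PHP_EOL;
echo $leftText, PHP_EOL, $leftNumber, PHP_EOL;
```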
Let's now consider the sprintf() function.
From the perspective of the format string, the
sprintf() function works exactly like
printf(). But from the perspective of functionality, it's different...a little bit different.
The 's' in
sprintf stands for 'string'. Consequently,
sprintf means to print inside a string. This is how
sprintf() differs from
printf() — the latter prints to standard output while the former just produces a string with the desired format.
sprintf($format, $arg_1, $arg_2, ..., $arg_n)
printf() returns the length of the printed string while
sprintf() returns the string itself.
sprintf() does NOT print anything by itself! It only returns a string.
Let's consider an example.
In the code below, we obtain a formatted string using
sprintf() and then echo it out:
<?php $id = '48xp-89aa-93ks-00s4'; $name = 'Wooden table'; $price = 39.99; $formatted_str = sprintf('(%s) %s ----- $%.2f', $id, $name, $price); echo $formatted_str;
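The difference in return values can be sketched as follows. The format string and the arguments here are arbitrary samples; the point is only what each function hands back.

```php
<?php
// printf() prints the text and returns the length of what it printed.
$length = printf("%s-%d\n", 'item', 7);   // prints "item-7" and a newline

// sprintf() prints nothing and returns the formatted text.
$string = sprintf('%s-%d', 'item', 7);

var_dump($length); // int(7), i.e. six characters plus the newline
var_dump($string); // string(6) "item-7"
```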
In addition to printf() and sprintf(), fprintf() is yet another function that allows us to go from a given set of data to some formatted text.
fprintf() dumps the produced text inside a given file. That's apparent from the 'f' in the function's name, which stands for 'file'.
Because fprintf() deals with files, we'll cover it later on in this course once we explore files in PHP in detail.
Each of the three functions printf(), sprintf(), and fprintf() has an analogous function that operates on data in the form of a single array instead of multiple arguments.
These functions are vprintf(), vsprintf(), and vfprintf().
The 'v' in each of these functions' names stands for 'vector'. If you have experience with C++, a vector is basically just an array. Likewise,
vprintf() means that it's
printf() that works with a vector (i.e. an array).
Consider the following example where we demonstrate how
vprintf() differs from printf():
<?php $id = '48xp-89aa-93ks-00s4'; $name = 'Wooden table'; $price = 39.99; vprintf('(%s) %s ----- $%.2f', [$id, $name, $price]);
With printf(), we have to pass each concrete piece of data to the function individually, as a separate argument. However, with
vprintf(), we can put all of the data inside an array and then just pass the array to the function.
This might be really handy if all that we have to work with is an array holding the data. We don't have to go very far to match this with a real-world analogy.
For example, when retrieving a record from a database, we might get back an indexed array holding each field of the record in the order in which the underlying query was made. Using vsprintf() and the returned array, we can directly format the record into a string, without having to manually access each field and pass it over to sprintf().
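Here's a minimal sketch of that idea. The $record array is a hypothetical stand-in for a row fetched from a database; no actual database call is made.

```php
<?php
// Hypothetical record, standing in for an indexed row returned by a query.
$record = ['48xp-89aa-93ks-00s4', 'Wooden table', 39.99];

// vsprintf() takes the whole array as its second argument.
$line = vsprintf('(%s) %s ----- $%.2f', $record);

echo $line, PHP_EOL; // (48xp-89aa-93ks-00s4) Wooden table ----- $39.99
```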
Text to data
Besides going from data to text, PHP also provides us with certain functions to go from text to data. These functions parse the text for a given format and then extract data out of it.
To name them, we have sscanf() and fscanf().
The sscanf() function is meant to parse a given string based on a given format and then extract data out of it.
As with sprintf(), the 's' in sscanf() stands for 'string', but here it doesn't mean that the function returns a string like sprintf() does. Instead, it means that sscanf() takes in the text in the form of a string argument.
Here's the signature of sscanf():
sscanf($string, $format[, $arg_1, $arg_2, ..., $arg_n])
The first argument is the string to parse for data. The second argument is the format string.
From this point onwards, we can either pass on variables as (reference) arguments to the function in order to get them populated with the respective data, or let the function collect all the data in the form of an array and then return it.
If no argument is provided after
$format, we get an array in return holding the extracted data. Otherwise, the extracted data is dumped into the provided variables in the same order that it was expressed in the format string.
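The two ways of calling sscanf() can be sketched side by side. The input string '12 apples' and the variable names are our own picks for illustration.

```php
<?php
// Without extra arguments: the extracted data comes back as an array.
$values = sscanf('12 apples', '%d %s');
var_dump($values); // an array holding int(12) and string "apples"

// With extra arguments: the data is dumped into the given variables,
// and the return value is the number of values assigned.
$assigned = sscanf('12 apples', '%d %s', $quantity, $item);
var_dump($assigned, $quantity, $item);
```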
Format strings for input text work differently!
One important thing to note when working with
sscanf() (and even
fscanf()) is that the format string passed to it has a different parsing ruleset applied to it.
For instance, the format string
'%.2f' is invalid if we pass it to
sscanf(). This is evident in the following illustration:
<?php $str = '50.99'; /* Can't use %.2f in a format string describing input data! */ sscanf($str, '%.2f', $value);
For those coming from a C background, this restriction should feel familiar: the conversion specifications accepted by sscanf() in C don't take a precision either.
Let's consider a quick example to understand this.
Suppose we have a string with the phrase 'Price: $' followed by a float (for instance, 'Price: $15.99'). Our job is to extract out the exact price from this string.
Well, thanks to
sscanf(), this job is pretty easy to accomplish, as demonstrated below:
<?php $str = 'Price: $15.99'; sscanf($str, 'Price: $%f', $price); var_dump($price);
As you can see, a variable
$price is passed as an argument to
sscanf() here so that the function could dump the extracted float into it. The
var_dump() call clearly shows us that, indeed, the variable holds a float.
If we were to change either the main string (i.e.
$str) or the format string and keep the other as before, the extraction would fail.
This can be seen as follows:
<?php $str = 'Price: $15.99'; sscanf($str, 'Price: %f', $price); var_dump($price);
We've removed the
$ sign preceding the
%f specifier in the format string, while
$str is exactly the same as before. Now, if we run this code,
$price doesn't turn out to be what we expect it to:
And rightly so. The %f specifier only represents a floating-point number (optionally, with spaces preceding it); it doesn't account for the leading $ sign. Hence, the format string here isn't compliant with the main string $str, and therefore the extraction fails.
Surely, though, if we remove the
$ from the main string as well, our example would resume its expected behavior:
<?php $str = 'Price: 15.99'; sscanf($str, 'Price: %f', $price); var_dump($price);
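One practical way to guard against such mismatches is to check what sscanf() hands back before trusting the variable. The sketch below, with hypothetical variable names $wrong and $right, contrasts a format string that doesn't match the input with one that does.

```php
<?php
$str = 'Price: $15.99';

// The format string lacks the $ sign, so %f has nothing valid to read.
sscanf($str, 'Price: %f', $wrong);

// The format string mirrors the input exactly.
$matchedCount = sscanf($str, 'Price: $%f', $right);

var_dump($wrong);        // NULL, since nothing was extracted
var_dump($right);        // float(15.99)
var_dump($matchedCount); // int(1), i.e. one value was assigned
```

Checking that the returned count equals the number of specifiers is a simple way to confirm that the whole format string was matched.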
Time for another example.
In the following code, we have a string consisting of a word, followed by a space, followed by an integer, representing the word's count in a large piece of text.
<?php $str = 'awesome 37';
Once again, with the help of
sscanf(), it's really simple to extract both these pieces of data from the string — just use an
%s followed by a %d, with a space in between:
<?php $str = 'awesome 37'; sscanf($str, '%s %d', $word, $count); var_dump($word); var_dump($count);
And that's awesome!
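The same idea scales to many lines of text. In the sketch below, the $lines array is hypothetical sample data; each line is parsed with sscanf() in its array-returning form, and the results are collected into an associative array mapping each word to its count.

```php
<?php
// Hypothetical sample data: one "word count" pair per line.
$lines = ['awesome 37', 'great 12', 'nice 5'];

$counts = [];
foreach ($lines as $line) {
    // No variables are passed, so sscanf() returns [word, count].
    [$word, $count] = sscanf($line, '%s %d');
    $counts[$word] = $count;
}

var_dump($counts);
```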
The fscanf() function allows us to extract data from a file based on a given format. Akin to
fprintf(), the 'f' in the function's name stands for 'file'.
As before, because
fscanf() also deals with files, we'll cover it later on in this course once we explore files in PHP in detail.