HTML Metadata

Chapter 10 42 mins

Learning outcomes:

  1. What is metadata
  2. A recap of the <title> element
  3. The <meta> element
  4. The <link> element

Introduction

HTML documents consist of two distinct kinds of data: data that describes the document, and data that is the actual content of the document. The former goes into the <head> element while the latter goes into the <body> element.

In this chapter, we'll go over some of the most common HTML elements that go inside <head> and are used to describe the underlying document's content, sometimes referred to as metadata elements.

What is metadata?

Let's start by defining what exactly metadata is.

Metadata is data that describes data.

For an HTML document, an instance of metadata could be the language of the document.

Clearly, the language isn't the actual content of the document; it instead describes that content. And that's essentially why we term it as metadata.

The idea of metadata exists in HTML but it's not unique to HTML; it exists in other technologies as well. For example, HTTP requests and responses carry header data that is essentially metadata describing the actual transmitted data.

In HTML, all metadata is given inside the <head> element.

The three main elements used to specify the metadata for an HTML document are as follows: <title>, <meta>, and <link>.

There are other metadata elements as well, such as <style> and <script> but we shall explore them in later chapters.
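As a rough sketch of how these three elements sit together inside <head>, consider the following skeleton. The title text, the description, and the stylesheet name style.css are placeholders assumed purely for illustration:

```html
<!DOCTYPE html>
<html>
<head>
   <!-- <meta>: general-purpose metadata, here the character encoding and a description -->
   <meta charset="utf-8">
   <meta name="description" content="A placeholder description of the page.">
   <!-- <title>: the document's title -->
   <title>A placeholder title</title>
   <!-- <link>: relates an external resource (here, a stylesheet) to the document -->
   <link rel="stylesheet" href="style.css">
</head>
<body>
   <p>The actual content of the document goes here.</p>
</body>
</html>
```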

Let's start by reviewing <title> before moving over to consider the other two elements.

The <title> element

As we learnt back in the HTML Basics chapter, the <title> element is used to denote a document's title, which shows up in the browser's tab panel and in search engines.

Shown below are two illustrations to help visualize this.

First, here's where <title> shows up in the browser's tab panel:

The title of a webpage, shown on the tab panel.

And here's where it shows up in search engine results pages (in this case, Google):

Searching for 'codeguage' on Google.

As this illustration demonstrates, <title> holds immense value in SEO (Search Engine Optimization), which is the practice of improving the factors that lead to better rankings of sites in search engines.

That's because the contents of <title> are used by search engines to index for search keywords, and then serve the underlying pages when similar keywords are entered by searchers.

In short, make sure to have <title> on every single HTML page.

Coming back to the discussion, <title> is a container element (with a starting and ending tag), containing the information that forms the underlying document's title.

Here's an example using <title>:

<!DOCTYPE html>
<html>
<head>
   <title>Titles are important</title>
</head>
<body>
   <h1>Titles are important</h1>
   <p>Always remember to add a title to every single one of your HTML documents.</p>
</body>
</html>

Live Example

Open up the linked page and notice the document's title as displayed in the browser's tab panel.

<title> doesn't entertain HTML elements!

It's important to note that even though we can type HTML tags inside <title>, there's no sense in doing so: the title shows up in the browser's tab panel, which displays plain text only, and the browser doesn't render those tags as elements anyway.

So, for example, if we do the following, thinking that the word 'HTML' would be emphasized in the tab panel or in search engines, that's absolutely not going to be the case:

<!DOCTYPE html>
<html>
<head>
   <title>No <em>HTML</em> elements in titles!</title>
</head>
<body>
   <h1>No <em>HTML</em> elements in titles!</h1>
</body>
</html>

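The contents of <title> are always treated as plain text, so any tags put inside it are never rendered as elements; browsers show them literally as part of the title text.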
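Because the title is just text, characters that have a special meaning in HTML, such as &, are best written as character references, which <title> does decode. Here's a small sketch, with a made-up title:

```html
<!DOCTYPE html>
<html>
<head>
   <meta charset="utf-8">
   <!-- &amp; is the character reference for '&'; the title reads: Tips & tricks for HTML -->
   <title>Tips &amp; tricks for HTML</title>
</head>
<body>
   <h1>Tips &amp; tricks for HTML</h1>
</body>
</html>
```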

The <meta> element

The <meta> element is one of the cornerstones of specifying the metadata of an HTML document.

There are essentially three variations of <meta> elements, distinguished by their attributes:

  • With the charset attribute.
  • With the name attribute, along with content.
  • With the http-equiv attribute, along with content.

In the second variation, where we have the name attribute on a <meta> element, we can have different elements representing different metadata aspects of a document.
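To make the three variations concrete, here's a minimal sketch showing one element of each kind side by side. The description text and the 30-second refresh interval are arbitrary values chosen only for illustration:

```html
<head>
   <!-- Variation 1: charset -->
   <meta charset="utf-8">
   <!-- Variation 2: name + content (here, a description) -->
   <meta name="description" content="A hypothetical description of the page.">
   <!-- Variation 3: http-equiv + content (here, a refresh after 30 seconds) -->
   <meta http-equiv="refresh" content="30">
   <title>Three kinds of meta elements</title>
</head>
```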

In the following discussion, we go over some of the most popular <meta> kinds.

Document's character set

Text handling in computing is intricate in practice, even though the underlying idea is simple. One of its most important aspects is the character set, along with the closely related idea of a character encoding.

Throughout the history of computers, many character sets and encodings have evolved.

Today, for the World Wide Web, the most popular character set is Unicode, which has room for over a million code points and assigns well over a hundred thousand characters and symbols from languages all over the world.

For a quick beginner's guide to Unicode and what exactly it's used for, refer to Technical Quick Start Guide - Unicode.

Unicode itself has several possible character encodings. UTF-8 uses a minimum of 8 bits (and up to 32 bits) to encode each character, UTF-16 uses either 16 or 32 bits, and UTF-32 always uses 32 bits.

As you can probably guess, UTF-8 is usually the most compact of these three for web content, since the ASCII characters that make up HTML markup and much of its text take just one byte each. That's one of the main reasons it's the de facto standard encoding for the web, in particular for HTML5.

In HTML, to specify the character encoding of the HTML document, we use the charset attribute of the <meta> element.

Three things worth noting here:

  1. If such a <meta> element is omitted and the encoding isn't declared in any other way (for example, in the HTTP Content-Type header), the browser has to guess the character encoding, and its guess isn't guaranteed to be UTF-8.
  2. According to the HTML Standard, charset must only have the value "utf-8" (in any letter case).
  3. Once again, according to the standard, if a charset <meta> is given, it must appear entirely within the first 1024 bytes of the document. Ideally, try to reserve the first spot inside <head> for <meta charset="charset">.

Let's consider an example to help clarify these points.

In the following code, we explicitly set the character set in the HTML code:

<!DOCTYPE html>
<html>
<head>
   <meta charset="utf-8">
   <title>Setting the character set</title>
</head>
<body>
   <h1>Setting the character set</h1>
   <p>Working with the <code>charset</code> attribute of <code>meta</code>.</p>
</body>
</html>

Notice how the <meta> element appears first inside the <head> element. The exact position isn't strictly required, so long as the element falls early enough in the document, but putting it first is a really good habit.

While it's possible to omit a charset <meta> element, it's considered a good practice to have it in an HTML document in order to be explicit about the character encoding of the underlying document, instead of leaving the browser to work it out on its own.

When could we really need <meta charset="charset">?

Honestly speaking, these days, there is little, if any, need for another character encoding for HTML documents.

UTF-8 pretty nicely and efficiently (in terms of bytes) covers a jaw-dropping span of languages and symbols from diverse fields of study, and almost all modern-day HTML documents would be perfect with it.

In that way, we might only ever need to set charset to something other than UTF-8 if, let's say, we're working on a legacy system or with some special HTML document based on a different character set, like ISO-8859-1.

Document's description

Using <meta>, we can specify a concise description of an HTML document.

This is done by setting the name attribute of the element to the value description and then further setting the content attribute to the description text.

Sometimes, if search engines can't find a reasonable description of a page to show in their SERPs (Search Engine Results Pages), they might fall back to using <meta name="description">.

Now should you include <meta name="description"> in your HTML pages?

Well, if it's easy enough for you, then definitely go for it. But if the opposite holds, it's perfectly alright to skip it (search engines these days are quite intelligent and can automatically deduce a reasonable description text out of the document).

Let's see some basic examples.

In the following code, we demonstrate how to set a description <meta>:

<head>
   <meta name="description" content="This is the description text">
</head>

The value of name is "description" and the value of content is the description text.

As a more practical usage, consider the following code, where we display the description <meta> of this very chapter:

<meta name="description" content="Learn what exactly is metadata in HTML, exploring the <title> element, the different kinds of <meta> elements, and <link> elements.">

See how the description quickly summarizes the content of the current chapter.

There's really not much to term as right or wrong in description <meta>s. But if we were to be picky, there are two things worth noting:

  • Try to keep descriptions within roughly 160 characters because search engines typically truncate page descriptions in their SERPs at around this many characters. You don't want trimmed off descriptions, do you?
  • Try to come up with captivating descriptions. In case search engines do use your description, you ideally want to make the most of that opportunity. Write a compelling description in order to grab the user's attention so that they click on your page's link.
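As an illustration of both points, here's a sketch of a description for an imaginary page about houseplants. The page, its title, and the wording are all made up; the description is kept well under 160 characters:

```html
<head>
   <meta charset="utf-8">
   <title>Beginner's Guide to Houseplants</title>
   <!-- Hypothetical description: short, specific, and written to invite a click -->
   <meta name="description" content="Learn how to water, repot, and care for ten easy houseplants, with a simple weekly checklist for beginners.">
</head>
```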

Viewport

When the time comes to design responsive websites for mobile devices, you'll benefit from your knowledge of the viewport <meta> element.

The viewport <meta> element specifies certain aspects related to the document's viewport. Before we can understand what it offers us, we ought to understand the concept of the viewport.

A viewport is essentially the entire region where a webpage gets rendered.

For browsers, the viewport excludes the browser's own interface, such as toolbars and menu bars, if any. It's simply the area where the webpage gets displayed.

Different devices obviously have different viewport dimensions. Things become really interesting when we talk about mobile devices specifically, where we get a unique but undesirable feature called the virtual viewport.

Let's say we're viewing a webpage on a mobile device that's 600px wide (we're not concerned with the height). By default, what mobile browsers do to get the entire content to fit the screen is to set a particularly large virtual viewport and then scale it down to the dimensions of the screen.

Without the virtual viewport, we would have a pretty small viewport to render the contents of a webpage, assuming that the webpage hasn't been optimized for mobile devices. There sure is a need for a virtual viewport, relative to which all the content of the webpage gets laid out, and which itself gets scaled down for the sake of fitting into the screen.

The problem is this virtual viewport.

If a webpage has been designed for mobile devices as well, using responsive design principles, then it's a complete waste of all that designing effort to use virtual viewports on mobile devices. The ideal behavior is to set the viewport's dimensions to the screen from the very beginning.

And that's what the viewport <meta> is there for.

Using the viewport <meta>, which is simply <meta name="viewport">, we can configure the viewport on mobile devices.

The value of this <meta>, which obviously goes into the content attribute (since we're using the name attribute), can contain multiple directives, separated by commas (,).

These directives are listed below:

  • width — specifies the width of the viewport. Possible values are numbers ranging from 1 to 10000 or the special value device-width, to extend the viewport to the device's width.
  • height — specifies the height of the viewport. Possible values are numbers ranging from 1 to 10000 or the special value device-height, to extend the viewport to the device's height.
  • initial-scale — specifies the initial scale factor (i.e. zoom level) of the viewport. Possible values range from 0.1 (zoomed-out) to 10 (zoomed-in). The default is 1, which is the normal scale.
  • minimum-scale — specifies the minimum scale factor possible for the viewport. Possible values range from 0.1 to 10. The default is 0.1.
  • maximum-scale — specifies the maximum scale factor possible for the viewport. Possible values range from 0.1 to 10. The default is 10.
  • user-scalable — specifies whether the pinch-zoom gesture works or not in order to scale up or down the viewport. Possible values are 0 (or no) and 1 (or yes).
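To see how several of these directives fit into a single content value, here's a minimal sketch. The particular numbers are just one reasonable choice for illustration, not a recommendation from this chapter:

```html
<head>
   <meta charset="utf-8">
   <title>Viewport directives</title>
   <!-- width=device-width : make the viewport as wide as the device's screen -->
   <!-- initial-scale=1    : open the page at the normal zoom level -->
   <!-- minimum-scale=1    : don't allow zooming out below the normal scale -->
   <!-- maximum-scale=5    : allow zooming in up to five times -->
   <meta name="viewport" content="width=device-width, initial-scale=1, minimum-scale=1, maximum-scale=5">
</head>
```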

So should we explore some examples now?

In the following HTML page, we have a paragraph containing a pretty long text so as to help us immediately visualize the width of the viewport.

<!DOCTYPE html>
<html>
<head>
   <title>Experimenting with the viewport</title>
</head>
<body>
   <h1>Experimenting with the viewport</h1>
   <p>This is a really really really long paragraph so that we can easily visualize how large the viewport, the region where the webpage gets rendered on the browser, is. The longer the viewport, the longer would this paragraph's lines be. The viewport is a pretty simple concept.</p>
</body>
</html>

Live Example

The wider the viewport, the longer each line of the paragraph.

Now, let's slightly modify the HTML by introducing a viewport <meta>:

<!DOCTYPE html>
<html>
<head>
   <title>Experimenting with the viewport</title>
   <meta name="viewport" content="width=device-width">
</head>
<body>
   <h1>Experimenting with the viewport</h1>
   <p>This is a really really really long paragraph so that we can easily visualize how large the viewport, the region where the webpage gets rendered on the browser, is. The longer the viewport, the longer would this paragraph's lines be. The viewport is a pretty simple concept.</p>
</body>
</html>

Live Example

Let's see the difference it brings.

As you can see, this time the paragraph remains within the bounds of the screen simply because that's how wide our viewport is.

While we've laid out an example, let's also try the other two directives, i.e. initial-scale and user-scalable.

In the following code, we use a double scale factor for the viewport:

<!DOCTYPE html>
<html>
<head>
   <title>Experimenting with the viewport</title>
   <meta name="viewport" content="width=device-width, initial-scale=2">
</head>
<body>
   <h1>Experimenting with the viewport</h1>
   <p>This is a really really really long paragraph so that we can easily visualize how large the viewport, the region where the webpage gets rendered on the browser, is. The longer the viewport, the longer would this paragraph's lines be. The viewport is a pretty simple concept.</p>
</body>
</html>

Live Example

When we open up the page above, we see that the viewport has been scaled up by a factor of 2. Its width is still the same as the device's width; it's just that we're in a zoomed-in state now.

In the following code, we disable pinch-zoom by setting user-scalable to 0:

<!DOCTYPE html>
<html>
<head>
   <title>Experimenting with the viewport</title>
   <meta name="viewport" content="width=device-width, user-scalable=0">
</head>
<body>
   <h1>Experimenting with the viewport</h1>
   <p>This is a really really really long paragraph so that we can easily visualize how large the viewport, the region where the webpage gets rendered on the browser, is. The longer the viewport, the longer would this paragraph's lines be. The viewport is a pretty simple concept.</p>
</body>
</html>

Live Example

If you're on a mobile touch device, try pinch-zooming the webpage linked above — it would ignore the gesture since the user isn't allowed to scale the webpage.

user-scalable largely only applies to mobile and tablet devices. That is, even if user-scalable=0 is set, pinch-zoom will still work on a touch-enabled desktop device.

Let's consider another example, this time demonstrating minimum-scale and maximum-scale:

<!DOCTYPE html>
<html>
<head>
   <title>Experimenting with the viewport</title>
   <meta name="viewport" content="width=device-width, minimum-scale=0.5, maximum-scale=2">
</head>
<body>
   <h1>Experimenting with the viewport</h1>
   <p>This is a really really really long paragraph so that we can easily visualize how large the viewport, the region where the webpage gets rendered on the browser, is. The longer the viewport, the longer would this paragraph's lines be. The viewport is a pretty simple concept.</p>
</body>
</html>

Live Example

In the webpage linked, try zooming in — you'll notice the zoom being stopped at a scale factor of 2, by virtue of being configured by maximum-scale.

A fact worth pointing out in this example is that minimum-scale doesn't have any effect, i.e. we can't zoom out less than the scale factor 1. That's simply because the initial scale with which the webpage opens up is 1.

If, let's say, we change the initial-scale to 0.5, things will change.

Don't disallow zooming for accessibility!

There is one exceptionally crucial thing to note regarding the user-scalable and maximum-scale settings.

That is, setting user-scalable=0 and/or a maximum-scale less than 2.0 is NOT recommended as per the guidelines of WAI (Web Accessibility Initiative).

According to WAI, users should be allowed to zoom in text on small, mobile devices. This especially holds for users with low vision who need to zoom in the text to clearly visualize it.
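In line with this advice, a viewport configuration that leaves zooming alone could look as follows. This is a sketch of one possible setup; the commented-out line shows the kind of configuration the guideline advises against:

```html
<head>
   <meta charset="utf-8">
   <title>Zoom-friendly viewport</title>
   <!-- Discouraged: blocks pinch-zoom and caps the scale below 2 -->
   <!-- <meta name="viewport" content="width=device-width, user-scalable=0, maximum-scale=1"> -->

   <!-- Preferred: no user-scalable or maximum-scale restriction -->
   <meta name="viewport" content="width=device-width, initial-scale=1">
</head>
```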

http-equiv

Besides specifying a document's metadata, the <meta> element can also be used to specify pragma directives.

Now what is a pragma directive?

Well, it's a general concept from programming:

A pragma directive is a name commonly given to any instruction in a language that influences the way its code is evaluated.

In the case of HTML, a pragma directive (in an HTML document) is essentially a command to the browser, influencing how the underlying document is evaluated.

It's important to keep in mind that a pragma directive does NOT specify a document's metadata; it's purely an instruction meant for the browser executing the HTML code.

Coming back to the <meta> element, when it has an http-equiv attribute on it, it represents a pragma directive.

The http-equiv attribute on a <meta> element makes the element a pragma directive.

The value of the http-equiv attribute can be any of the following:

  • "content-language"
  • "content-type"
  • "default-style"
  • "refresh"
  • "x-ua-compatible"
  • "set-cookie"
  • "content-security-policy"

The actual value of the pragma directive is laid out in the content attribute.

Perhaps the most common usage of http-equiv is with the value "refresh".

http-equiv="refresh" is used to automatically refresh the current document to itself, or optionally to another resource, after a certain amount of time elapses following the load of the underlying document. It's the HTML counterpart of the HTTP Refresh header.

When the refresh directive is used to reload to the same document, it has the following syntax:

<meta http-equiv="refresh" content="timeInSeconds">

timeInSeconds is an integer denoting the time in seconds to wait before performing the refresh. For example, content="5" means to wait for 5 seconds.

But when it is used to reload to another document, the following syntax applies:

<meta http-equiv="refresh" content="timeInSeconds; url=resourceURL">

url= is the additional expression here that comes after the time, following a semicolon (;); resourceURL is the actual URL of the resource to refresh to.

Let's consider some examples.

In the following HTML code, we have a <meta> element with http-equiv="refresh" set, with a value of "5" which means to refresh to the same document after 5 seconds:

<!DOCTYPE html>
<html>
<head>
   <title>Demonstrating http-equiv</title>
   <meta http-equiv="refresh" content="5">
</head>
<body>
   <h1>Demonstrating http-equiv</h1>
   <p>This page will refresh in 5 seconds.</p>
</body>
</html>

Live Example

This means that after 5 seconds, the document will refresh automatically.

Once 5 seconds elapse and the document refreshes, the document is evaluated all over again by the browser, just as in a normal refresh. This means that following this first refresh, once the document is completely evaluated and displayed, a new refresh timer begins. In other words, the document will keep on refreshing at 5-second intervals.

As stated earlier, refresh can also be used to redirect the document to another document. This is illustrated below:

<!DOCTYPE html>
<html>
<head>
   <title>Demonstrating http-equiv</title>
   <meta http-equiv="refresh" content="4; url=/">
</head>
<body>
   <h1>Demonstrating http-equiv</h1>
   <p>This page will redirect to our homepage in 4 seconds.</p>
</body>
</html>

Live Example

We have the content attribute specifying a time delay of 4 seconds and the URL of our homepage. After 4 seconds, the browser will reload the window of the document to our homepage.

This latter usage of http-equiv="refresh" is mostly done on temporary webpages.

For example, when a user logs into an account, they might be shown a temporary page reading something like "Redirecting to dashboard in 3 seconds."

One reason for sending back an HTML document containing a refresh pragma directive here could be that some external third-party script needs to know of the request and process it separately on its end. With an HTML document containing a refresh directive, this is possible because the script loads on that page and gets some time to do its job before the redirect.

Besides this, the exact application of http-equiv="refresh" obviously depends upon the underlying web app, but at least we now know one way to redirect the browser to the same page or another page using some good old HTML.
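One common precaution with redirects of this kind is to include an ordinary link to the destination, in case the automatic redirect doesn't happen. Below is a minimal sketch of such a temporary page; the /dashboard URL and the wording are hypothetical:

```html
<!DOCTYPE html>
<html>
<head>
   <meta charset="utf-8">
   <title>Redirecting...</title>
   <!-- Wait 3 seconds, then load /dashboard (a hypothetical URL) -->
   <meta http-equiv="refresh" content="3; url=/dashboard">
</head>
<body>
   <p>Redirecting to dashboard in 3 seconds.</p>
   <!-- Manual fallback in case the automatic redirect is blocked or disabled -->
   <p><a href="/dashboard">Go to the dashboard now</a></p>
</body>
</html>
```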

Moving on, if you're wondering about the nomenclature 'http-equiv', read the following snippet, for it contains the answer.

What is meant by the name http-equiv?

HTTP is the protocol that the World Wide Web uses to transfer hypertext and other hypermedia. It stands for HyperText Transfer Protocol. HTTP uses messages to send information back and forth between a client and a server.

These messages are comprised of a header section, containing some metadata for the message, and a body section, containing the actual data of the message.

The header section itself consists of a list of headers, known as HTTP headers, each representing a different aspect of the underlying message, e.g. the type of data it's carrying, the length of its data, the last modified date of the data, and so on.

Now, the term 'http-equiv' stands for 'HTTP equivalent'.

In this respect, http-equiv specifies the equivalent directive of an HTTP header inside an HTML file. This means that, more or less, every http-equiv pragma directive in HTML has a corresponding HTTP header.

For instance, one possible value of http-equiv is "content-type" and so we have a Content-Type HTTP header.
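To make this correspondence concrete, here's a sketch that pairs an HTTP header with its http-equiv counterpart. The header line appears inside an HTML comment only for comparison:

```html
<head>
   <!-- HTTP header sent by a server:  Content-Type: text/html; charset=utf-8 -->
   <!-- The equivalent pragma directive inside the document: -->
   <meta http-equiv="content-type" content="text/html; charset=utf-8">
   <title>http-equiv and HTTP headers</title>
</head>
```

Note that this directive serves the same purpose as <meta charset="utf-8">, so a document would use one or the other, not both.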

The <link> element

Next in line, for specifying the metadata of an HTML document, we have <link>.

The <link> element is used to link another resource with the current HTML document.

There can be different kinds of links in this regard. Some popular ones include linking to a stylesheet, or to an icon, or maybe even specifying an alternate representation of the document.

The kind of relation that a given <link> has with the current document is indicated by the rel attribute (which stands for 'relation') of the element.

The most frequently used values for rel are:

  • "stylesheet" — links to a CSS stylesheet.
  • "icon" — links to an icon image.
  • "alternate" — refers to an alternate representation of the current document.
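As a quick preview, here's a sketch with one <link> of each kind. The file names style.css, favicon.png, and greeting.txt are placeholders assumed to sit in the same directory as the HTML page:

```html
<head>
   <meta charset="utf-8">
   <title>Kinds of links</title>
   <!-- A CSS stylesheet applied to this document -->
   <link rel="stylesheet" href="style.css">
   <!-- An icon representing this document -->
   <link rel="icon" href="favicon.png">
   <!-- An alternate representation of this document (here, plain text) -->
   <link rel="alternate" type="text/plain" href="greeting.txt">
</head>
```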

Let's very quickly go over each of these values.

Stylesheets

One of the most common applications of <link> is to link to external stylesheets.

A stylesheet is simply a CSS file that defines the styles of the underlying HTML document where it's linked.

All but the simplest of websites use CSS stylesheets for styling their HTML content.

If we wish to link in a stylesheet on an HTML document, we have to use the <link> element with the rel attribute set to the value "stylesheet" and the href attribute set to the URL of the stylesheet.

Pretty simple.

Let's say we have the following stylesheet, named style.css:

style.css
body {
   background-color: yellow;
}

What's shown here is some CSS code that basically styles the <body> element by giving it a yellow background color.

And now suppose that in the same directory as the style.css file, we have an HTML file called style.html containing the following:

style.html
<!DOCTYPE html>
<html>
<head>
   <meta charset="utf-8">
   <title>Working with CSS</title>
</head>
<body>
   <h1>Working with CSS</h1>
   <p>CSS is absolutely amazing!</p>
</body>
</html>

Then we can link in the style.css stylesheet very easily using <link>, setting its href to a relative URL. This is demonstrated as follows:

style.html
<!DOCTYPE html>
<html>
<head>
   <meta charset="utf-8">
   <title>Working with CSS</title>
   <link rel="stylesheet" href="style.css">
</head>
<body>
   <h1>Working with CSS</h1>
   <p>CSS is absolutely amazing!</p>
</body>
</html>

Live Example

If we open up this page, we see that its background color is yellow. The credit goes to the <link> element and the stylesheet that it links in.

Simply amazing!

We'll discover stylesheets in detail in the HTML Metadata — Stylesheets chapter.

Icon

Have you ever noticed the small icon that appears on the tab panel on browsers for given websites? Or maybe the icon that appears alongside bookmarked webpages?

Shown below is an example for our own website:

Favicon on a browser's tab panel

This is called a favicon.

A favicon is an icon, a small one typically, that appears on tab panels, bookmark entries, and other suchlike places for a given webpage.

To set the favicon for a website, we use the <link> element, with the rel attribute set to the value "icon" and the href attribute set to the URL of the icon.

Let's consider a simple example.

For the purposes of this example, we'll use the following icon, saved as a PNG image with the name favicon.png (you can also download it by right-clicking on it and then saving it):

An icon meant to be used as a favicon

Here's a simplistic HTML setup, where we'll eventually set up a favicon:

<!DOCTYPE html>
<html>
<head>
   <meta charset="utf-8">
   <title>Working with favicon</title>
</head>
<body>
   <h1>Working with favicon</h1>
   <p>Adding a favicon is pretty simple!</p>
</body>
</html>

Live Example

Now, let's add the favicon to it:

<!DOCTYPE html>
<html>
<head>
   <meta charset="utf-8">
   <title>Working with favicon</title>
   <link rel="icon" href="favicon.png">
</head>
<body>
   <h1>Working with favicon</h1>
   <p>Adding a favicon is pretty simple!</p>
</body>
</html>

Live Example

Open up the HTML page and look at the icon that shows up on the tab panel; it'll be the favicon.png icon shown above.

We'll learn a lot more about <link rel="icon"> in the HTML Metadata — Icons chapter.

Alternate representation

Suppose we have a webpage at the URL www.example.com. Further suppose that there is a mobile variant of this same page, accessible at m.example.com.

In this case, we can specify m.example.com as an alternate representation in the HTML page residing at www.example.com, and vice versa.

For such cases, we require the <link> element with the rel attribute set to the value "alternate" and its href attribute pointing to the alternate resource.
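For the scenario just described, the markup on the page at www.example.com might look roughly like this. The https:// scheme and the trailing slash are assumptions made for the sake of a complete URL:

```html
<head>
   <meta charset="utf-8">
   <title>Example</title>
   <!-- Points to the mobile variant of this same page -->
   <link rel="alternate" href="https://m.example.com/">
</head>
```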

Using the type attribute, whose value ought to be a MIME type, we can further classify the type of the alternate resource.

For example, if the alternate resource is a plain text file, then type would be "text/plain". Similarly, if the alternate resource is a PDF file, type would be "application/pdf".

There is a large collection of MIME types maintained by the IANA. The collection can be seen in the following document: Media Types - IANA.

Time for an example.

In the following code, we specify an alternate version of our document, greeting.txt, which is a plain text file:

<!DOCTYPE html>
<html>
<head>
   <meta charset="utf-8">
   <title>Hello World!</title>
   <link rel="alternate" type="text/plain" href="greeting.txt">
</head>
<body>
   <h1>Hello World!</h1>
   <p>Just learning HTML.</p>
</body>
</html>

Note that here we're supposing that greeting.txt (shown below) exists in the same directory as this HTML page, hence the href set to "greeting.txt".

Here's how the greeting.txt file looks:

greeting.txt
Hello World!

Just learning HTML.

Pretty easy, isn't it?
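The same pattern carries over to other MIME types. As a sketch, here's how the PDF case mentioned earlier might be declared alongside the plain text one, assuming a hypothetical greeting.pdf sitting in the same directory:

```html
<head>
   <meta charset="utf-8">
   <title>Hello World!</title>
   <!-- Plain text alternate, as before -->
   <link rel="alternate" type="text/plain" href="greeting.txt">
   <!-- PDF alternate; greeting.pdf is a hypothetical file -->
   <link rel="alternate" type="application/pdf" href="greeting.pdf">
</head>
```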

"I created Codeguage to save you from falling into the same learning conundrums that I fell into."

— Bilal Adnan, Founder of Codeguage