HTML Advanced Text

Chapter 6 30 mins

Learning outcomes:

  1. Preformatting (<pre>)
  2. Defining terms (<dfn>)
  3. Abbreviations (<abbr>)
  4. Highlighting text (<mark>)
  5. Striking through text (<s>)
  6. The <ins> and <del> elements
  7. Quotations (<q> and <blockquote>)
  8. Citations (<cite>)

Preformatting via <pre>

We'll start with one of the most commonly-used elements in this chapter — <pre>.

The <pre> element represents a block of preformatted text in which all whitespace is preserved; 'pre' is short for 'preformatted'.

But what exactly does it mean to preserve all the whitespace?

Well, by default, browsers collapse every sequence of whitespace characters (spaces, tabs and newlines) into a single space character when rendering HTML.

Consider the following code:

<p>This is    spaced-out     text.</p>

We have some text inside a <p> element that contains long runs of spaces. Surprisingly though, the output doesn't resemble the source code.

This is spaced-out text.

Both sequences of spaces get reduced to a single space.

To override this behavior, we can leverage the <pre> element in HTML.

Let's replace the <p> element above with a <pre> element:

<pre>This is    spaced-out     text.</pre>

Here's the output:

This is    spaced-out     text.

Evidently, now the rendered text resembles its layout in the source code, thanks to <pre>.
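
Since newlines and tabs count as whitespace too, <pre> keeps line breaks and indentation as well. Here's a minimal sketch of that; the address text is made up purely for illustration:

```html
<pre>Jane Doe
   221B Example Street
   Springfield</pre>
```

Each line of the source should show up on its own line in the output, with the indentation of the second and third lines intact.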

However, this comes with an issue, as you might've already realized. That is, now the text is styled using a monospaced font; previously, the text inside <p> wasn't based on a monospaced font.

<pre> renders text with a monospaced font because it is often used to annotate blocks of code in HTML, as we shall discover in HTML Code, and blocks of code are typically presented using a monospaced font.

Anyway, the only way to alleviate this issue is to use CSS, as demonstrated in the following snippet.

Overriding the default, monospaced font of <pre>

To override the default font configured for <pre> elements, we'll use CSS.

The world of CSS is pretty huge and complex but not difficult by any means. Let's just quickly go through the CSS approach to make <pre> forget about its default monospaced font styling.

We'll use the font-family property and set it to inherit to signal that whatever font family is configured on the parent of the <pre>, it should use that. This will, in effect, make the font used by <pre> the exact same as the font used by <p>.

Here's the code:

<pre style="font-family: inherit">This is    spaced-out     text.</pre>

And here's its output:

This is    spaced-out     text.

Despite the fact that we did indeed solve the font issue presented by using <pre> in place of <p>, thanks to some quick CSS, the actual example above wasn't meant to demonstrate replacing <p> with <pre>.

Instead, the code above was only meant to show that <pre> preserves all the whitespace that exists in the content that it holds. That's it.

In practice, if for some reason we want to preserve all the whitespace inside a <p> element, we'd be better off just styling the <p> itself using CSS, instead of replacing it with a <pre> and then customizing its font.
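
One way to do this kind of styling is the CSS white-space property. The following is a minimal sketch, assuming we want the spaces kept while still letting long lines wrap (the pre-wrap value); it isn't the only possible choice:

```html
<!-- white-space: pre-wrap preserves spaces and newlines but still wraps long lines -->
<p style="white-space: pre-wrap">This is    spaced-out     text.</p>
```

Because the element is still a <p>, it keeps its normal font and there's nothing to undo.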

Defining terms via <dfn>

Suppose we're building a glossary page of some programmatic terms. Each term has a definition associated with it. Let's take the term 'data type' as an example.

Now, one possible way to lay out the definition of 'data type' is as follows:

<p>A data type is a set of values along with the operations possible on those values.</p>

A data type is a set of values along with the operations possible on those values.

This sure works but it isn't that good.

When HTML provides an element to precisely indicate that we are defining a term, we should use that element when doing so. That element is called <dfn>.

The <dfn> element stands for 'definition' and serves to represent the term being defined.

It's crucially important to note that <dfn> does NOT contain the definition itself; it only marks up the term being defined. The definition itself is part of the surrounding element.

This is an important point to remember. Many people mistake <dfn> for an element that contains definitions, when it's only meant to contain the term being defined, and to sit inside the text that defines it.

Let's see an example to help clarify what this means.

In the following code, we fix the not-so-good HTML code above using <dfn>:

<p>A <dfn>data type</dfn> is a set of values along with the operations possible on those values.</p>

See how the term being defined, i.e. 'data type', is placed inside the <dfn> element which is itself a part of the entire definition of the term, inside the <p>.

Let's see how this renders:

A data type is a set of values along with the operations possible on those values.

Evidently, even <dfn> elements are styled by browsers, usually making the text italic.

As per the HTML standard, the text surrounding <dfn> must define the term marked up by <dfn>. That is, we shouldn't do the following:

<p><dfn>Data type</dfn>:</p>
<p>A set of values along with the operations possible on those values.</p>

Here, the text surrounding <dfn> (inside the first <p>) doesn't define the term that it contains. The definition is in another, separate element.

We should rather do the following:

<p><dfn>Data type</dfn>: A set of values along with the operations possible on those values.</p>

Data type: A set of values along with the operations possible on those values.

Or if we really want a new line after the term, then we can use <br> as follows:

<p><dfn>Data type</dfn>:<br>A set of values along with the operations possible on those values.</p>

Data type:
A set of values along with the operations possible on those values.

It's always important to keep this subtlety in mind — <dfn> contains the term being defined, NOT the definition.

It's invalid to nest <dfn> inside a <dfn> element. Sensibly, this is a good design decision as it doesn't make any sense to nest <dfn>s within each other.
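
For the glossary page mentioned at the start of this section, a description list is another possible layout. As far as we understand the HTML standard, the definition may also live in the description list group that contains the <dfn>, so the following sketch should be acceptable; the second entry reuses the definition of HTML given later in this chapter:

```html
<dl>
   <!-- Each <dt>/<dd> group holds a term and its definition -->
   <dt><dfn>Data type</dfn></dt>
   <dd>A set of values along with the operations possible on those values.</dd>

   <dt><dfn>HTML</dfn></dt>
   <dd>A markup language used to define the content, meaning and structure of webpages.</dd>
</dl>
```

Again, each <dfn> contains only the term, while the definition sits in the accompanying <dd>.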

Abbreviations via <abbr>

What does HTML stand for? Yes, you're right, it stands for Hypertext Markup Language. Here, 'HTML' is called an abbreviation.

In HTML, there is a dedicated element to mark up abbreviations. It's the <abbr> element.

The <abbr> element contains the abbreviation of a given term.

Ideally, the full-form should be displayed as text, adjacent to the abbreviation, outside of the <abbr> element. But if, for some reason, it's not feasible or required to have the full-form displayed visually along with the abbreviation, then we have the provision of the title attribute.

The title attribute of an <abbr> element defines its full-form.

The content inside title is only shown as a tooltip when the mouse pointer is hovered over the <abbr> element; it's not rendered on the screen otherwise.

Time for some examples.

Firstly, in the following code, we use the abbreviation 'HTML' along with its full-form right next to it in the text (this is desirable):

<p>One of the core technologies of the web is <abbr>HTML</abbr> (Hypertext Markup Language).</p>

One of the core technologies of the web is HTML (Hypertext Markup Language).

And below we have the abbreviation without its full-form actually displayed (but obviously still mentioned via the title attribute):

<p>One of the core technologies of the web is <abbr title="Hypertext Markup Language">HTML</abbr>.</p>

One of the core technologies of the web is HTML.

When the <abbr> element contains a title attribute, browsers customize its style, typically by applying a dotted underline. This indicates that the abbreviation's full-form can be seen by hovering over it.

Frequently, the need for <abbr> arises when we want to define an abbreviated term in HTML. Ideally, this should employ both the <dfn> and the <abbr> elements.

A concrete example follows:

<p><dfn><abbr>HTML</abbr> (Hypertext Markup Language)</dfn> is a markup language used to define the content, meaning and structure of webpages.</p>

HTML (Hypertext Markup Language) is a markup language used to define the content, meaning and structure of webpages.

Simple, ain't it?

A question one might have at this stage is whether we should use <abbr> for every single abbreviation. The answer lies in the snippet that follows.

Should <abbr> be used always?

Let's say you are writing an article discussing the history of HTML (an exceptionally long topic to cover in an article).

Certainly, it might not be that feasible to use <abbr>, whether that be with or without title, for every single occurrence of the word 'HTML'.

It's more feasible and practical to have just one <abbr> at the start of the article (or wherever the abbreviation occurs first).

Moreover, as stated before, if it's OK to have the full-form next to the abbreviation, you should do that, or else just put it inside the title attribute of the <abbr> element.
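
As a rough sketch of this advice, here's how the opening of such an article might look. The heading and sentences are made up for illustration:

```html
<article>
   <h1>A brief history of HTML</h1>

   <!-- First occurrence: marked up with <abbr>, with the full form next to it -->
   <p><abbr>HTML</abbr> (Hypertext Markup Language) was created by Tim Berners-Lee.</p>

   <!-- Later occurrences: left as plain text -->
   <p>Over the years, HTML has gone through many revisions.</p>
</article>
```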

Highlighting text via <mark>

Suppose we have a webpage, as part of an online dictionary website, defining a word along with some example sentences demonstrating the usage of the word.

Now, in the example sentences, we wish to highlight every occurrence of the word so that readers can spot it at a glance.

The question stands: which element should we use here?

Well, what we seek here is the <mark> element.

As per the name, <mark> is used to mark/highlight text in HTML. Browsers typically apply a yellow background color to the text highlighted via <mark>.

In the following, we demonstrate a basic example using <mark>; think of it as an excerpt from the dictionary website we were talking about above:

<h1>ecstatic</h1>

<p>When she received the news of her promotion, she was <mark>ecstatic</mark> and couldn't stop smiling.</p>

<p>The crowd was <mark>ecstatic</mark>, erupting into cheers and applause as their team scored the winning goal in the final seconds of the game.</p>

ecstatic

When she received the news of her promotion, she was ecstatic and couldn't stop smiling.

The crowd was ecstatic, erupting into cheers and applause as their team scored the winning goal in the final seconds of the game.

Good example, what do you say?

Striking through text via <s>

To explicitly show that some piece of text is no longer relevant on a webpage, a typical convention is to strike-through the text.

This is illustrated as follows:

The strike-through text effect.

For this, we have the <s> element.

The <s> element is used to strike through the text that it marks up. As you can guess, the 's' here stands for 'strike-through'.

A pretty common and familiar scenario where <s> could be used is when showing the discount price of an item in an online store.

The price after applying the discount is important to the customer and so that's shown as it is. On the other hand, the original price is not that important — it's not the price that the customer will actually pay — and so it's shown with the strike-through effect via <s>.

In the following code, we demonstrate a similar example:

<h1>Loaf pan</h1>
<p>$6.99 (<s>$10.99</s>)</p>

Loaf pan

$6.99 ($10.99)

The prices of an item, a loaf pan, are shown before and after a flat sale; the new price is $6.99 while the old price was $10.99, marked up using <s>.

This looks good, doesn't it?

Obviously, in a real online store, the prices would be styled in a much better way along with the underlying item being sold; but still the idea of striking-through the original price using <s> would remain the same.

The <ins> and <del> elements

HTML provides two elements to be leveraged in documents and web apps with editing capabilities, whereby certain pieces of text can be inserted and deleted.

These elements are <ins> and <del>, respectively.

The <ins> element is used to denote any insertion of text into the web application while <del> is used to denote any deletion.

By default, browsers style <del> the same as <s> while <ins> gets underlined.

And because <del> is styled the same as <s>, i.e. with a strike-through effect, we might be tempted to think that they are interchangeable but that's strictly NOT the case.

The difference between <s> and <del>

<s> should only be used whenever the underlying text wasn't or couldn't be literally deleted (in the context of the underlying web app). <del> should be used whenever we want to show text that has been or can be literally deleted.

An easier way to think about this distinction is that if we have a web app where text could be inserted and/or deleted, we need <ins> and <del> to address these two concerns.

Whenever <ins> makes sense and deletion is also a functionality, make sure to use <del> instead of <s>.
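
To put the two side by side, here's a small sketch. The first paragraph reuses the price example from earlier, while the sentence in the second paragraph is made up for illustration:

```html
<!-- <s>: the old price is no longer relevant, but nothing was edited out of the document -->
<p>Loaf pan: $6.99 (<s>$10.99</s>)</p>

<!-- <del> and <ins>: the text itself was edited, one word removed and another added -->
<p>The meeting is on <del>Monday</del> <ins>Tuesday</ins>.</p>
```

Both <s> and <del> look the same by default, but only the second paragraph records an actual edit.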

Let's consider a basic example demonstrating <ins> and <del>.

In the following code, we have a simplistic to-do list set up, with deleted items represented via <del> and all other items represented via <ins>:

<ul>
   <li><ins>Do laundry</ins></li>
   <li><ins>Read Python book</ins></li>
   <li><del>Learn some JavaScript</del></li>
   <li><ins>Code some JavaScript</ins></li>
   <li><del>Teach some math</del></li>
</ul>

  • Do laundry
  • Read Python book
  • Learn some JavaScript
  • Code some JavaScript
  • Teach some math

Representing other items using <ins> might seem weird — after all, it's meant for insertions — but it's completely OK in this case as well.

Perhaps the most important point to note regarding both <ins> and <del> is the datetime attribute which represents the date and time at which the insertion/deletion was done.

The datetime attribute can be really handy if we want to keep track of the times of insertions and deletions in a standardized way, right in the HTML.

In the code below, we use datetime to store the time at which given items are inserted or deleted (the times are all dummy times, given in UTC; when a datetime value includes a time, it must also include a time zone offset, such as Z):

<ul>
   <li><ins datetime="2023-10-03T11:25:00Z">Do laundry</ins></li>
   <li><ins datetime="2023-10-03T11:26:16Z">Read Python book</ins></li>
   <li><del datetime="2023-10-03T12:09:10Z">Learn some JavaScript</del></li>
   <li><ins datetime="2023-10-04T19:00:05Z">Code some JavaScript</ins></li>
   <li><del datetime="2023-10-05T03:26:09Z">Teach some math</del></li>
</ul>

So, for example, the second <ins> element along with its datetime means that the to-do task 'Read Python book' was added on 3rd October, 2023 at 11:26:16 UTC.

  • Do laundry
  • Read Python book
  • Learn some JavaScript
  • Code some JavaScript
  • Teach some math

The output produced is still the same as before, just that now the list items are tracked for their time of insertion/deletion as well.

Moving on, another useful aspect of <ins> and <del> is that they could be nested in one another. That is, we could have <del> inside <ins> and <ins> inside <del>.

But when could we possibly need that? Well, the to-do list above is a perfect example.

Let's say we add a new item to an empty to-do list. Then our HTML would look as follows:

<ul>
   <li><ins>Do laundry</ins></li>
</ul>

Now, let's say we delete this very item. In this case, we can place the <del> element around the <ins> element. Thereafter, our HTML would look as follows:

<ul>
   <li><del><ins>Do laundry</ins></del></li>
</ul>

  • Do laundry

What this code says is simply that first 'Do laundry' was inserted and then later on it was deleted (start from the innermost element and walk your way outwards).

We could use CSS to remove the underline from <ins> elements that are a part of <del>.
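
A minimal sketch of such a rule is shown below, assuming the browser's default styles (an underline for <ins> and a strike-through for <del>):

```html
<style>
   /* Drop the underline of any <ins> nested inside a <del>;
      the strike-through drawn by the enclosing <del> still applies. */
   del ins {
      text-decoration: none;
   }
</style>

<ul>
   <li><del><ins>Do laundry</ins></del></li>
</ul>
```

With this in place, the deleted item should appear struck through but not underlined.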

Obviously not all to-do lists would work this way, but at least the example does demonstrate one case where <ins> and <del> could be nested in one another.

In short, <ins> and <del> might not be amongst the most commonly used elements in HTML, but they're really powerful.

If you're developing a web application of some kind with editing features in it, whereby certain things could be inserted and then later on deleted, and those deleted things ought to be shown as well, then do consider <ins> and <del>.

Quotations via <q> and <blockquote>

If we wish to quote some text from another resource, we should use either the <q> or <blockquote> element, depending on the way we need to do the quotation.

That is, if the quotation is short and meant to be combined with other text, we should use <q>, which is meant to hold simple, inline quotations.

Otherwise, if the quotation text needs to be a separate block, we should use the <blockquote> element, which can even contain a <p> element (though that's completely optional).

Following is an example demonstrating <q>:

<p>She said, <q>HTML is fun to learn</q>, during the web development class.</p>

She said, “HTML is fun to learn”, during the web development class.

The quotation is to be inlined with other text in the paragraph, hence we denote it using <q>.

Following is an example demonstrating <blockquote>:

<p>Albert Einstein said:</p>
<blockquote>
    Two things are infinite: the universe and human stupidity; and I'm not sure about the universe.
</blockquote>

Albert Einstein said:

Two things are infinite: the universe and human stupidity; and I'm not sure about the universe.

This time, we wanted to put the quotation in a separate block, so we chose <blockquote>.

Equivalently, this example could be rewritten, with the quotation text further wrapped inside a <p> element:

<p>Albert Einstein said:</p>
<blockquote>
    <p>Two things are infinite: the universe and human stupidity; and I'm not sure about the universe.</p>
</blockquote>

Albert Einstein said:

Two things are infinite: the universe and human stupidity; and I'm not sure about the universe.

A common practice whenever quoting text from another resource is to refer to the resource. For this, we have the cite attribute, available on <q> as well as on <blockquote>.

The cite attribute is meant to contain the link to the resource from where text is quoted.

cite doesn't turn the quoted text into a hyperlink; it's only meant for surplus citation information.

In the following example, we quote some text regarding the <blockquote> element from the HTML standard, and add its URL to the cite attribute:

<p>As per the definition of <code>blockquote</code> on the HTML Standard:</p>

<blockquote cite="https://html.spec.whatwg.org/multipage/grouping-content.html#the-blockquote-element">
    Content inside a <code>blockquote</code> must be quoted from another source, whose address, if it has one, may be cited in the <code>cite</code> attribute.
</blockquote>

As per the definition of blockquote on the HTML Standard:

Content inside a blockquote must be quoted from another source, whose address, if it has one, may be cited in the cite attribute.
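
The cite attribute works the same way on <q>. As a quick sketch, here's an inline quotation that borrows a few words from the same passage of the HTML standard and points to the same URL:

```html
<p>The HTML Standard says that content inside a <code>blockquote</code> <q cite="https://html.spec.whatwg.org/multipage/grouping-content.html#the-blockquote-element">must be quoted from another source</q>.</p>
```

Just as before, the URL inside cite isn't shown to the reader; only the quoted words are rendered, wrapped in quotation marks by the browser.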

Citations via <cite>

When writing documents, citing sources of information and references is a pretty common activity. If you don't know about it, citing is simply to refer to the source of a given piece of content.

For example, if we take some content from the current chapter that you're reading right now, i.e. HTML Advanced Text, referring to the exact name of the chapter in the document where we use its content is an instance of citing the chapter.

In academic writing, there are many popular citation conventions. Some of the most common are APA, MLA and Chicago. If you're into any kind of academic writing and want to take those skills to the web, you should consider learning about these conventions and the <cite> element.

The <cite> element is used to define citations in HTML.

Again, if you're into writing content on the web by referring to a multitude of sources, you'll be using <cite> quite a lot.

It's important to note that <cite> is meant to contain the titles/names of given pieces of information, NOT the names of the people/authors who wrote those pieces.

By default, browsers tend to italicize <cite> content.

Let's consider an example.

In the following code, we refer to a portion of lyrics from a famous song while citing the name of the song (since that's the source):

<p>I love the part <q>Started out on a one way train</q> in the song <cite>Steal the show</cite> from the movie, <cite>Elemental</cite>.</p>

I love the part “Started out on a one way train” in the song Steal the show from the movie, Elemental.

Notice that since the song is part of a movie, it makes perfect sense to cite the movie as well.

In a similar sense, in the following code, we talk about some content from a book, again citing the book (since we are referring to information from it):

<p>In his book, <cite>The Content Code</cite>, Mark Schaefer, a renowned marketer, refers to the increasing rate of content production paired with the limited scale of content consumption as the <em>'Content Shock'</em>.</p>

In his book, The Content Code, Mark Schaefer, a renowned marketer, refers to the increasing rate of content production paired with the limited scale of content consumption as the 'Content Shock'.

Moving on, although a <cite> doesn't have to be accompanied by a link, it typically is, especially when the cited source is available digitally.

Following is an example from the world of computing, something which you'll soon want to learn in your frontend career — the REST architecture:

<p>In his famous dissertation, <a href="https://ics.uci.edu/~fielding/pubs/dissertation/top.htm"><cite>Architectural Styles and the Design of Network-based Software Architectures</cite></a>, Roy Fielding discusses the REST architectural style.</p>

In his famous dissertation, Architectural Styles and the Design of Network-based Software Architectures, Roy Fielding discusses the REST architectural style.

We refer to a concept from a technical dissertation that is available online, so we make the citation a link.

Linking to a cited source that is available online isn't required, per se, but it's really good to do so if you have a link with you.

Keep in mind that it's recommended to use <cite> where we're really making some kind of a citation. That is, it's a little bit awkward to use <cite> in every other place where we're merely just referring to a given entity.

For example, let's say we're writing an article in which we talk about a website for the first time in some paragraph. Since it's the first time of referring to the website, it's a good idea to use <cite> here. But then on each instance of referring to the same website in the article, we might not use <cite> since the citation has already been made.
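
Here's a rough sketch of that idea. The website name 'Example Docs' and both sentences are hypothetical, made up only for illustration:

```html
<!-- First mention of the website: marked up with <cite> -->
<p>According to <cite>Example Docs</cite>, semantic markup makes pages easier to maintain.</p>

<!-- Later mention in the same article: plain text -->
<p>Example Docs also recommends validating your HTML regularly.</p>
```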