HTML: Media — Audio

HTML Audio

Learning outcomes:

  • The evolution of audio in HTML
  • The <audio> element
  • Adding controls using controls
  • Popular audio formats
  • Adding multiple audio files using <source>
  • Looping audio using loop
  • Providing fallback content

HTML5 — a game changer

If we go back in time to the era before HTML5, embedding audio on a webpage wasn't a trivial task.

At the very core, audio wasn't an intrinsic part of HTML. Developers had to resort to external plugins to add audio to a web page. These plugins themselves had quite inconsistent behavior across browsers.

Fast forward to HTML5 and the entire game changed.

Thanks to the advent of the <audio> element, embedding audio on a webpage became as simple as, if not simpler than, embedding an image.

And most importantly, audio became a fundamental part of HTML itself, not behind any plugins, allowing for extremely intuitive customization using some basic scripting.

In this chapter, we shall explore how to embed audio on a webpage; go through some popular audio formats; see the audio player interfaces offered by various browsers; learn how to show fallback content in case <audio> isn't supported; and much more.

Audio is our first step into exploring the rich media ecosystem of HTML; video is the other, and we'll explore it in detail in the next chapter.

The <audio> element

Just like we add an image to a document using <img> (or <picture>, for multiple formats of images), to add a piece of audio, we use the <audio> element.

The <audio> element represents an audio file.

Akin to <picture>, the <audio> element is a container element. In fact, there is a lot of similarity between <picture> and <audio>.

Not surprisingly, the most basic thing for <audio> is the location of the audio file to embed. This goes in the src attribute.

If there are multiple variants of an audio file with differing formats, we can use child <source> elements in the <audio> element. We'll see examples of this later in this chapter.

For example, suppose we have a piano melody stored in an MP3 file, named piano-melody.mp3. (This melody has been taken from Freesound.)

We can easily embed it into the HTML document using the simple code below, supposing that the file is in the same directory as the HTML document:

HTML
<audio src="piano-melody.mp3"></audio>

And that's basically it.

However, perhaps surprisingly, this doesn't render anything on the document:

Live Example

So how are we supposed to play the audio then?

Well, that's precisely where the controls attribute comes in.

The controls attribute

By default, when we include an <audio> element in an HTML document, it does NOT have any user interface to control it.

The browser does load the linked audio file and is otherwise capable of playing it, e.g. via JavaScript. However, the user has no way to play, pause, or otherwise interact with the audio.
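For instance, here's a minimal, purely illustrative sketch of playing such a controls-less <audio> from a script (the melody id, the play-btn button, and the script are our own additions, not part of the examples in this chapter):

HTML
<audio id="melody" src="piano-melody.mp3"></audio>
<button id="play-btn">Play melody</button>

<script>
   // Grab the <audio> element and the button.
   var audio = document.getElementById('melody');
   var button = document.getElementById('play-btn');

   // On a click, start playback programmatically via the play() method.
   button.addEventListener('click', function() {
      audio.play();
   });
</script>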

In this respect, the controls Boolean attribute instructs the browser to include UI controls for the <audio> so that the user can interact with it.

Let's add controls to our previous example and see what we get:

HTML
<audio src="piano-melody.mp3" controls></audio>

As you can see now, there is a small, nice audio user interface rendered by the browser in response to the application of controls to the <audio> element.

[Image: Chrome's default <audio> player]

Live Example

And this interface, just like any other user interface, differs from browser to browser. Shown below, for instance, is the default <audio> control for Firefox:

[Image: Firefox's default <audio> player]

Due to the discrepancies in the default visual interface of the <audio> element across browsers, it's common for developers to build their own interfaces and then power them using JavaScript.
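As a rough idea, a custom interface can be as simple as a button that toggles playback. Below is a minimal, illustrative sketch (not a production-ready player), assuming the same piano-melody.mp3 file:

HTML
<audio id="melody" src="piano-melody.mp3"></audio>
<button id="toggle">Play</button>

<script>
   var audio = document.getElementById('melody');
   var toggle = document.getElementById('toggle');

   // Toggle between playing and pausing, updating the button's label accordingly.
   toggle.addEventListener('click', function() {
      if (audio.paused) {
         audio.play();
         toggle.textContent = 'Pause';
      } else {
         audio.pause();
         toggle.textContent = 'Play';
      }
   });
</script>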

Popular audio formats

Over the years, audio technology has seen a great amount of innovation in terms of the formats in which audio data is stored.

Because audio files, by their very nature, contain a lot more data than, say, images, they tend to be compressed so that they can be stored as compactly as possible without much loss of quality.

In the realm of audio (and video), the mechanisms that encode to and decode from these storage formats are commonly known as audio codecs (short for compressor-decompressor).

Compression is the process of going from the actual data of an audio file (or any file in general) to a compact, compressed representation of it. The compressor removes some bytes from the file in such a way that most, or all, of them can be mathematically recovered during decompression.

Decompression, then, is simply the reverse: going from the compressed data back to the actual data.

A detailed discussion of how audio codecs work is out of the scope of this chapter. But it is an interesting read, containing a good amount of mathematics and computer science fundamentals.

Some of the most popular audio formats are stated below:

Name | MIME type  | Details
MP3  | audio/mpeg | A very popular compressed audio format known for its balance of sound quality and file size. Widely used for music and streaming.
AAC  | audio/aac  | A compressed audio format, designed to be superior to MP3 in terms of compression and sound quality.
Ogg  | audio/ogg  | An open-source container format that supports multiple codecs (Vorbis for audio).
WAV  | audio/wav  | A raw, uncompressed audio format that provides high-quality output but results in large file sizes. Often used in professional audio editing.

MP3, or MPEG-1 Audio Layer III, is perhaps a format you've already heard of. MP3 was first introduced in 1993 and has long been a popular format for audio. Its main motivation was to reduce the sizes of audio files by removing imperceptible parts from them.

AAC, or Advanced Audio Coding, is a relatively more modern audio format compared to MP3. In fact, it was designed to be superior to MP3 in terms of delivering better sound quality. AAC is a standard format for platforms like YouTube, Apple Music, etc.

Ogg is another popular media format. It is essentially a container format that supports both audio and video (as we shall find out in the next chapter). The most common audio codec used with Ogg is Vorbis.

Ogg is not an acronym, hence it isn't stated in upper case.

Another popular format to store audio is WAV, or Waveform Audio Format. It's an uncompressed format that preserves the original quality of the audio and is thus popular amongst music artists and sound professionals.

The <source> element

With so many competing formats to offer audio data, you might wonder how to set up an <audio> element in HTML with audio files in multiple formats, where the browser chooses the first one that it supports.

Does this remind you of something from the HTML Images unit?

How do we denote an image in HTML that points to multiple files in differing formats, letting the browser select the first one (in the given order) that it supports?

You're right! We use the <picture> element for this.

But what goes inside <picture>? Well, the answer is <source>.

The <source> element inside <audio> exhibits semantics identical to the ones it exhibits in <picture>. In particular, it represents a different format of the same audio data.

For example, let's say we have two versions of our piano melody: one is an MP3 file and one is the more modern ALAC type. On browsers that support ALAC, we want it to be used instead of the MP3 file.

To accomplish this, we'd write something as follows:

HTML
<audio controls>
   <source src="piano-melody.alac" type="audio/alac">
   <source src="piano-melody.mp3" type="audio/mpeg">
</audio>

The <audio> element contains two <source>s, each corresponding to a variation of the underlying audio data — the first one is in ALAC whereas the second one is in MP3.

As with <picture>, the order of <source>s matters inside <audio>.

Notice the similarity of this code with the one where we set up multiple image options using <picture>.

This similarity, however, isn't a coincidence. It's a part of HTML's elegant design to reuse ideas where they can be reused. That is, the same idea that we use to include multiple images in an HTML document and then choose the best one from amongst them is extended to audio (and even to video, as we shall see in the next chapter).

This effectively reduces the effort on our end in learning how to do something in HTML.

Coming back to <source> in the <audio> element, remember that the order of <source> matters.

For example, take a look at the following code:

HTML
<audio controls>
   <!-- MP3 now comes first, and then ALAC -->
   <source src="piano-melody.mp3" type="audio/mpeg">
   <source src="piano-melody.alac" type="audio/alac">
</audio>

Here, even if ALAC is supported, the loaded audio will be of the MP3 type. This is because the MP3 <source> comes first (and because MP3 files are supported by almost all browsers).
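On a related note, if you're curious about which of these formats a browser can likely play, you can query the canPlayType() method of a media element from JavaScript. Below is a minimal sketch; the MIME types shown are just examples:

HTML
<script>
   // Create a detached <audio> element purely for feature-testing.
   var probe = document.createElement('audio');

   // canPlayType() returns '', 'maybe' or 'probably' for a given MIME type.
   console.log(probe.canPlayType('audio/mpeg')); // e.g. 'maybe'
   console.log(probe.canPlayType('audio/ogg'));  // e.g. 'maybe'
   console.log(probe.canPlayType('audio/wav'));  // e.g. 'maybe'
</script>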

The loop attribute

By default, when an audio clip in HTML reaches its end, it simply stops. In other words, the audio doesn't restart playing from the beginning once it ends.

However, we can change this behavior and instead get it to replay every time it reaches the end. Or as they say, we can get the audio to play in a loop.

This is done using the loop attribute.

Open up the following example in another window and play the audio. Notice what happens as the audio ends:

HTML
<audio src="piano-melody.mp3" controls></audio>

Live Example

Now, compare this with the following example, with the application of loop:

HTML
<audio src="piano-melody.mp3" controls loop></audio>

Live Example

Again, notice what happens as the audio reaches its end.

It starts all over again, right? And that's exactly what loop is meant for.

Fallback content

Even though the <audio> element is fully supported on all major browsers these days, there is a common pattern to leverage to provide fallback content in case the element is not supported (for some reason).

Because <audio> obtains its audio data from a source file and not from any literal text inside it, the text content of <audio> is reserved for this special purpose.

In other words,

The text inside of an <audio> element is used when the <audio> element is NOT supported by the browser.

Let's consider an example.

In the code below, we have our familiar piano melody audio embedded in an HTML document:

HTML
<audio src="piano-melody.mp3" controls>
   Your browser doesn't support the audio element. Download the audio at <a href="piano-melody.mp3">piano-melody.mp3</a>.
</audio>

The text within the element allows anyone whose browser doesn't recognize <audio> to download the audio by visiting the given link and then play it on their device using the operating system's configured audio player.

These days, there generally isn't any need to worry about the <audio> element not being supported. But many webpages built in the past apply this pattern, so it's good to know about it in case you ever encounter it.
