Audio and video are such an Important part of the modern Internet.
Podcasts, audio previews, and even how-to videos are everywhere, and
until now, they’ve only been truly usable using browser plug-ins.
HTML5 introduces new methods to embed audio and video files into
a page. In this chapter, we’ll explore a few methods we can use to not
only embed the audio and video content but also to ensure that it is
available to people using older browsers.
We’ll discuss the following two elements in this chapter
<audio> [<audio src=”drums.mp3″x/audio>]
Play audio natively in the browser. [C4, F3.6, IE9, S3.2, OlO.l,
<video> [<video src=”tutorial.m4v”x/video>]
Play video natively in the browser. [C4, F3.6, IE9, S3.2, 010.5,
Before we do that, we need to talk about the history of audio and video
on the Web. After all, to understand where we’re going, we have to
understand where we’ve been.
A Bit of History
People have been trying to use audio and video on web pages for a long
time. It started with people embedding MIDI files on their home pages
and using the embed tag to reference the file, like this:
<embed src=”awesome.mp3″ a u t o s t a r t = ” t r u e ”
1oop=”true” control 1er=”true”></embed>
The embed tag never became a standard, so people started using the
object tag instead, which is an accepted W3C standard. To support
older browsers that don’t understand the object tag, you often see an
embed tag nested within the object tag, like this:
<param name=”src” va”lue=”simpsons.mp3″>
<param name=”autoplay” value=”false”>
<param name=”contro”ner” value=”true”>
<embed src=”awesome.mp3″ a u t o s t a r t = ” f a 7 s e ”
1 oop=”false” control 1 er=”true”></embed>
Not every browser could stream the content this way, though, and not
every server was configured properly to serve it correctly. Things got
even more complicated when video on the Web became more popular.
We went through lots of iterations of audio and video content on
the Web, from RealPlayer to Windows Media to QuickTime. Every company
had a video strategy, and it seemed like every site used a different
method and format for encoding their video on the Web.
Macromedia (now Adobe) realized early on that its Flash Player could
be the perfect vehicle for delivering audio and video content across platforms.
Flash is available on close to 97 percent of web browsers already.
Once content producers discovered they could encode once and play
anywhere, thousands of sites turned to Flash streaming for both audio
Then Apple came along in 2007 with the iPhone and iPod touch and
decided that Apple would not support Flash on those devices. Content
providers responded by making video streams available that would play
right in the Mobile Safari browser. These videos, using the H.264 codec,
were also playable via the normal Flash Player, which allowed content
providers to still encode once while targeting multiple platforms.
The creators of the HTML5 specification believe that the browser should
support audio and video natively rather than relying on a plug-in thatrequires a lot of boilerplate HTML. This is where HTML5 audio and
video start to make more sense: by treating audio and video as firstclass
citizens in terms of web content.
Containers and Codecs
When we talk about video on the Web, we talk in terms of containers
and codecs. You might think of a video you get off your digital camera as
an AVI or an MPEG file, but that’s actually an oversimplification. Containers
are like an envelope that contains audio streams, video streams,
and sometimes additional metadata such as subtitles. These audio and
video streams need to be encoded, and that’s where codecs come in.
Video and audio can be encoded in hundreds of different ways, but
when it comes to HTML5 video, only a few matter.
When you watch a video, your video player has to decode it. Unfortunately,
the player you’re using might not be able to decode the video you
want to watch. Some players use software to decode video, which can
be slower or more CPU intensive. Other players use hardware decoders
and are thus limited to what they can play. Right now, there are three
video formats that you need to know about if you want to start using
the HTML5 video tag in your work today: H.264, Theora, and VP8.
Codec and Supported Browsers
[IE9, S4, C3, IOS]
[F3.5, C4, 010]
[IE9 (if codec installed), F4, C5, 010.7]
H.264 is a high-quality codec that was standardized in 2003 and created
by the MPEG group. To support low-end devices such as mobile
phones, while at the same time handling video for high-definition devices,
the H.264 specification is split into various profiles. These profiles
share a set of common features, but higher-end profiles offer additional
options that improve quality. For example, the iPhone and Flash Player
can both play videos encoded with H.264, but the iPhone only supports
the lower-quality “baseline” profile, while Flash Player supports
higher-quality streams. It’s possible to encode a video one time and
embed multiple profiles so that it looks nice on various platforms.
H.264 is a de facto standard because of support from Microsoft and
Apple, which are licensees. On top of that, Google’s YouTube converted
its videos to the H.264 codec so it could play on the iPhone, and Adobe’s
Flash Player supports it as well. However, it’s not an open technology. It
is patented, and its use is subject to licensing terms. Content producers
must pay a royalty to encode videos using H.264, but these royalties do
not apply to content that is made freely available to end users.2
Proponents of free software are concerned that eventually, the rights
holders may begin demanding high royalties from content producers.
That concern has led to the creation and promotion of alternative
Theora is a royalty-free codec developed by the Xiph.Org Foundation.
Although content producers can create videos of similar quality with
Theora, device manufacturers have been slow to adopt it. Firefox,Chrome, and Opera can play videos encoded with Theora on any platform
without additional software, but Internet Explorer, Safari, and
the iOS devices will not. Apple and Microsoft are wary of “submarine
patents,” a term used to describe patents in which the patent application
purposely delays the publication and issuance of the patent in
order to lay low while others implement the technology. When the time
is right, the patent applicant “emerges” and begins demanding royalties
on an unsuspecting market.
Google’s VP8 is a completely open, royalty-free codec with quality similar
to H.264. It is supported by Mozilla, Google Chrome, and Opera,
and Microsoft’s Internet Explorer 9 promises to support VP8 as long as
the user has installed a codec already. It’s also supported in Adobe’s
Flash Player, making it an interesting alternative. It is not supported in
Safari or the iOS devices, which means that although this codec is free
to use, content producers wanting to deliver video content to iPhones
or iPads still need to use the H.264 codec.
As if competing standards for video weren’t complicating matters
enough, we also have to be concerned with competing standards for
Codec and Supported Browsers
[S4, C3, IOS]
[IE9, S4, C3, IOS]
[F3, C4, 010]
Advanced Audio Coding (AAC)
This is the audio format that Apple uses in its iTunes Store. It is designed
to have better audio quality than MP3s for around the same file
size, and it also offers multiple audio profiles similar to H.264. Also,
like H.264, it’s not a free codec and does have associated licensing fees.
All Apple products play AAC files. So does Adobe’s Flash Player and the
open source VLC player.
This open source royalty-free format is supported by Firefox, Opera,
and Chrome. You’ll find it used with the Theora and VP8 video codecs
as well. Vorbis files have very good audio quality but are not widely
supported by hardware music players.
The MP3 format, although extremely common and popular, isn’t supported
in Firefox and Opera because it’s also patent-encumbered. It is
supported in Safari and Google Chrome.
Video codecs and audio codecs need to be packaged together for distribution
and playback. Let’s talk about video containers.
Containers and Codecs, Working Together
A container is a metadata file that identifies and interleaves audio or
video files. A container doesn’t actually contain any information about
how the information it contains is encoded. Essentially, a container
“wraps” audio and video streams. Containers can often hold any combination
of encoded media, but we’ll see these combinations when it
comes to working with video on the Web
• The OGG container, with Theora video and Vorbis audio, which
will work in Firefox, Chrome, and Opera.
• The MP4 container, with H.264 video and AAC audio, which will
work in Safari and Chrome. It will also play through Adobe Flash
Player and on iPhones, iPods, and iPads.
• The WebM container, using VP8 video and Vorbis audio, which will
work in Firefox, Chrome, Opera, and Adobe Flash Player.
Given that Google and Mozilla are moving ahead with VP8 and WebM,
we can eliminate Theora from the mix eventually, but from the looks of
things, we’re still looking at encoding our videos twice—once for Apple
users (who have a small desktop share but a large mobile device share
in the United States) and then again for Firefox and Opera users, since
both of those browsers refuse to play H.264.3
That’s a lot to take in, but now that you understand the history and the
limitations, let’s dig in to some actual implementation, starting with