Video: the track element and webM codec


There are a couple of interesting developments in the world of HTML5 multimedia that are worth knowing about. The first is the new <track> element (currently only in the WHATWG spec due to political stuff).

<track> is a child of a <video> or <audio> element that links to a timed track, or time-based data. The kind of data is set via a kind attribute, which can take values of subtitles, captions, descriptions, chapters or metadata, depending on the type of information you’re adding to your media. These point to a source file containing timed text that the browser will expose via some kind of user interface, if the visitor requires additional data.

This will allow for “write once, use everywhere” accessibility; anyone linking to that file with a <video> or <audio> element that includes the <track> element can access your information.

<track kind=captions src=blah.srt>

The file format is a new format called WebSRT, which competes with about 50 other timed formats, including some W3C formats (hence the omission of <track> from the W3C spec).

Added 9 July 2010: Here’s a readable overview of the SRT format.
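To make the idea of a timed text file concrete, here is a minimal sketch of reading SubRip-style cues in JavaScript. It assumes the classic SRT layout (a counter line, a timing line of the form 00:00:01,000 --> 00:00:04,000, then one or more lines of text, with blank lines between cues). WebSRT may well differ in its details, and the function names parseSrt and srtTimeToSeconds are made up for this illustration.

```javascript
// Convert a SubRip timestamp such as "00:01:02,500" to seconds.
function srtTimeToSeconds(stamp) {
  const match = /^(\d{2,}):(\d{2}):(\d{2}),(\d{3})$/.exec(stamp.trim());
  if (!match) {
    throw new Error('Unrecognised timestamp: ' + stamp);
  }
  const [, hours, minutes, seconds, millis] = match;
  return Number(hours) * 3600 + Number(minutes) * 60 +
    Number(seconds) + Number(millis) / 1000;
}

// Parse SubRip-style text into an array of { start, end, text } cues.
// Assumption: every block has a counter line followed by a timing line.
function parseSrt(source) {
  return source
    .replace(/\r\n?/g, '\n')   // normalise line endings
    .trim()
    .split(/\n{2,}/)           // cues are separated by blank lines
    .filter(Boolean)           // ignore an empty file
    .map(function (block) {
      const lines = block.split('\n');
      const [start, end] = lines[1].split(' --> ');
      return {
        start: srtTimeToSeconds(start),
        end: srtTimeToSeconds(end),
        text: lines.slice(2).join('\n')
      };
    });
}
```

This deliberately ignores malformed blocks and any styling or positioning extras; it is only meant to show how little structure the basic format has.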

Given that the format itself isn’t fully specified yet, it will be a while until we see implementation in browsers. But it’s good to know that there will be an official way to add accessibility information to media. Until then, I have a JavaScript hack to take timed <span>s out of an in-page transcript to superimpose over a video.
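The core of that kind of hack is working out which caption applies at the current playback time. The following is only a sketch of that step, not the actual script: it assumes the captions have already been collected into an array of objects with start and end times in seconds plus the text, and the name activeCaption is invented for the example.

```javascript
// Return the text of the caption whose time range contains `time`
// (in seconds), or an empty string if no caption is active.
function activeCaption(cues, time) {
  for (const cue of cues) {
    if (time >= cue.start && time < cue.end) {
      return cue.text;
    }
  }
  return '';
}

// In a browser this could be wired up roughly as follows, where
// `video` and `overlay` are assumed to be existing elements:
// video.addEventListener('timeupdate', function () {
//   overlay.textContent = activeCaption(cues, video.currentTime);
// });
```

Keeping the lookup separate from the DOM code means the same function would still be usable if the cues later came from a <track> file rather than from the page.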

WebM video format

The big news of the last month is that Google open-sourced the VP8 video codec that it acquired when it took over On2 Technologies. When combined with the Vorbis audio codec (that Spotify uses) and wrapped in a subset of the Matroska container format, it’s collectively known as WebM.

All YouTube videos are being transcoded to WebM, and Adobe have also announced they will include it in their Flash player. It’s available in an Opera 10.60 beta, a Firefox testing build, and a Chromium dev channel. Even Microsoft have said that IE9 will support it if the codec is installed on the computer.

The VP8 video codec itself is high-quality (Google had said that Ogg Theora wasn’t good enough compression-to-quality for YouTube, but Theora was based on the VP3 precursor to VP8). It’s available for streaming too.

If you want to encode to WebM, you can try the Miro Video Converter utility. Although it doesn’t allow you to optimise settings, it’s very easy to use. As the codec becomes more widespread, expect to see many more tools for content creation, editing, and transcoding.

Once production versions of the browsers are available, you should encode your videos with WebM as the first option, Ogg for older versions of Opera, Firefox and Chrome, falling back to royalty-encumbered H.264 for Safari and links to downloads or a Flash player for legacy browsers:

<video controls>
<source src=video.webm type='video/webm; codecs="vorbis,vp8"'>
<source src=video.mp4 type='video/mp4; codecs="avc1.42E01E, mp4a.40.2"'>
<source src=video.ogv type='video/ogg; codecs="theora, vorbis"'>
<!-- embed Flash here -->
<p>Your browser does not support video; download the <a href="video.webm">WebM</a>, <a href="video.mp4">mp4</a> or <a href="video.ogv">Ogg</a> video for off-line viewing.</p>
</video>

If, however, you’re having problems with the iPad, put the MP4 version first in the list of <source> elements; apparently there’s a bug that causes the iPad to see only the first <source> element.
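The way a browser walks the list of <source> elements can be approximated in script, which is handy for checking which file a given browser would choose. This is a simplified sketch, not the browser's real selection algorithm: pickSource is a made-up name, and it only looks at the type attribute, taking the first entry for which canPlayType returns a non-empty answer.

```javascript
// Return the src of the first source whose type the browser claims
// it can play, or null if none match.
// `sources` is an array of { src, type } objects in document order.
// `canPlayType` is a function with the same contract as
// HTMLMediaElement.canPlayType: it returns "", "maybe" or "probably".
function pickSource(sources, canPlayType) {
  for (const source of sources) {
    if (canPlayType(source.type) !== '') {
      return source.src;
    }
  }
  return null;
}

// In a browser, assuming `video` is a <video> element:
// pickSource(list, function (type) { return video.canPlayType(type); });
```

Because the first match wins, the order of the entries matters, which is why moving the MP4 to the top changes what a device that only understands H.264 ends up with.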

It’s a long haul, and it’s not over yet, but <track> and WebM show significant progress towards our goal of accessible, open, and royalty-free video playing natively in the browser.

33 Responses on the article “Video: the track element and webM codec”

Kroc Camen says

You need to list the MP4 first in the source list due to an iPad bug that ignores anything but the first source element.

Shelley says

I call BS on the “political stuff”.

Ian wanted to invent his way of doing things, disregarding the work underway from people a whole lot more experienced at audio/video than he was. He was so afraid that a group of people may actually contribute something to the HTML5 spec, that he undercut their effort by unilaterally adding this element in — without discussion with anyone in the HTML WG.

When people asked him to file a proposal for the change, he had a hissy fit and yanked it in the W3C. And the people questioned other aspects, not necessarily the track element.

So where is this political coming from? Not from the W3C. This one was pure WhatWG and Ian Hickson BS.

bruce says

Shelley, I don’t claim that the W3C is the organisation that’s doing the politicking.

Also, it should be stated that while each of us reads the others’ posts before they go live and suggests changes etc, posts authored by any of us are our personal opinion and not those of the other doctors.

Also, for the record, in the members-only vote about which timed track format should be supported, I voted for DFXP rather than SRT.

Shelley says

If we had only one version of HTML5, we wouldn’t have this confusion, we wouldn’t have these discussions about politics, and we would be one group of people having to work through problems in order to push something out the door.

As far as I know, nothing has been decided in the web community about the element, the format, and discussion is still ongoing. This discussion isn’t political: it is good, solid, standards work.

bruce says

“If we had only one version of HTML5, we wouldn’t have this confusion”

doubtless. But it’s not where we are now.

Doug says

I understand that browsers will scan down the list of <source> elements until they come across one they recognize. But what will browsers do if they recognize the codec, but the file itself is missing? Will they fail or will they then try the next source that they recognize?

IOW, if I change all my code for embedded video to include a line for WebM before I’ve finished actually encoding all the videos to WebM, will they still fall back to an existing video file in a different codec or will they just break?

Bruce Lawson says

Doug, in my testing, if you point to a file that doesn’t exist but with a MIME type that the browser can play, you get a blank space for the video (with the controls bar if you’ve specified controls).

I suggest you read the incredibly comprehensive article by Simon Pieters called Everything you need to know about HTML5 video and audio for more.

Brian LePore says

Maybe this might not be the right place for it, but I can’t seem to get anything webM to go through anywhere (what, do I smell or something?), but does anyone know how to appropriately update a magic file in Linux so that I can get the appropriate MIME type for webM files? I have already updated our code base to output webM files in our video module, but I am doing some hacks to get the MIME type right on this.

Also, didn’t Firefox have a bug in it that it too only played the first source element? It seems to work fine for me listing webM first, despite the current version of FF not supporting it, but I swear I’ve seen that listed as a bug that I cannot find at this time.

Does the iPad issue happen in all situations? I had a friend test out our system on the iPad when we rolled out HTML5 video (I only have an iPod touch and my co-worker tested on her iPhone). She said it worked fine for her.

bruce says

Brian

- I’ve had people say that sometimes iPad works fine even if .mp4 is not the first source element, hence my saying “If, however, you’re having problems with the iPad…” rather than always put the mp4 first.

But, I’m not beautiful or rich enough to own iThings or black turtleneck sweaters, so haven’t tested it myself.

tenshi's me2DAY says

Gyeomi-gyeomi’s thoughts…

It looks like a technology called WebSRT is going into video subtitles. Apparently it’s the one the browser vendors like best…

Thomas says

Here is a new project of mine that you may find interesting – VideoSub is a MooTools based library that looks for track tags in HTML5 videos and if the browser does not have track support it will load and display the subtitles: http://www.storiesinflight.com/js_videosub/

The idea is for web developers to be able to implement SRT subtitle support with track tags right now based on the WHATWG spec proposal, with the VideoSub library gracefully stepping back once browsers have native support.

I hope to get some feedback from the community on improvements and possible new features for VideoSub…

Bruce Lawson says

Looks very promising Thomas. Couple of points: your code is

<track role="subtitle" src="jellies.srt"></track>

The spec has the kind attribute to say kind=subtitles http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#attr-track-kind (and note “subtitles” is plural).

The track element is also empty, in that it has no closing tag. Like img, it has either a trailing slash if you like XHTML syntax, or nothing. So you should use

<track kind="subtitles" src="jellies.srt">

That said, this looks pretty cool. I’m glad you’re building in feature detection to future-proof it, so that it will defer to the browser’s native functionality if it’s supported.

Thomas says

Bruce – thanks for clearing up the role/kind difference – not sure where I got that from, but I guess there are different versions of the proposed spec flying around. I’ll update my examples.

As far as the feature detection is concerned – I check for the existence of the proposed addtrack() method of the video element. I’m hoping that browser makers will implement both track and addtrack() at the same time, we will see if that was a wise decision or if I have to go back and fiddle with that. :)

Thanks for the encouragement – I intend to add support for trackgroups and multiple tracks over the next month.

Bruce Lawson says

Thomas, I think trackgroup is gone from the spec too.

Thomas says

trackgroup is gone? Nice, makes the implementation even easier! :-) I’ve updated the example on my site with empty track tags and the kind property. There are definitely examples out there with “role” but that seems to have been changed in the proposed spec – a little bit unfortunate in my opinion, but “kind” will do…

Thanks for the help!

Bill says

Glad subtitles are being brought in early instead of as an afterthought. Won’t there be confusion though, if WebSRT has the same extension as SubRip?

John Foliot says

Bruce,

I guess it is worth mentioning at this time that WebSRT is but one of a few time-stamp formats under consideration by the W3C; others include a profile of DFXP (TTML) – essentially what Flash media players support today – as well as a few other proposals (including a work in progress from Mozillian Silvia Pfeiffer that the video experts at Opera, WebKit, Mozilla *and* Microsoft have all had a peek at) that will be significantly more ‘powerful’ than WebSRT, while at the same time as easy to author, so even suggesting that WebSRT will be a viable format at this time is a bit premature. (It is worth noting that Opera’s video guy – Philip Jägenstedt – is questioning (http://annevankesteren.nl/2010/05/websrt#comment-6967) both the choice and the functionality of WebSRT with his fellow work-mate Anne van Kesteren, so this is very much in a fluid state today.)

To say that it is the format that WHATWG’s benevolent dictator is favoring would be accurate, but he alone does not get to decide and it is very telling that despite being in the WHATWG spec for some time now, no browser has added support for either <track> or WebSRT. WebSRT is an attempt to fix the very flawed SRT format that emerged in the fan-sub community for sub-titles, but lacks many interesting and crucial requirements (for example, should caption files be able to contain hypertext links? We can provide use-cases for that need – how does WebSRT resolve that problem? What of semantic structure? Should transcripts be able to also contain hierarchical data such as heading elements?)

@bill WebSRT and SRT are essentially both the same time-stamp format. SRT has never been standardized or specified; it was originally suggested as an ANSI format in a long ago misplaced email on a mailing list somewhere – there is not even an RFC for it (http://www.rfc-editor.org/rfcsearch.html)

@thomas: there has been no decision one way or the other at the W3C: Ian Hickson has chosen at this time to not include it in the WHATWG version of the draft specification, but the use-case for it is both real and demonstrable, and it may very well end up in the official spec, so I wouldn’t rule it out just yet.

At this writing (July 26th/10), accessibility of media in HTML5 has not been fully spec’d out – both the key engineers (responsible for media at the browsers) and accessibility specialists at the W3C are in productive and fruitful dialog as we work this out together. I will commit to keeping the doctors apprised of the patient’s progress.

John Foliot
Co-chair – W3C HTML5 Accessibility Task Force (Media)
http://www.w3.org/WAI/PF/HTML/wiki/Main_Page

bruce says

Good call, John.

For the record, I voted for DFXP rather than SRT to be the basis of the subtitling format.

Bill says

Any idea why it would use xml:id instead of just “id”?

It looks like that standard is close to timed text in its simplest form.

Web Axe says

I hope “trackgroup” makes it into the spec. It seems very easy, more clear, and useful. A good case is multiple languages for more than one type of track role. For example, you can have one trackgroup for English & Spanish for captions, and another trackgroup for English & Spanish for audio description.

Lexein says

I see you use SRT in your blog post above. Please PLEASE do not start calling WebSRT SRT. Whoever first said “let’s base it on SubRip” was onto something good. Now WebSRT is sufficiently different to deserve its own UNIQUE name, or to affix enforcement on the full name WebSRT.
In spite of the fact that SubRip’s ancient “.SRT” filename extension never passed through any standardization process, its long utility, broad support to this day and broad compatibility, should not be disregarded when naming a new, trans-set or superset format. Please, either always refer to WebSRT in full, or help me campaign for a new, non-conflicting name. It seems like early days for a correction to take root that will please everyone.

Charles Silverman says

DFXP is by far the smarter and more mature format, given that the Timed Text group has been at this for years now.
DFXP represents a group of folks who are incredibly experienced in all things captioning and description related, including the WGBH NCAM folks who have been working on web solutions for accessible media for years before anyone else was even thinking about these things.

I was flummoxed by WebSRT’s appearance. Seems incredibly redundant to say the least.

So due to politics or whatever, people with hearing loss get to wait while this all gets figured out. The format thing should have gotten worked out at the same time html and got worked out.

Oli Studholme says

In addition to Bruce’s link to Timed Track Formats (all 50+ of them) on the WHATWG wiki, here’s a few more wiki links that might interest you (with ballpark descriptions):

SmartWriteups says

someone please tell me how to add H.264 in my video tag??

is it like??

Please help I am confused… :(
