While HTML5 has a bunch of semantic elements, including new ones like <nav>, sometimes there just isn’t an element with the right meaning. What we want are ways to extend what we’ve got, to add extra semantics that are machine-readable: data that a browser, script, or robot can use.
Native ways to extend HTML
There were five fundamental ways we could extend HTML 4:
- the <meta> element
- the class attribute
- the rel attribute
- the rev attribute
- the profile attribute
Of these, rev has fallen by the wayside, becoming obsolete since hardly anyone used it correctly, and because it can be replaced by rel. profile is also obsolete, and there is no support for namespaces in HTML5. However, <meta>, class, and rel are all in HTML5. In fact, <meta> now has spec-defined names and a way to submit new name values, and rel has several new link types defined in the HTML5 specification and a way to submit more too.
Even better, WAI-ARIA’s aria-* attributes are allowed in HTML5, and HTML5 validators can check HTML5+ARIA. Other new methods of extending HTML5 include custom data attributes (data-*), microdata, and RDFa.
Finally, there are microformats. As Dr. Bruce touched on microformats in his article on the <time> element, let’s delve a little deeper into what microformats are and how to use them in HTML5.
Microformats are a collection of vocabularies for extending HTML with additional machine-readable semantics. They are designed for humans first and machines second. This is currently accomplished via agreed-upon class, rel, and profile attributes, coding patterns, and nesting.
Being machine-readable means a robot or script that understands the microformat vocabulary being used can understand and process the marked-up data. Each microformat defines a specific type of data and is usually based on existing data formats, like vCard (address book data; RFC 2426) and iCalendar (calendar data; RFC 2445), or on common coding patterns (‘paving the cowpaths’ of the web).
Despite their humble beginnings, microformats have also been a runaway success, and they're far more widely deployed than other “big S” Semantic web technologies. For example, many services, such as Twitter and Flickr, offer profile information in hCard format by default, so you may already have a microformatted profile even if you’ve never used microformats before.
There are currently 34 microformats specifications (listed below) in varying levels of completion, and you can find out more about them on the microformats wiki.
- Elemental (simple) microformats
  - XFN — specify relationships with people
  - XMDP — add metadata profiles
  - VoteLinks — indicate agreement or disagreement with, or indifference to, the link’s destination
  - rel-nofollow — don’t give ‘weight’ to a link (don’t give ‘Google juice’; mainly for search engines)
  - rel-license — specify license information
  - rel-tag — add tags
- Compound microformats
  - hCard — contact information for people and organisations
  - hCalendar — time-based information, such as events
  - XOXO — outlines
- robots exclusion
Despite being drafts, many of these are in widespread use.
These common coding patterns are just best practices that are frequently used in writing plain old semantic HTML (POSH) to create microformats.
- date design pattern
- datetime design pattern
- include pattern
These specifications and patterns cover many common types of data. They’ve been created by a grass-roots organisation of interested people, and anyone is welcome to contribute or even propose a new microformat.
A lightning introduction to using microformats
For those of you who haven’t used microformats before, I’ll briefly introduce some simple microformats, hopefully so simple you’ll be encouraged to use them right away!
Using rel-license for licensing information
Adding license information is quite a common activity, and while we can add a link to Creative Commons or another license easily enough, someone would have to read it to understand the content’s license:
<small>This article is licensed under a <a href="http://creativecommons.org/licenses/by-nc-sa/2.0/"> Creative Commons Attribution Non-commercial Share-alike (By-<abbr>NC</abbr>-<abbr>SA</abbr>) license</a>. </small>
If this information were machine-readable, search engines could use it to help consumers searching by license. By using the rel-license microformat to add rel="license" to the license link (indicating it’s the license for the page’s main content), we can do just that:
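A minimal sketch of that change, assuming the same link as in the snippet above with nothing altered except the added rel attribute:

```html
<!-- rel="license" marks the link target as the license for the page's main content -->
<small>This article is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/2.0/">Creative Commons Attribution Non-commercial Share-alike (By-<abbr>NC</abbr>-<abbr>SA</abbr>) license</a>.</small>
```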
This may be so easy you don’t even realise you’ve just microformatted the link! In fact, Google already uses this data to allow us to search by license (look in advanced search, in the extra options).
Using XHTML Friends Network (XFN)
Maybe we should term this one eXtensible Friends Network instead :-) XFN is a way of specifying your relationship with people (everything from “met” to “sweetheart”) using the rel attribute on a link to their homepage.
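As a rough illustration, XFN values go in the rel attribute as a space-separated list. The person and URL below are hypothetical placeholders; friend, met, and colleague are values defined by XFN:

```html
<!-- Hypothetical link to someone else's homepage: a friend and colleague I have met in person -->
<p><a rel="friend met colleague" href="http://example.com/">Example Person</a></p>
```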
There are two main values: rel="contact" for someone you know how to get in touch with, and rel="me" for a link to another page about yourself. The rel="me" value allows you to claim ownership of your various websites, including your accounts on social networks. For example, I could have a profile with a link to my Twitter account:
<p>Oli Studholme — <a href="http://twitter.com/boblet">follow me on Twitter (@boblet)</a></p>
A person can infer that @boblet is my Twitter username, but by adding rel="me" we can state this relationship in a machine-readable way.
<p>Oli Studholme — <a rel="me" href="http://twitter.com/boblet">follow me on Twitter (@boblet)</a></p>
In order to actually work, this would need to be on my personal homepage, with the same homepage added to my Twitter profile. By doing this, a social web app that understands XFN could confirm @boblet is me, check my friends on Twitter, check if those people are already registered, and then allow me to follow them all with one click — much easier.
For more on this idea, check out Identity consolidation and the XFN rel="me" value, the RelMeAuth proposal on the microformats wiki, and Google’s Social Graph API. It seems Facebook is supporting rel="me" as well.
Rel-license and XFN are simple rel-based microformats, but even with their simplicity you can see the potential power in this machine-readable stuff. Now let’s look at microformats for contact and event information.
Using hCard for contact information
Almost every website has an about page with some contact information:
<p>Oli Studholme — <a href="http://oli.jp/">http://oli.jp</a>, or <a href="http://twitter.com/boblet">follow me on Twitter (@boblet)</a>.</p>
Unfortunately, adding someone’s contact information to your phone or address book generally involves a lot of copying and pasting. If the data were machine-readable, however, we could use a tool to do that. Let’s add the hCard microformat to this code snippet.
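A sketch of what the snippet above might look like with hCard added, assuming the same content and the four class names discussed in the following paragraph (vcard, fn, url, and nickname):

```html
<!-- vcard on the wrapper signals hCard data; fn, url, and nickname mark up the individual fields -->
<p class="vcard"><span class="fn">Oli Studholme</span> &mdash; <a class="url" href="http://oli.jp/">http://oli.jp</a>, or <a class="url" href="http://twitter.com/boblet">follow me on Twitter (@<span class="nickname">boblet</span>)</a>.</p>
```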
So we’ve added a bunch of classes. There’s nothing special about that, of course, until you realise they’re all part of the hCard microformat. The first one is vcard on the containing <p> element, indicating that there’s hCard data here. Then we have fn, which stands for formatted name (in practice, the full name), url for an associated homepage, and nickname for, well, a nickname.
This is a simple example, and in fact a valid hCard (generally) only requires fn. But hCard has much more depth. We can mark up all kinds of contact-related data: addresses, company information, even a profile photo. For example, we can explicitly specify a given name and family name (this is required for Chinese, Korean, Japanese, and Vietnamese names), even middle names and titles.
<span class="vcard"><span class="fn n" lang="ja"><span class="family-name"> スタッドホルム</span>・<span class="given-name">オリ</span></span></span>
The n value is actually required, but it can be inferred (the so-called “implied n optimisation”) and therefore omitted for names that follow these patterns:
- given-name family-name
- family-name, given-name
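As a sketch of the first pattern, an hCard that relies on the implied n optimisation can be as short as the following. The markup reuses the name from the earlier example; how a particular tool handles it is an assumption and worth testing:

```html
<!-- No explicit n: a parser applying the implied n optimisation would be expected to infer
     given-name "Oli" and family-name "Studholme" from the two-word fn -->
<span class="vcard"><span class="fn">Oli Studholme</span></span>
```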
So what’s the benefit of this? Well, there are several tools that will convert this hCard-microformatted data into a vcard file we can download and automatically add to our address book. Nifty!
Organisations, addresses, and phone numbers
Smaller company websites often have the company name and contact details in the footer, so let’s briefly see how to do that.
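A sketch of such a footer hCard follows. The company name, address, and phone numbers are invented placeholders; the class names (fn, org, adr and its sub-properties, tel, type, and value) come from the hCard specification:

```html
<!-- Hypothetical organisation: every name, address, and number here is a placeholder -->
<p class="vcard">
  <!-- fn and org on the same element mark this hCard as an organisation -->
  <span class="fn org">Example Widgets Ltd.</span>,
  <span class="adr">
    <span class="street-address">1 Example Street</span>,
    <span class="locality">Exampleville</span>,
    <span class="region">Example State</span>
    <span class="postal-code">00000</span>,
    <span class="country-name">Exampleland</span>
  </span>.
  <!-- tel with no type defaults to voice -->
  Phone: <span class="tel">+00 0000 0000</span>,
  <!-- type plus the value class pattern for the fax number -->
  <span class="tel"><span class="type">Fax</span>: <span class="value">+00 0000 0001</span></span>
</p>
```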
By using fn together with org, we can create an hCard for an organisation. Each part of the address is specified (<span>licious!), and we’ve also included a phone number (tel’s default type is voice) and fax number (specifying type and using the value class pattern).
And with Operator, we can add this hCard to our computer’s address book in one click. If you’ve ever manually added an address to your address book, you’ll love this ;-)
Using hCalendar for event information
Let’s briefly look at marking up a simple event. Here’s one I’m involved with:
<h3><a href="http://atnd.org/events/5181">WDE-ex Vol.11 — Designing for iPad: Our experience so far</a></h3> <p>July 21st 19:00-20:00 at <a href="http://www.apple.com/jp/retail/ginza/map/">Apple Ginza</a>.</p>
Just by reading, we understand the event name, date, time, and location. But this information is difficult for computers to extract. By using the hCalendar microformat, we can make it machine-readable.
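A sketch of the event above with hCalendar classes added. It assumes the event takes place in 2010 (the year used in the datetime example later in this article), and it puts url on the event link and only location on the venue link, which is one reasonable reading of the description that follows:

```html
<!-- vevent wraps the event; summary and dtstart are the required properties -->
<div class="vevent">
  <h3><a class="url summary" href="http://atnd.org/events/5181">WDE-ex Vol.11 &mdash; Designing for iPad: Our experience so far</a></h3>
  <!-- Value class pattern: the date lives in the abbr title, the time in a plain span -->
  <p><span class="dtstart"><abbr class="value" title="2010-07-21">July 21st</abbr> <span class="value">19:00</span></span>-<span class="dtend"><span class="value">20:00</span></span> at <a class="location" href="http://www.apple.com/jp/retail/ginza/map/">Apple Ginza</a>.</p>
</div>
```

Because dtend contains only a time, a parser that follows the specification should take the date from dtstart.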
We’ve added vevent on a wrapping element to indicate this is an hCalendar event. Events are required to have a summary and a starting time, so we’ve added summary and indicated the starting date and time using dtstart. While historically datetimes were indicated using <abbr title="">, the required ISO 8601 format (for example “2010-07-10T19:01:29”) is very unfriendly to screen readers. Imagine having those numbers read to you! For this reason, the hCalendar specification recommends either breaking the datetime into separate date and time pieces, using the value class pattern, or possibly doing both. The specification is also smart enough to know that if the end datetime dtend is only a time, then it’s on the same day, so we don’t need to specify the date again, although tool support for this is spotty. Finally, we’ve added some URLs, one of which is for the event’s location.
With this data now being machine-readable, we can do things with it. For example, we can add the event to a calendar app with one click.
Using microformats in HTML5
So now we’ve had a whistle-stop introduction to microformats and gotten a glimpse of the benefits they provide. The question is: Can we use them in HTML5 pages? Let’s start with the good news.
As most microformats use only class and rel, basic parts of HTML that remain unchanged in HTML5, these microformats are completely fine in HTML5. Yay! However, there are a few wrinkles to keep us on our toes.
The profile attribute is obsolete in HTML5, as it was determined to be unnecessary. Currently, microformat profiles are very rarely used in the wild and are not required, so this probably won’t affect you.
Microformats specifications vs HTML5
While microformats using class and rel work fine, some of the new features in HTML5 aren’t supported yet. The Microformats in HTML5 page on the microformats wiki currently begins:
This page is to document future use of microformats in HTML 5. None of the items documented are supported now, and may change upon proper development within the microformats community, or changes in the HTML 5 specification.
This includes the nifty new <time datetime=""> element, which would be perfect for microformat times, such as dtend in the hCalendar microformat. It’s also true for adding microformats via microdata rather than class and rel. There has been talk of a general way to map any microformat into microdata, but this hasn’t happened yet.
Microformats-consuming tools (or, the problem with HTML Tidy)
The microformats wiki’s warning about <time> brings us to another caveat: tool support. Even if <time datetime=""> were valid in the microformats spec, at present most of the microformats tools can’t get this information from the datetime="" attribute. Now, you could wrap <time datetime=""> around a microformats datetime:
<time datetime="2010-07-21T19:00:00+09:00"><span class="dtstart"> <abbr class="value" title="2010-07-21">July 21st</abbr> <span class="value">19:00 </span></span></time>
However, as the <time datetime=""> data is then not explicitly part of the microformat, doing this isn’t really getting you anywhere, except bloated code-ville.
The problem is actually way worse, as many tools use the parser HTML Tidy (“last updated March 2009”!), which probably won’t be fixed for new HTML5 elements. This means any microformats classes on new HTML5 elements will be ignored. The potential alternative, html5lib.php, is still pretty young. You can get around this by wrapping HTML5 elements in <span>s and applying the microformats classes to them, but that’s hardly ideal.
However, there’s a ray of sunshine here. The tool you’ll most want to use is H2VX.com: it converts hCard and hCalendar microformats into vcard and ical files for users to download and add to their address books and calendars, respectively. While h2vx.com uses Tidy, Tantek Çelik and Brian Suda (the maintainers of H2VX) have a version of H2VX that does work with HTML5 elements and the datetime attribute. We’re saved! Check it out at dev.h2vx.com.
Microformats Tool support
|Can use classes on HTML5 elements||yes||yes||no|
|Supports HTML5’s <time datetime="">||no||no||no|
|Understands the value class pattern||yes||no||yes|
|Understands implied n optimisation||yes||no|
I didn’t expect anyone to support <time> until it became part of a microformat specification. Given how … non-micro expressing datetimes is in microformats currently, I am very thankful H2VX has in dev.h2vx.com!
So where does that leave us? Well, if you keep the above caveats in mind you can safely use microformats in HTML5. Google has no problem with microformats in HTML5, even on new elements, and I’m under the impression that Yahoo is the same. However, remember that if you want to offer a “click to download” link via H2VX, you’ll have to use dev.h2vx.com.
An important thing to note about microformats: as wonderful as they are, the current method of using the plain old semantic HTML tools of class, rel, and coding patterns is something of a brilliant hack. The use of class (an author playground) and the lack of use/demise of profile, combined with the complexity of coding to detect microformats in the wild, mean that native browser support is unlikely anytime soon. There was the expectation that at some stage there’d be a better way to easily add extra semantic information for others to use. In future articles, we’ll look at microdata and RDFa, and see what they have to offer over plain old semantic HTML.
If you have common types of data in your content that are covered by a microformat, and you want to make them machine-readable for others (and who wouldn’t), you should definitely consider using microformats in HTML5. Adding microformats is basically adding a lightweight API to your content, and is simple enough that even a designer can do it! This helps search engines and savvy users to more easily use your content. Check out the microformats wiki, grab the indispensable microformats cheat sheet by Brian Suda, peruse these books and DVDs on microformats, and start getting (even more) semantic!
Books and DVDs
- Microformats: Empowering Your Markup for Web 2.0 — John Allsopp
- Designing with Microformats for a Beautiful Web (DVD) — Andy Clarke
- Microformats Made Simple — Emily P. Lewis and Tantek Çelik
- Minor copyedits and the major addition of dev.h2vx.com support, with thanks and apologies for the delay to Brian Suda and Tantek Çelik.