HTML 5 + XML = XHTML 5

(Also available in Spanish and Portuguese.)

I like the XHTML syntax. It’s how I learned. I’m used to lowercase code, quoted attributes and trailing slashes on elements like br and img. They make me feel nice and comfy, like a cup of Ovaltine and The Evil Dead on the telly.

But you might not. You might want SHOUTY UPPERCASE tags, no trailing slashes and attribute minimisation. And, in HTML 5, you can choose.

Thanks to the “pave the cowpaths” principle, it’s up to you. As you like it. What you will. Whatever you want, whatever you like.

But let no-one tell you that HTML 5 kills XML—meet XHTML 5.

XHTML 5 is the XML serialisation of HTML 5 and, as you’d imagine, it has all the stricter parsing rules that you’d expect (and are used to if, like me, you grew up with XHTML DOCTYPEs). It must be served with an XML MIME type, such as application/xml or application/xhtml+xml (so no rendering in Internet Explorer for the moment) and will throw a wobbly at the slightest well-formedness violation. (See Serving XHTML with the Right MIME Type for more information.)
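To see what “throwing a wobbly” looks like in practice, here is a minimal sketch that feeds two snippets to an XML parser. It uses Python’s standard xml.etree.ElementTree module as a stand-in for a browser’s XML parser (that is an assumption for illustration; browsers use their own parsers), and the helper name is_well_formed is made up for this example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup, False if it rejects it."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# XHTML-style markup: the empty br element carries a trailing slash.
good = (
    f'<html xmlns="{XHTML_NS}"><head><title>Hi</title></head>'
    "<body><p>Line one<br />line two</p></body></html>"
)

# The same markup with an HTML-style <br>: fine as text/html, fatal as XML.
bad = good.replace("<br />", "<br>")

print(is_well_formed(good))  # True
print(is_well_formed(bad))   # False
```

The only difference between the two strings is the trailing slash on br, and that alone is enough for an XML parser to reject the whole document.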

The usual XML rules apply: document.write() is not allowed, no DOCTYPE is required, there are some syntax and scripting differences to trip up the unwary, and you can use namespaces.
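As a rough illustration of the namespaces point, the sketch below parses an XHTML document with an inline SVG circle and looks elements up by namespace. It again assumes Python’s xml.etree.ElementTree as a stand-in XML parser; the sample document and the prefixes h and svg are invented for the example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# No DOCTYPE here: as XML, the document does not need one.
doc = f"""<html xmlns="{XHTML_NS}">
  <head><title>Namespaces</title></head>
  <body>
    <p>A circle:</p>
    <svg xmlns="{SVG_NS}" width="20" height="20">
      <circle cx="10" cy="10" r="8" />
    </svg>
  </body>
</html>"""

root = ET.fromstring(doc)

# Map short prefixes (chosen for this example) to the namespace URIs.
namespaces = {"h": XHTML_NS, "svg": SVG_NS}

paragraph = root.find("h:body/h:p", namespaces)
circle = root.find("h:body/svg:svg/svg:circle", namespaces)

print(root.tag)         # {http://www.w3.org/1999/xhtml}html
print(paragraph.text)   # A circle:
print(circle.get("r"))  # 8
```

The p and circle elements live in different namespaces, so the parser can tell XHTML and SVG vocabulary apart even though they sit in one document.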

The main differences are summarised on the official WHATWG wiki page, Differences Between HTML and XHTML. It’s also possible to write polyglot documents that browsers can process as either HTML or XHTML, depending on the MIME type used.
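Here is a small sketch of the polyglot idea: one string of markup, handed to both an XML parser and an HTML parser, yielding the same sequence of elements. Both parsers are Python standard-library stand-ins chosen for illustration (html.parser is a lenient tokeniser, not a full HTML 5 parser), and the sample document and helper names are made up for this example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# One document written so that it is both well-formed XML and sensible HTML.
POLYGLOT = """<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head><meta charset="utf-8" /><title>Polyglot</title></head>
<body><p>Same bytes, two parsers.<br />Still fine.</p></body>
</html>"""


class TagCollector(HTMLParser):
    """Record the name of every start tag the HTML parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # Also called for <br /> style tags via the default handle_startendtag.
        self.tags.append(tag)


def tags_via_html(markup):
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def tags_via_xml(markup):
    root = ET.fromstring(markup)
    # Strip the "{namespace}" prefix ElementTree puts on each tag name.
    return [element.tag.split("}")[-1] for element in root.iter()]


print(tags_via_xml(POLYGLOT))
print(tags_via_html(POLYGLOT))
# Both print ['html', 'head', 'meta', 'title', 'body', 'p', 'br']
```

Lowercase names, quoted attributes and trailing slashes on empty elements are what keep both parsers happy with the same bytes.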

Magne emailed the Doctor to ask “Is it OK to use HTML5 tags in a page with the XHTML 1.1 doctype? Which one should I use, as in, which one is the recommendation now?”

If you want to use the new features, you need to use either the HTML 5 DOCTYPE or XHTML 5. Given that Internet Explorer cannot render pages served with an XML MIME type, for pragmatic reasons the Doctor advises HTML 5.
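If you do want to experiment with XHTML 5 without locking Internet Explorer out, one option (not something this article prescribes) is to write polyglot markup and choose the MIME type per request. The sketch below assumes the server can read the request’s Accept header; the function name pick_content_type is hypothetical and the header parsing is deliberately simplified.

```python
def pick_content_type(accept_header: str) -> str:
    """Pick a MIME type for a polyglot page from a request's Accept header.

    Simplified on purpose: it only checks whether application/xhtml+xml is
    listed with a non-zero q value, and falls back to text/html otherwise.
    """
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        params = dict(part.split("=", 1) for part in parts[1:] if "=" in part)
        try:
            quality = float(params.get("q", "1"))
        except ValueError:
            quality = 0.0  # treat an unparseable q value as "not acceptable"
        if media_type == "application/xhtml+xml" and quality > 0:
            return "application/xhtml+xml"
    return "text/html"


# A header that lists application/xhtml+xml explicitly.
print(pick_content_type("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"))
# A header that never mentions it, so the page goes out as text/html.
print(pick_content_type("image/jpeg, application/x-ms-application, */*"))
```

This only makes sense if the markup really is polyglot, because the same bytes have to survive both the HTML and the XML parser.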


This article was written by Bruce Lawson. Bruce evangelises Open Web Standards for Opera. He's a member of the Web Standards Project's Accessibility Task Force and the W3C Mobile Best Practices Working Group. Previously, he was technical lead for the Solicitors Regulation Authority site. He's been a tarot card reader in Istanbul, a volunteer pharmacist in Calcutta, a Bollywood movie extra in Mumbai and English tutor to a Princess' daughter in Thailand. Nowadays, he blogs at www.brucelawson.co.uk, is training for his blue belt in kickboxing and drinks too much Guinness.


Categorised under: Attributes, Browser Compatibility, Structure.


46 Responses to “ HTML 5 + XML = XHTML 5 ”

Comment by Kroc Camen at

application/xhtml+xml? Yes, you can serve as pure XML, but xhtml+xml may be marginally more relevant.

Secondly, you’ll need XHTML5 in order to support Firefox 2 / Camino, so I would recommend including a link to the document on this site that covers that.

Comment by Bruce Lawson at

Kroc – well, sending HTML5 (in its HTML5 form) as XHTML just for a Gecko parsing bug is not the same thing as “real” XHTML, so I’m loath to confuse those two issues.

Re application/xml vs application/xhtml+xml – you’re right. It was a brain fart on a hot day and I’ve corrected the article; thanks.

Comment by Ben Ward at

Nice summary for those who love the XML. One additional detail I’d point out is the reverse: the HTML version of the HTML5 syntax also makes XML-like self-closing syntax valid for elements that are expected to be self-closing, e.g. you may choose to write <link />. As such, XHTML syntax as we know it is now HTML, too.

For those who want to continue writing XHTML in the HTML5 world, this is a bridge: they can write XHTML and serve it as HTML just as they do now, but it will be valid.

At some point, if IE ever plays ball, those people can swap over to the correct mime type and start using namespaces and so on.

Comment by IE6 falls; XHTML2 cancelled. | Wisdump at

[...] The good thing about XHTML was that it enforced well-formed markup, with strict provisions for lowercase code, quoted attributes, and trailing slashes for empty elements. Thankfully HTML5 allows this coding convention too, and can be served as a serialized XML document dubbed XHTML5. [...]

Comment by Dari at

Wait a second. How do browsers actually know whether a document is XHTML 1.0 Strict or XHTML 5? According to the docs at http://www.whatwg.org/specs/web-apps/current-work, “XML documents may contain a DOCTYPE if desired, but this is not required to conform to this specification.” (plus the note below it). Because both kinds of document share the same namespace and the doctype could just be thrown away, all XHTML 1.x documents are probably XHTML 5 right now.

Comment by andy at

I thought that WHATWG criticised W3C for XHTML (XML usage in HTML).
Apart from the XML MIME type not being supported by IE, there are more disadvantages: the XML parser will stop when a well-formedness error occurs, and the browser must wait until the code for a page is fully loaded.

So, the only advantage of XHTML5 is cleaner, strict code?

Thanks

Comment by Torrance at

(Merde. Let me escape all my tags. Feel free to delete the prior post.)

Perhaps this should be in another article, but coming from writing XHTML 1.1, I’m having trouble understanding which tags no longer need to be closed, and which seem to be able to be omitted altogether.

I understand that self-closing tags no longer need the forward slash at the end, but I’ve seen someone suggest that <li> doesn’t need its respective closing tag, and nor even does <p>. Surely this is not true? And why is <body> no longer strictly necessary? What else has become optional?

I think for my own sanity, I’ll be sticking to XHTML syntax even in HTML 5. Makes me feel a lot cleaner knowing everything is neatly closed.

Comment by oli at

@Dari: XHTML5 and HTML5 are differentiated solely by MIME type. That means unless you specifically use the application/xhtml+xml MIME type it’s HTML5, and as mentioned in the article you’re probably not going to do that because IE would barf (what a surprise).

@andy: the main advantage of using XHTML (i.e. based on XML) would probably be the easy inclusion of other XML-based things like SVG, XForms and MathML. Another benefit of using XHTML1 has traditionally been stricter validation, which is a big help to authors. Hopefully this will migrate to HTML5 browsers in the form of a strict mode or something. As the article says though, if you have *any* IE users then XHTML5 is probably more time than it’s worth.

Comment by andy at

@oli: yep, but if you have to use text/html these days, you can’t put another XML application into the code. So the only way to do that is to embed it via the object element. Besides SVG, I don’t know of any other XML applications that are usable, because of their poor support (MathML, CML, …).

I use XHTML just because I like pure, well-formed code, but I can’t name any other objective reason.

And as I mentioned, I can’t put up with browsers’ unforgiving XML parsers.

Comment by Dari at

@oli – I was asking about the difference between XHTML 1 and XHTML 5, not HTML vs XHTML, because I don’t see any. Technically a document with an XHTML 1.0 doctype should be XHTML 1.0, but how does a UA know that?

Comment by Magne Andersson at

Is there currently a validator that will validate HTML5 as strictly as XHTML (any version) Strict? Clean code was the only advantage I could make use of with XHTML, and I still want to be able to enforce it.

Comment by Alohci at

The point about the character sequence <!DOCTYPE html> is that it is the shortest set of characters that will cause browsers to use standards mode when the page is served as text/html. It’s not so much a doctype as an incantation. A page is only XHTML5 if it is served with an XML MIME type (e.g. application/xhtml+xml), and in such cases browsers use standards mode anyway. So while you can include <!DOCTYPE html> if you wish, it is entirely optional.

Comment by e-sushi at

Jumping in a bit late… people have to understand that HTML5 is made to “accept older doctypes and handle them”. In other words, the HTML5 doctype allows you to use almost anything you want, as long as it conforms to previous standards. Translated, this means that you can take your XHTML 1.1 document, trim the doctype line down to the minimal one and push it up to the server. It will be read as HTML5.

About the “header” for the MIME type, I cannot see why you would serve the document with anything else but “text/html”, because in fact, you are serving just that! Using HTML5 you can write XML or non-XML syntax, as long as the “html, head, body” structure is kept. Geez, I know it is hard to accept that you don’t get a restricting doctype next, but hey… that’s what it’s all about: making HTML5 the doctype “to rule them all” by being open to old and even some new stuff. See, HTML5 is not about canvas and co; HTML5 is about wrapping up ALL previous doctypes and adding goodies like canvas etc. to it.

My 2 cents… get rich using them! ;)
