Quoting and citing with <blockquote>, <q>, <cite>, and the cite attribute


NOTE: (6/11/2013) The definitions of cite and blockquote in HTML have changed. For the latest advice on using these elements refer to cite and blockquote – reloaded

Given HTML’s roots in the academic world, it should be no surprise that quoting is well-accommodated in the elements <blockquote> and <q>, with their optional cite attribute. In addition, there’s the <cite> element, which over the last nine years went from ‘semantic orphan element made good’ to one of the more contentious elements in HTML5. Let’s power up the endoscope and examine the scarring, starting with <blockquote>.

Quoting with <blockquote>

We’ve become pretty familiar with <blockquote> here, as most of our articles feature excerpts from the HTML5 specification. Look, here’s one right now:
The blockquote element represents a section that is quoted from another source.
Easy peasy, right? Nothing has really changed. Remember that as <blockquote> is a ‘block-level element’ (flow content) we can put most anything in it, including headings, images and tables, in addition to the usual paragraphs of text. There are a couple of slight differences in HTML5 though. <blockquote> is a sectioning root, meaning that any <h1>–<h6> elements it contains don’t become part of the document’s outline. Also, adding a single paragraph of text with no enclosing <p> tags is now completely kosher. Here are some simple <blockquote> examples (apologies for the fake content):
<blockquote>This is a short block quote — look Ma, no paragraph tags!</blockquote>
This is a short block quote — look Ma, no paragraph tags!
<blockquote><p>This is a longer block quote.</p>
  <p>It uses paragraph elements.</p>
</blockquote>
This is a longer block quote. It uses paragraph elements.
<blockquote><h1>OMG a heading!</h1>
  <ul><li>Block quotes can contain more than just paragraphs…</li></ul>
</blockquote>

OMG a heading!

  • Block quotes can contain more than just paragraphs…
Historically, adding the source of a <blockquote> was a semantic conundrum. If you add it as content of the <blockquote>, then semantically it would become part of the quote, right? <blockquote> (and <q>) have a cite attribute for the URL of the quote’s source, to provide context. That’s hidden data, however, and despite the potential for exposing the cite attribute via CSS and/or JS, that’s not as useful as a visible link.
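As a rough illustration of exposing the cite attribute via CSS, the sketch below uses generated content to print the URL after the quote. The “Source: ” label and the styling are assumptions made for this example, not anything the spec prescribes. Generated content is generally not selectable or clickable, which is part of why a visible link remains the better option.

```html
<style>
  /* Print the cite URL after any blockquote that has a cite attribute */
  blockquote[cite]:after {
    display: block;
    content: "Source: " attr(cite);
    font-size: .8em;
    color: #666;
  }
</style>

<blockquote cite="http://www.imdb.com/character/ch0000672/quotes">
  <p>You know the golden rule, don’t you boy? Those who have the gold make the rules.</p>
</blockquote>
```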
  It seems our long-running convention at HTML5 Doctor of using <footer> for attribution inside a <blockquote> is actually non-conforming. However, the phrase in the spec that prevents it also prevents other common block quoting patterns, so the spec will probably change. Read my article “<blockquote> problems and solutions”, and submit feedback via the WHATWG email list, the comments here, or to me via Twitter (@boblet). Your feedback will influence how the spec changes! I’ll update this article after the change, but until then be aware that <footer> for attribution in a <blockquote> isn’t strictly valid, and may not be in the future either. The spec currently recommends including attribution in content surrounding the <blockquote>.

Hixie has given his feedback on my email, and it seems like our <footer> citations are still invalid. The official recommendation is to put the blockquote in a figure and add attribution in <figcaption>. Read the whole thread as there are some interesting comments. I’ll wait for the dust to settle a little yet…
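The <figure> pattern mentioned in that recommendation might look something like the sketch below. The wording of the caption is an assumption carried over from the examples in this article; the point is only that the attribution sits in <figcaption>, outside the <blockquote>, so it is not treated as quoted content.

```html
<figure>
  <blockquote>
    <p>You know the golden rule, don’t you boy? Those who have the gold make the rules.</p>
  </blockquote>
  <!-- Attribution lives outside the blockquote, in the figure's caption -->
  <figcaption>Crazy hunch-backed old guy from the movie <cite>Aladdin</cite></figcaption>
</figure>
```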

HTML5 comes to our rescue with the <footer> element, allowing us to add semantically separate information about the quote. For example:
<blockquote>
  <p>You know the golden rule, don’t you boy? Those who have the gold make the rules.</p>
  <footer>— Crazy hunch-backed old guy from the movie Aladdin</footer>
</blockquote>
You know the golden rule, don’t you boy? Those who have the gold make the rules.
— Crazy hunch-backed old guy from the movie Aladdin
Because of this semantically sound way to show the quote’s source, if you’re going to add a cite attribute on <blockquote>, only do so in addition to visible attribution.
<blockquote cite="http://www.imdb.com/character/ch0000672/quotes">
<p>You know the golden rule, don’t you boy? Those who have the gold make the rules.</p>
<footer>— <a href="http://www.imdb.com/character/ch0000672/quotes">Crazy hunch-backed old guy in Aladdin</a></footer>
</blockquote>
You know the golden rule, don’t you boy? Those who have the gold make the rules.
Even then, the cite attribute is probably only worth it if you can automate it (or you're just crazy OCD ;). We’ll return to <blockquote> in a bit, but let’s first probe into <q>.

Inline quotations with <q>

<q> is for quoting something inline within a section of prose:
The q element represents some phrasing content quoted from another source.
This means we can’t use <q> for sarcasm or other non-quotation uses of quote marks (“”). In those cases, add punctuation manually. The spec continues:
Quotation punctuation (such as quotation marks) that is quoting the contents of the element must not appear immediately before, after, or inside q elements; they will be inserted into the rendering by the user agent.
As with <blockquote>, you can also add a cite attribute with a URL for the quotation’s source (subject to the above caveats against hidden data). If you’re not using these extra features though, it’s a toss-up as to whether <q> is any better than just adding punctuation characters like “” as you type. Okay, let’s see some specimens:
Nested quotations:
<p>Luke continued, <q>And then she called him a <q>scruffy-looking nerf-herder</q>! I think I’ve got a chance!</q> The poor naive fool…</p>
Luke continued, And then she called him a scruffy-looking nerf-herder! I think I’ve got a chance! The poor naive fool…
Language-appropriate quotations (example updated, thanks Zev, Bertil, & Janus):
<ul>
<li lang="ja">彼女は<q>日本語に猫は『にゃん』と鳴く</q>と言った。</li>
<li><i lang="ja-latn">Kanojo wa <q>Nihongo ni neko wa ‘nyan’ to naku</q> to itta.</i></li>
<li>She said <q>In Japanese cats say <i lang="ja-latn">nyan</i></q>.</li>
</ul>
  • 彼女は日本語に猫は『にゃん』と鳴くと言った。
  • Kanojo wa Nihongo ni neko wa ‘nyan’ to naku to itta.
  • She said In Japanese cats say nyan.
A quotation using the cite attribute. Note that I’ve also included the cite attribute’s link in content so it’s accessible:
<p><a href="http://www.imdb.com/character/ch0000672/quotes">The Aladdin character Jafar</a> presents an eloquent treatise on the recent global economic meltdown when he states <q cite="http://www.imdb.com/character/ch0000672/quotes">You know the golden rule, don’t you boy? Those who have the gold make the rules.</q></p>
The Aladdin character Jafar presents an eloquent treatise on the recent global economic meltdown when he states You know the golden rule, don’t you boy? Those who have the gold make the rules.
Let’s examine how to style these elements next.

Styling <q> and <blockquote>

Historically, browser support has been patchy for controlling the punctuation used by <q>. Things have settled down now, so we can define nested, language-specific and even author-defined punctuation via CSS.
Browser support for styling quotes
Feature                      | IE        | Firefox   | Safari             | Chrome     | Opera
Default <q> punctuation¹     | 8.0 “” ‘’ | 1.5 “” ‘’ | 2.0 "", 4.0? "" '' | 1.0? "" '' | 4.0 “” ‘’
quotes with Unicode escapes  | 8.0       | 1.5       | 5.1²               | 11.0²      | 4.0³
quotes with glyphs           | 8.0       | 1.5       | 5.1²               | 11.0²      | 4.0³
  1. Default <q> punctuation requires support for generated content (the content property) on :before and :after
  2. WebKit support has been weak, with "" and '' hard-coded until Safari 5.1 and Chrome 11. Ref: WebKit bugs 6503 (fixed) and 3234 (new).
  3. Opera is buggy if you nest <q> to a greater depth than quote pairs in your quotes property (test case)
“Correct” punctuation is an intricate topic and varies depending on language, but it generally involves these characters:
Quote punctuation characters
Glyph   | Description                                | Unicode escape | Entity   | Mac            | Windows  | Linux
“       | Left double quotation mark                 | \201C          | &ldquo;  | Option-[       | Alt+0147 | AltGr+V
”       | Right double quotation mark                | \201D          | &rdquo;  | Option-Shift-[ | Alt+0148 | AltGr+B
‘       | Left single quotation mark                 | \2018          | &lsquo;  | Option-]       | Alt+0145 | AltGr+Shift+V
’       | Right single quotation mark                | \2019          | &rsquo;  | Option-Shift-] | Alt+0146 | AltGr+Shift+B
«       | Double left-pointing angle quotation mark  | \00AB          | &laquo;  | Option-\       | Alt+174  | AltGr+[
»       | Double right-pointing angle quotation mark | \00BB          | &raquo;  | Option-Shift-\ | Alt+175  | AltGr+]
‹       | Single left-pointing angle quotation mark  | \2039          | &lsaquo; | Option-Shift-3 | Alt+0139 | -
›       | Single right-pointing angle quotation mark | \203A          | &rsaquo; | Option-Shift-4 | Alt+0155 | -
„       | Double low-9 quotation mark                | \201E          | &bdquo;  | Option-Shift-W | Alt+0132 | -
‚       | Single low-9 quotation mark                | \201A          | &sbquo;  | -              | -        | -
(space) | Narrow no-break space                      | \202F          | &#8239;  | -              | -        | -

If you’re using the charset UTF-8 (and you should be), we recommend you use the actual characters if possible, rather than the Unicode escapes in CSS or the entities in HTML. You can enter most of these using the keyboard — e.g. “ is Opt-[ on Mac, Alt + 0147 on Windows, and AltGr + V on Linux. Avoid using ", ' or ` in place of “” and ‘’. The “narrow no-break space” is used inside French guillemets.

Most languages alternate between two kinds of punctuation as quotes are nested, such as “” and ‘’ in English. To specify nested quote pairs in CSS, we would write this:
q {quotes: '“' '”' '‘' '’';} /* opening followed by closing quote pairs */
/* The equivalent using Unicode escapes:
  q {quotes: '\201C' '\201D' '\2018' '\2019';} */
q:before {content: open-quote;}
q:after {content: close-quote;}
Unfortunately, browsers use the last quote pair in the quotes property for more deeply nested quotations. In addition, Opera will use the wrong quote characters if you have more nested <q> than your quotes property defines quoting levels for (Opera quotes bug test case). Make sure you have enough levels by repeating quote pairs as necessary:
/* four levels of nested quotes */
q {quotes: '“' '”' '‘' '’' '“' '”' '‘' '’';}
WebKit had "" and '' hard-coded in the browser stylesheet until Safari 5.1 and Chrome 11, which prevented q:before {content: open-quote;} and q:after {content: close-quote;} from working. The workaround is to define opening and closing punctuation manually, then override with open-quote and close-quote. While it’s a little more involved, that’s why we use this CSS on HTML5 Doctor:
/* for two levels of nested quotations */
q {quotes: '“' '”' '‘' '’';}
 /* extra content definitions for pre-2011 WebKit */
q:before {content: '“'; content: open-quote;}
q:after {content: '”'; content: close-quote;}
q q:before {content: '‘'; content: open-quote;}
q q:after {content: '’'; content: close-quote;}
  /* q in blockquote */
blockquote q:before {content: '‘'; content: open-quote;}
blockquote q:after {content: '’'; content: close-quote;}
blockquote q q:before {content: '“'; content: open-quote;}
blockquote q q:after {content: '”'; content: close-quote;}
  /* blockquote hanging opening quote */
blockquote:before {display: block; height: 0; content: "“"; margin-left: -.95em; font: italic 400%/1 Cochin,Georgia,"Times New Roman",serif; color: #999;}
A more traditional English <blockquote> style uses an opening quote character before each paragraph of the quotation and a closing quote character on the last paragraph. You can do that with this CSS, but you’ll need to use <p> for the <blockquote>’s content.
/* alternative blockquote style */
blockquote {quotes: '“' '”';}
blockquote p:before {content: '“'; content: open-quote;}
blockquote p:after {content: ''; content: no-close-quote;}
blockquote p:last-child:after {content: '”'; content: close-quote;}

When quoting a foreign language, we use the quotation marks of the surrounding language, so a Japanese quote in an English sentence still uses English quotation marks:

All he knows how to say in Japanese is “わかりません” (I don’t understand).

If you’re dealing with multilingual content, you can specify the quotes property per-language:
/* quotes for French, German (two kinds) and Japanese */
:lang(fr) > q {quotes: '« ' ' »' '“' '”';}
:lang(de) > q {quotes: '„' '“' '‚' '‘';}
:lang(de) > q {quotes: '»' '«' '›' '‹';} /* alternative style */
:lang(ja) > q {quotes: '「' '」' '『' '』';}
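The child combinator in these selectors is what makes the quote marks follow the surrounding language: :lang(en) > q matches a <q> whose parent is in English, even when the <q> itself carries a different lang. Here is a minimal sketch of that behaviour, assuming the paragraph is explicitly marked lang="en" (the markup is illustrative, not taken from this site’s stylesheet):

```html
<style>
  /* English quote marks for any q whose parent element is in English */
  :lang(en) > q {quotes: '“' '”' '‘' '’';}
  q:before {content: open-quote;}
  q:after {content: close-quote;}
</style>

<!-- The q is Japanese, but its parent is English, so it gets “” -->
<p lang="en">All he knows how to say in Japanese is <q lang="ja">わかりません</q> (I don’t understand).</p>
```

A <q> nested inside the Japanese one would have a Japanese parent, so it would pick up whatever a :lang(ja) > q rule defines instead.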
You can learn more about this CSS in the specification: CSS Generated Content Module Level 3. Okay, it’s time to put the rubber gloves on: <cite> is up next.

The rise and fall of <cite>

<cite>’s rise to stardom as the semantic super-element

In HTML 2 <cite> was used to indicate the title of a book or other citation. But in HTML 3.2 and HTML 4.01, <cite> was more loosely defined as
Cite: Contains a citation or a reference to other sources
We can define “citation” as:
  • a reference to authority or precedent,
  • a quotation that’s being cited, or
  • a mention of someone or something
And “a reference to other sources” is even less specific [cue Jaws music]. The HTML 4.01 spec’s examples were:
As <CITE>Harry S. Truman</CITE> said…
More information can be found in <CITE>[ISO-0000]</CITE>
Sadly, an example of an academic-style citation wasn’t included. Some standardistas enthusiastically adopted <cite> for its semantics, with the high point being Mark Pilgrim’s epic “Posts by citation” (the results of which are now sadly 404’ed). In those heady days, <cite> was used in three main ways:
  1. To refer to a person, generally in connection with some reference or as the source of a quote:
    <!-- Not valid usage of cite in HTML5! -->
    <p>As <cite>George R. R. Martin</cite> wrote in…</p>
  2. To contain the title of a work being referred to or quoted from (what <cite> is now used for in HTML5 — thanks Dylan!):
    <p><cite>A Game of Thrones</cite>, by George R. R. Martin</p>
  3. To contain a full academic-style citation (title plus author, and maybe bibliographic information):
    <!-- Not valid usage of cite in HTML5! -->
    <p><cite>A Game of Thrones, by George R. R. Martin</cite></p>
    • Historically this might also be marked up as:
      <!-- Not valid usage of cite in HTML5! -->
      <p><cite class="book-title">A Game of Thrones</cite>, by <cite class="author">George R. R. Martin</cite></p>
    • or as Wikipedia does (with <cite> set to font-style: normal;):
      <!-- Not valid usage of cite in HTML5! -->
      <p><cite>George R. R. Martin, <i>A Game of Thrones</i></cite></p>
“Wow, what an all-rounder!” I hear you say. “Is there anything <cite> can’t do?” The dirty secret of all this is the <cite> element has historically been semantics for the sake of semantics. So far, the only non-site-specific application of <cite> is browser default stylesheets, which format it with font-style: italic;. This is not a bad thing, as using <cite> consistently on your own site allows you to do all kinds of fun stuff (as Pilgrim demonstrated). But in the past, it’s been used to refer to three related but quite different types of data: titles, full citations, and names. This makes web-wide use, such as by a search engine, tricky.

<cite> snorts too much semantics, checks into rehab

So, in HTML5 this semantic over-achiever has ended up with a more … prosaic definition:
The cite element represents the title of a work (e.g. a book, a paper, an essay, a poem, a score, a song, a script, a film, a TV show, a game, a sculpture, a painting, a theatre production, a play, an opera, a musical, an exhibition, a legal case report, etc). This can be a work that is being quoted or referenced in detail (i.e. a citation), or it can just be a work that is mentioned in passing. A person's name is not the title of a work — even if people call that person a piece of work — and the element must therefore not be used to mark up people's names
This restriction has been somewhat … unpopular. Arguments for using <cite> for names (now summarised on the WHATWG wiki) were addressed by Ian Hickson, who decided that historical use wasn’t enough to justify the woolly definition. Jeremy Keith’s 24 Ways article “Incite A Riot” called for civil disobedience and HTML 4.01-style <cite>-ing, but the HTML5 spec has not changed. The in<cite>rs are irate that there are two use cases that <cite>’s new definition leaves semantically unfilled: marking up speakers in a transcript or dialog, and indicating the speaker or author of an inline quote (<q>). The HTML5 spec adds semantic insult to injury by saying:
In some cases, the <b> element might be appropriate for names; e.g. in a gossip article … In other cases, if an element is really needed, the <span> element can be used.
By better defining <cite>, we increase the odds of getting usable data from it, though we now need different methods to cover these other uses. For now, it seems that these use cases aren’t specific enough to warrant new elements. Note that <cite> was never a general-purpose element for marking up a person. The still-born HTML 3.0 did try to introduce the <person> element, but if you’ve ever used hCard to semantically mark up a person’s name, you’ll know that we’d need way more than just one element to do names justice. The POSH way of marking up a name is to use hCard (in microformats, microdata or RDFa), or just with a plain old link.
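For the two unfilled use cases mentioned above, the spec’s fallback of <span> (or <b>) could be sketched as follows. The class name speaker is hypothetical and carries no meaning beyond being a styling hook; this is one possible reading of the spec’s advice, not a pattern it spells out.

```html
<style>
  /* Purely presentational: make speaker names stand out */
  .speaker {font-weight: bold;}
</style>

<!-- A speaker in a transcript: a span, not cite -->
<p><span class="speaker">Jafar:</span> You know the golden rule, don’t you boy? Those who have the gold make the rules.</p>

<!-- The speaker of an inline quote -->
<p><span class="speaker">Jafar</span> states <q>You know the golden rule, don’t you boy?</q></p>
```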

Get ex<cite>d and quote stuff

History and encouraging angry comments aside, let’s suture up and see some examples of HTML5-style <cite> action:
Use <cite> for a movie title:
<p><cite>Aladdin</cite> is a great movie, even after 73 viewings. Aren’t kids great?</p>
Aladdin is a great movie, even after 73 viewings. Aren’t kids great?
Even better, use <cite> with a link:
<p><cite><a href="http://en.wikipedia.org/wiki/Aladdin_(1992_Disney_film)">Aladdin</a></cite> is a great movie, even after 73 viewings. Aren’t kids great?</p>
Aladdin is a great movie, even after 73 viewings. Aren’t kids great?
Use <cite> for the title of a book (but not the author’s name):
<p><cite>A Game of Thrones</cite>, by George R. R. Martin</p>
A Game of Thrones, by George R. R. Martin
If you wanted to semantically indicate the author, you could use microformats, microdata or RDFa:
<p><cite>A Game of Thrones</cite>, by <span class="vcard"><span class="fn n"><span class="given-name">George</span> <span class="additional-name">R. R.</span> <span class="family-name">Martin</span></span></span></p>
A Game of Thrones, by George R. R. Martin
In this example, the author and book title are only connected by proximity. You could connect them more explicitly using the hProduct microformat, RDFa’s GoodRelations, or to really bleed on the edge even Schema.org. Note that you can’t use the now-Google-approved rel="author" attribute here, as George R. R. Martin is being referred to and isn’t writing the article. If you just wanted to style the author’s name, you could use <b class="author"> (gossip column style) or <span class="author"> with whatever CSS you like.
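One way to make that connection explicit is HTML microdata with Schema.org’s Book and Person types. This is only a sketch: the choice of vocabulary and the decision to put the item on the <p> are assumptions for illustration, and RDFa or a microformat could express the same relationship.

```html
<!-- The paragraph is the Book; the nested span is its author (a Person) -->
<p itemscope itemtype="http://schema.org/Book">
  <cite itemprop="name">A Game of Thrones</cite>, by
  <span itemprop="author" itemscope itemtype="http://schema.org/Person">
    <span itemprop="name">George R. R. Martin</span>
  </span>
</p>
```

Here <cite> still wraps only the title, so the markup stays within the HTML5 definition while the author is tied to the work through the author property.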

All together now

Okay, let’s start mixing things up on the operating table and show some examples of <cite> with <blockquote> and <q>:
A movie <blockquote> with <cite>:
<blockquote>
<p>You know the golden rule, don’t you boy? Those who have the gold make the rules.</p>
<footer>— Crazy hunch-backed old guy in <cite><a href="http://en.wikipedia.org/wiki/Aladdin_(1992_Disney_film)">Aladdin</a></cite></footer>
</blockquote>
You know the golden rule, don’t you boy? Those who have the gold make the rules.
— Crazy hunch-backed old guy in Aladdin
Adding the cite attribute to a <blockquote> (and its <footer>):
<blockquote cite="http://www.imdb.com/character/ch0000672/quotes">
<p>You know the golden rule, don’t you boy? Those who have the gold make the rules.</p>
<footer>— <a href="http://www.imdb.com/character/ch0000672/quotes">Crazy hunch-backed old guy</a> in <cite><a href="http://en.wikipedia.org/wiki/Aladdin_(1992_Disney_film)">Aladdin</a></cite></footer>
</blockquote>
You know the golden rule, don’t you boy? Those who have the gold make the rules.
A quote from a specification:
<p>I wonder if feedback on <code>&lt;cite&gt;</code> prompted this:</p>
<blockquote><p>A person's name is not the title of a work — even if people call that person a piece of work</p>
<footer><cite><a href="http://developers.whatwg.org/text-level-semantics.html#the-cite-element">HTML5 for Web Developers</a></cite></footer>
</blockquote>
I wonder if feedback on <cite> prompted this:
A person's name is not the title of a work — even if people call that person a piece of work
An academic-style journal citation:
<ol>
<li>The information capacity of the human motor system in controlling the amplitude of movement, Paul M. Fitts (1954). <cite>Journal of Experimental Psychology</cite>, volume 47, number 6, June 1954, pp. 381–391</li>
</ol>
  1. The information capacity of the human motor system in controlling the amplitude of movement, Paul M. Fitts (1954). Journal of Experimental Psychology, volume 47, number 6, June 1954, pp. 381–391
An academic-style book citation:
<blockquote>
<p>Citations … all include the following: author (or editor, compiler, or translator standing in place of the author), title (and usually subtitle), and date of publication.</p>
<footer><cite><a href="http://www.chicagomanualofstyle.org/">The Chicago Manual of Style</a></cite>, 15th Edition (Chicago: University of Chicago Press, 2003), 596</footer>
</blockquote>
Citations … all include the following: author (or editor, compiler, or translator standing in place of the author), title (and usually subtitle), and date of publication.

Conclusion

If you’ve made it this far, congratulations! You’ve now learned more about citing and quoting in HTML5 than you wanted to know ;) But don’t keep the knowledge to yourself — let us know in the comments what you think. We’d also love to hear how you’re using <blockquote>, <q>, and <cite> in HTML5. If you share your code snippets, remember to escape them!

Updates

  1. It seems our long-running convention at HTML5 Doctor of using <footer> for attribution inside a <blockquote> is in keeping with the <footer> part of the spec, but not with the <blockquote> part. We’re investigating…
  2. Hixie confirmed that our use of <footer> is currently non-conforming: <footer> can currently only be included in <blockquote> if it’s quoted content. However, the phrase “content inside a blockquote must be quoted from another source” also forbids other common changes and additions to block quotes, so I’m going to see if it can be changed.
  3. I wrote an article, “<blockquote> problems and solutions”, about these problems, and initial feedback from Hixie is that there are legitimate issues, and he’ll review feedback via the WHATWG email list. So please send your feedback there! You can also leave feedback in the comments here or ping me on Twitter (@boblet). Also, don’t miss Jeremy Keith’s excellent commentary “Citation needed” on the issue (watch out for those <cite>s though ;) )
  4. I’ve added keyboard shortcuts for punctuation characters to the punctuation table, for default (typically US) keyboards.
  5. Added a note about Hixie’s reply to the article.

70 Responses on the article “Quoting and citing with <blockquote>, <q>, <cite>, and the cite attribute”

  • FWIW, validator.nu now shows a warning if you use the cite attribute:

    Warning: The cite attribute on the blockquote element is not supported by browsers yet.

    (See what I did there?)

  • Dylan Parry says:

    Great article. One thing though: the second example in the list of historical cite uses seems a bit odd. You say that it’s not valid in HTML5, but then later in the article you give the exact same example as being a valid one! Having read the example again and again, I’m pretty convinced it is a valid one…

  • zOMG, people actually made it to the comment form? I applaud you, sirs!

    @Mathias — oh yeah, I should have mentioned that. Thanks! Still, if you’re using it for your own nefarious purposes don’t let a validator warning stop you! ;)

    @Dylan — D’oh, I must have been in auto-pilot mode when I did that. Thanks to your eagle eyes it’s now updated (with a shoutout)

    Also, courtesy of a prompt from @iandevlin, here is the TL;DR version: <blockquote>, <q> and <cite> are cool yo. But <cite> is only for titles (not pplz). <blockquote>+<footer> ftw! ;)

  • Zev Goldberg says:

    Nice article! Correction: The Japanese language example has an extra q end tag.

  • Very good article! I had missed the part about footer inside of blockquote before.

    But I’m not sure about your ”nyan”-example:

    In Japanese cats say <q lang="ja-latn">nyan</q>

    Is that really a proper use of the q-element? What’s the source of the “quotation”? The Japanese language?

    I think we should use this:

    In Japanese cats say <span lang="ja-latn">nyan</span>

    Or maybe this:

    In Japanese cats say <i lang="ja-latn">nyan</i>

    And then we might want to add quotation marks as well.

  • I just noticed some clever use of scoped style-elements in the article. Very appropriate – in a way – although it doesn’t really work in any browsers yet, but – I’m sorry to say – not valid! At least that’s what the W3C validator says (among other things).

  • Kroc Camen says:

    Glad to see a solution to this. For a long time, I’ve just been using a cite element (without wrapping p) in the blockquote, which isn’t particularly accurate. I can update ReMarkable to use this method instead.

    For some personal reflection on practical use of the abbr, dfn and cite elements (which all quickly fall into the “semantics for the sake of semantics” problem you describe), see my article “Me, Myself and I — or: Abbreviations, Definitions & Citations Revisited” http://camendesign.com/abbr_redux

  • @Zev & Bertil — I must be slow today, have edited that example six times so far >_>

    @Bertil — I think <style scoped> is in WebKit nightlies now (if not close)

    @Bertil & Kroc — see the addition at the top of the article. I need to email the list about it as IRC feedback was a little divided. There’s a decent argument for this pattern, so now I have to make it :) Also, good article Kroc. I read it while writing this.

  • I had a quick discussion with Oli about using the Person Schema with blockquote to add some semantic meaning to the Author’s name.

    You can check out my original gist and Oli’s fork which includes a few improvements and comments.

  • Sadly I’m one of those who used <cite /> incorrectly for attributing the person to the quotation… time to go back and fix that one!

    Thanks for the clarification, now bookmarked for the reference and reminder when needed!

  • Oli Studholme:

    Also, good article Kroc. I read it while writing this.

    Indeed. I’d like to see that article by Kroc here at HTML5 Doctor. It would fit in very well. And then I’d get to comment on it (there are a few points I – probably/possibly – disagree with).

  • Kroc Camen says:

    The suggestions in my article came directly from writing and editing a few megs worth of raw text used on my website, which brought up lots of edge-cases and curious questions about semantics; so whilst I wouldn’t say my choices would suit everybody, they have at least been trialled in a background of text.

    My complaint about the ABBR article you published here on HTML5Doctor was essentially that you weren’t following your own advice, as I know that I practically went insane trying to use those rules on megs of text before I came up with my own to take back control of my sanity.

    But, I will definitely say that cite still remains the weaker out of the three and I appreciate this article for being far more square.

    If you would like, my article could be further adapted with feedback from the doctors to better suit a broader audience. I strongly believe that a key part of learning HTML5 is learning HTML4 *properly* and eschewing spans and divs for semantics where possible.

  • What I can’t seem to get my head around, is whether links should go inside or outside of the <cite>-elements. I.E., should you write

    <cite><a href="http://en.wikipedia.org/wiki/Aladdin_(1992_Disney_film)">Aladdin</a></cite>

    rather than

    <a href="http://en.wikipedia.org/wiki/Aladdin_(1992_Disney_film)"><cite>Aladdin</cite></a>

    I lean towards the second, as you cite the work, and you link to that.

  • karl says:

    The naughty question…

    because HTML5 “forbids” us (which I have never respected) to use cite for the author, how do we declare the author of something? I never understood why we didn’t create a “person” or “author” element at the same time cite was “clarified”.

  • @William,

    That depends on whether or not you are citing the link or linking the citation.

    If you are quoting the movie Aladdin and linking to the Wikipedia article,

    <a href="http://en.wikipedia.org/wiki/Aladdin_(1992_Disney_film)" rel="nofollow"><cite>Aladdin</cite></a>

    Here, the citation is being linked to the Wikipedia article.

    If you are quoting the Wikipedia article for the movie Aladdin:

    <cite><a href="http://en.wikipedia.org/wiki/Aladdin_(1992_Disney_film)" rel="nofollow">Aladdin</a></cite>

    Here, the link is being cited.

  • Dylan says:

    Great article Oli, thanks!

  • @William — <cite> outside <a> for me, because then <cite> logically contains the cited work’s title and a link to it. This’d give you better info if you e.g. make a script scraping for <cite>d works. However Charles’ take on it is also good.

    @Karl — re-read the last paragraph of <cite> snorts too much semantics, checks into rehab. Answered your q’s there.

    @all — I’m following up on using <footer> in <blockquote>. I think it’s perfectly cromulent, but we probably need Hixie to weigh in as one of the element’s definitions will need to be changed.

  • Aniket Pant says:

    Loved the article. It’s like “perfect”.

    I didn’t know all this. Now I will use this method to make my blockquotes and cites.

    Thanks for writing this :)

  • Massive article, i would say blockquote is greatly artistered is this article.

  • Wolf says:

    While I applaud the noble efforts to make the web more semantic I am sorry to break the bad news: only standardistas who blog will ever correctly use the blockquote, q and cite elements. The rest of the world has trouble with the correct usage of even the most basic of HTML elements.

    And even then the standardista community can’t agree what is the correct usage and what’s not. This is similar to the header and footer elements that are so generic people will use them for all kinds of purposes, thus obliterating any real-world usage (e.g. when writing screen-reading software)

  • karl says:

    @Wolf,

    you really didn’t break a big secret. What is interesting in this discussion is not if people will markup by hand their html for quotes. They will not. OK. This is set.

    Let’s move on.

    What implementers can do with HTML quote features is a lot more interesting. A few use cases.

    A. Authoring

    1. Select a text on a Web page, bookmarklet to publish on a blog. I did it. A prototype

    2. Select a piece of book in a ereader, and publish to your blog, with all the markup already done.

    3. Mailing list threading

    4. Online forum threading.

    B. Indexing/Parsing

    1. Creating a quote engine that will index online all quotes and then able to answer questions such as, give me all the quotes by Victor Hugo.

  • Wolf says:

    Karl, the problem is that if the markup pattern is not widely used, writing software that depends on it is useless. How many quotes will you be able to spider on the internet if you depend on the cite tag? Not many. All of your examples rely on correct usage.

    Of course this is a chicken and egg problem. If HTML would be taught properly in schools in the future etc. etc.

  • @Wolf — I agree with you that the problem is education, so I’m puzzled by your cynicism in your comments above. That’s what we’re trying to do on HTML5 Doctor, after all! :) Also, you missed two very large and important groups: software (e.g. DreamWeaver) and tool makers (e.g. editing toolbars in CMSs), and service providers (e.g. academic journal tools). I try to follow the advice a wise sage once gave me, just do the best you can. In some small way I hope people reading this article can benefit from it!

    @All — I’ve made some changes to the article regarding the ongoing <footer> in <blockquote> saga. It looks like the spec will change, so please chime in with your feedback and ideas!

  • lizlux says:

    Once again, you’ve made my day. Thanks for the great info!!

  • Cite is simply one of the most problematic tags in HTML5 at the moment. I get all the others, but this one always makes me think way too much to be used naturally. I’m currently working out a system to semantically mark up works cited in a few of my articles, but it seems so damn complicated with all those changes and “cite wars” … duh.

    Also, I noticed recently that Wikipedia is using normal ol/li for their references, as they often just link to sources without naming them.

  • Jon says:

    Oli,
    Just grabbed the styling you’ve placed above and had all sorts of issues with special characters. Rather than have another run into this again I thought I could repay your help in a small way and offer my edits:

    /* for two levels of nested quotations */
    q {quotes: '\201C' '\201D' '\2018' '\2019';}
    /* extra content definitions for pre-2011 WebKit */
    q:before {content: '\201C'; content: open-quote;}
    q:after {content: '\201D'; content: close-quote;}
    q q:before {content: '\2018'; content: open-quote;}
    q q:after {content: '\2019'; content: close-quote;}
    /* q in blockquote */
    blockquote q:before {content: '\2018'; content: open-quote;}
    blockquote q:after {content: '\2019'; content: close-quote;}
    blockquote q q:before {content: '\201C'; content: open-quote;}
    blockquote q q:after {content: '\201D'; content: close-quote;}
    /* blockquote hanging opening quote */
    blockquote:before {display: block; height: 0; content: "\201C"; margin-left: -.95em; font: italic 400%/1 Cochin,Georgia,"Times New Roman",serif; color: #999;}
    

    Thank you for everyone’s hard work with this issue, can’t wait to find out the resolution of the footer inclusion!

  • @lizlux — we live to serve :) happy to hear it helped

    @Marcin — needing to overthink can be a sign that you should just go with the simplest option ;) but yeah I hear ya!

    @Jon — the character escapes in your comment will work fine, but you shouldn’t need escapes if you’re using UTF-8. Check your page’s encoding and confirm the browser is getting UTF-8. Check your text editor is saving UTF-8. Finally I always use:

    @charset "utf-8";

    as the first line of my CSS files. If you have a 100% UTF-8 workflow you should only need character escapes for characters with special meaning in CSS, which Mathias Bynens covers in CSS character escape sequences.

  • Hixie has given his feedback on my email, and it seems like our <footer> citations are still invalid. The official recommendation is to put the blockquote in a figure and add attribution in <figcaption>. Read the whole thread as there are some interesting comments.

    I’ll wait for the dust to settle a little before updating this article (and possibly every spec quote on HTML5 Doctor :| )
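
    For reference, the figure-based pattern recommended in that thread might look roughly like the sketch below. The quotation is the spec excerpt from the top of this article; the wording of the attribution is an assumption, so adjust it to your own house style.

    ```html
    <!-- A quotation with its attribution kept outside the blockquote -->
    <figure>
      <blockquote>
        <p>The blockquote element represents a section that is quoted from another source.</p>
      </blockquote>
      <!-- The attribution text here is illustrative only -->
      <figcaption><cite>HTML5 specification</cite></figcaption>
    </figure>
    ```

    The attribution sits in figcaption, so the blockquote contains nothing but quoted content.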

  • Jon says:

    Oli,
    I’m a bit confused as I’m using the H5BP template framework and the CSS file included? I can see the meta charset is set to utf-8 there and in both the web.config and .htaccess? However, this comment is in the web.config, should this be changed?

    <!-- use utf-8 encoding for anything served text/plain or text/html -->
    <remove fileExtension=".css"/>
    <mimeMap fileExtension=".css" mimeType="text/css"/>


    Thanks nonetheless, all help is greatly appreciated!
    J

  • @Jon — The settings in your text editor and .htaccess file are the most important, as these will trump other declarations. Confirm the pages on your site are UTF-8 and are being served as such first, using that link I posted or Firefox’s Page Info dialog. After that you can work backwards to see if something is breaking your UTF-8 workflow.

  • Janus says:

    That academic-style journal citation is all kinds of trouble. How are you supposed to distinguish between:
    <cite> for the name of the article being cited (which should—in my reference style of choice—be recte and “in double quotes”);
    <cite> for book names (which should be italicised and not quoted);
    <cite> for journals (of which only the journal name itself should be italicised and quoted, with issue number, etc., being recte and not quoted); and
    <cite> for monograph series (which should be recte and not quoted)?

    What a jungle!

    In your example here, only the name of the journal is <cite>d. That means the article referred to is not marked up as a cited work at all; and also that identifying information such as volume number (which is necessary to identify the source) is put outside the citation.

    If HTML5 wants to cater to academic citations, it needs to get its working hat on and add something like a type attribute (<cite type="article/monograph/journal/series/etc.">) that we can style our CSS by. And it needs to come up with an <author> and/or <source> tag, too. :-]

    @Bertil:

    But I’m not sure about your ”nyan”-example:

    […]

    Is that really a proper use of the q-element? What’s the source of the “quotation”? The Japanese language?

    I’d say that’s valid and proper enough—the source is, of course, the cat! ;-)

    P.S.:

    @Oli:

    彼女は猫はにゃんと鳴くと言った。
    Kanojo wa Nihongo ni neko wa nyan to naku to itta.

    See anything missing there? ;-)

    (Hint: 彼女は日本語に猫はにゃんと鳴くと言った。)

  • Janus says:

    (Ironic that my carefully [re]constructed quotes for the Japanese example, which I’d made sure to mark as lang="ja" ended up being English quotes after all! Is lang stripped/not allowed from comments?)

  • @Janus — thank you for taking the time to format your code :)

    How are you supposed to distinguish between…

    The class attribute for styling, microdata or RDFa for semantics.

    That means the article referred to is not marked up as a cited work at all; and also that identifying information such as volume number (which is necessary to identify the source) is put outside the citation.

    What would you expect to happen if this information was marked up? What is the use case, and what benefits would you get?

    If HTML5 wants to cater to academic citations

    I suspect academic citations of the type you’d like would be outside HTML5’s scope, but they can be addressed with microdata or RDFa.

    like a type attribute that we can style our CSS by

    The class attribute suffices for this.
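
    As a rough sketch of what that could look like, here is one way to style two kinds of cited work differently. The class names article and journal are hypothetical, and the titles are placeholders; nothing here is prescribed by the spec.

    ```html
    <style>
      /* Journal and book titles: italic, no quotation marks */
      cite { font-style: italic; }
      /* Article titles: upright and wrapped in double quotes */
      cite.article { font-style: normal; }
      cite.article:before { content: "\201C"; }
      cite.article:after { content: "\201D"; }
    </style>
    <p><cite class="article">Title of the article</cite>, <cite class="journal">Name of the Journal</cite>, volume 1.</p>
    ```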

    it needs to come up with an <author> and/or <source> tag, too

    re: <author>, Hixie has mentioned a potential “credit” element for attribution, but is waiting to see how <figure> is used. For authorship you could use <b> or <span>, with semantics added via e.g. the hCard microformat or microdata.

    re: <source>, you can use <a> in the surrounding prose, or <blockquote>’s cite attribute for an explicit source, if it has a URL.

    Regarding that accursed cat example, I think I’ve corrected the mistake you spotted and fixed your comment (let me know if not), and will add lang to the allowed attributes list. Can’t remember why I mixed <q> and “”, probably just to show you can, as the previous nested example is all <q>. damn uuu neko-chan! :)

  • Janus says:

    The class attribute for styling, microdata or RDFa for semantics.
    […]
    What would you expect to happen if this information was marked up? What is the use case, and what benefits would you get?

    Well, what’s the purpose of marking something up with cite to begin with? Styling is one thing, of course, but semantically, in the given example, what’s being cited is the article, not the book. The book is what you need if you want to read the article.

    I guess the use case would mostly be for something like extracting HTML5 markup to XML or something similar, where only the citations themselves would be kept. I admit, it’s a bit far-fetched, and probably outside the scope of HTML5. But for scholarly publishing online, it would be great if it were there.

    If ever I do need to venture in that direction, it looks like I’ll have some reading to do, catching up (read: learning the basics) on RDFa, hCard microformats, microcards, etc.—all technologies that I am woefully ignorant of.

    (As for poor 猫ちゃん, the mistake I spotted was simply that the word[s] “日本語に” was/were missing in the kanji version of the examples, but present in the rōmaji and English versions.)

  • @Janus —

    what’s the purpose of marking something up with cite to begin with?

    Indeed :) There are things we could do with formatted academic citations, but currently I’m unaware of anyone actually doing anything, which makes <cite> merely a semantic and medium-independent way of indicating the work being cited. Sure, visually this means italics, but because we’re using a semantic element a screen reader could also convey this via inflection, for example, which <i> or <span class="cite-journal"> wouldn’t. Once people start actually doing stuff (= real world use cases), there’ll be some cowpaths for WHATWG to consider paving.

    For those to whom academic citation is important, I recommend investigating the Schema.org vocabularies for Article and Book (microdata), or the Bibliographic Ontology specification (RDFa). There’s also the citation microformat working page. We’ve got articles introducing microdata and microformats here too.
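
    To give a flavour of the microdata route, a minimal sketch using the Schema.org Article vocabulary could look like this. It reuses the Fitts paper from the article’s academic citation example; the choice of properties (author, datePublished, name) is just one plausible selection, not a recommended profile.

    ```html
    <!-- One citation marked up as a Schema.org Article via microdata -->
    <p itemscope itemtype="http://schema.org/Article">
      <span itemprop="author">Paul M. Fitts</span>
      (<time itemprop="datePublished" datetime="1954">1954</time>).
      <cite itemprop="name">The information capacity of the human motor system in controlling the amplitude of movement</cite>.
    </p>
    ```

    A consumer that understands microdata can then pick out the author, date, and title without relying on how the citation is punctuated or styled.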

    Thanks again re: that cat.

  • I don’t get it. Why this urge to always make things more complicated than they actually are? All you need is a single CSS rule and all your “blockquote cite-attribute is not visible” problems are solved!

    blockquote:after { content:attr(cite); }

    Yep, life can be this simple…
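
    A slightly fuller sketch of the same idea, for anyone who wants to try it out. The URL and quotation are placeholders, and the “Source: ” prefix is an addition to the rule above. Bear in mind that generated content is only rendered text, so the URL will not be a clickable link.

    ```html
    <style>
      /* Show the cite attribute's value on its own line after the quotation */
      blockquote[cite]:after {
        display: block;
        content: "Source: " attr(cite);
        font-size: smaller;
      }
    </style>
    <blockquote cite="http://example.com/source">
      <p>Placeholder quotation text.</p>
    </blockquote>
    ```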

  • Alex says:

    How should I cite a person as the source of a quote? Is this valid?

    Palawan
    “Think of secret lagoons, unexplored coves, sparkling turquoise waters, fine white sand, spectacular limestone karst, fresh seafood and lovely people - these things are just a fraction of what you can experience in El Nido. Coron is a one of the best diving destination in the world. They have the cleanest lake and some amazing landscape and seascape. Puerto Princesa City on the other hand offers one of the New 7 Wonders of the World. The Underground River tour is just out of this world.”

    James Betia

    What would be the best practice for semantic markup in this case?

  • Matěj Cepl says:

    Shouldn’t the name of the article be <cite>d as well, like with:

    <ol>
    <li><cite>The information capacity of the human motor system in controlling the amplitude of movement</cite>, Paul M. Fitts (1954). <cite>Journal of Experimental Psychology</cite>, volume 47, number 6, June 1954, pp. 381–391</li>
    </ol>

  • Johan Dam says:

    Wasn’t there discussion about a <main> tag? If that gets added, it’s a great place to store the actual quotation in a quoteblock,

    You know the golden rule, don’t you boy? Those who have the gold make the rules.

    - Crazy hunch-backed old guy from the movie
    Aladdin

  • fjpoblam says:

    Somewhat off topic, or maybe not. I’ve seen it argued that the citation should be placed *outside* the <blockquote> inasmuch as the citation isn’t strictly part of the material being quoted. Fussy semantics.

  • S.S. says:

    This is probably a dumb question, and forgive me if it’s been asked in the comments already (I figure the difficulty of ignoring me is a lot less than the difficulty of reading the entire comment thread), but why not suggest that the HTML5 spec define <cite> to be use case 3 rather than use case 2, with the subparts of the name of the work and the name of the author each optional but another pair of tags for them? E.g. >cite<>hypothetical-author-tag<John Doe>/hypothetical-author-tag<, >hypothetical-title-tag<Reconciling Purity and Practicality>/hypothetical-title-tag<>/cite< or >q<…you can fool some of the people all of the time…>/q<>cite<gt;hypothetical-author-tag<Abe Lincoln>/hypothetical-author-tag<>/cite< or >cite<>hypothetical-title-tag<An Example of the Original Purpose of the Cite Tag>/hypothetical-title-tag<>/cite< Would something like this not satisfy both the purist specifiers who object to cite being used for three different things and the practical authors who object to effectively being told that citations other than the name of a work are semantically irrelevant? Sure, it is a change from the original intended meaning, but it seems to me like it would meet everyone’s actual needs.

  • S.S. says:

    Oops, not used to writing < and > as &lt; and &gt;, managed to get most of them backwards… let’s try again…

    <cite><hypothetical-author-tag>John Doe</hypothetical-author-tag>, <hypothetical-title-tag>Reconciling Purity and Practicality</hypothetical-title-tag></cite> or <q>…you can fool some of the people all of the time…</q><cite><hypothetical-author-tag>Abe Lincoln</hypothetical-author-tag></cite> or <cite><hypothetical-title-tag>An Example of the Original Purpose of the Cite Tag</hypothetical-title-tag></cite>

  • Balaji says:

    Kudos to this article author. Such a great insight on different kinds of usage of blockquote element. This is a nice read.

    For those who need to understand basic blockquotes, here is my article Using quotes and blockquotes.

  • Harry Alffa says:

    On “extra” (or extraneous!) info within blockquote.
    Use an aside tag to stick in references etc?

  • Harry Alffa says:

    Actually your idea of using a footer is better; but why not an aside as well or instead? Too much clutter/confusion?

    Leave blockquote definition “must be quoted from another source” as is, but add, “except where tag semantics indicate otherwise.”

    The clutter/confusion will inevitably be expressed somewhere.
    The current spec just means: not in blockquote! ’Cause it’s saying that if you’re going to use blockquote so it looks nice and gives meta-information, you need to enclose it in a figure tag and use figcaption for the stuff about the blockquote.
    Basically the spec is inventing a compound tag with this suggestion, which will lead to clutter/confusion, and questions like; why can’t I put all this stuff in blockquote?

  • Ian Y. says:

    When a quotation is inside an article, it is reasonable to use figure as the outer wrapper and figcaption as the attribution wrapper.

    But how about a list of quotations which are not inside an article? They don’t serve the purpose of illustrating/annotating the context (because there is no context), so figure and figcaption can’t be the wrappers. How do we mark up a list of quotations in such a case?

  • I agree with Matěj Cepl about the academic citation. “The information capacity of the human motor system in controlling the amplitude of movement” should be in <cite>, since that is the work being cited. You wouldn’t, in general, cite an entire journal. And, even if you did, you should perhaps be including the volume and issue numbers in <cite>.

  • I don’t know if you noticed or not, but as of this writing, the ‘blockquote + figcaption inside figure’ style is given as an example on the HTML5 blockquote spec page. Guess this is the best option we’ve got for now.

  • Something terrible has happened! Or possibly something great. I’m not sure.

    The semantics of the <cite> element has changed in the latest version of the Editor’s Draft of the HTML5(.1) specification. Now it says

    The cite element represents a reference to a creative work. It must include the title of the work or the name of the author(person, people or organization) or an URL reference, which may be in an abbreviated form as per the conventions used for the addition of citation metadata.

    and the following example is given:

    <p>In the words of <cite>Charles Bukowski</cite> -
    <q>An intellectual says a simple thing in a hard way. An artist says a hard thing in a simple way.</q></p>

    Hence, now it is again allowed to use <cite> to mark up names, at least when referring to an author of a ‘creative work’.

    Why has this been changed? Will the change survive? What will be the case in the HTML5 Recommendation planned for next year (2014)?

  • Thank you for your response, Steve.

    I think I prefer the old meaning of the <cite> element (the one only permitting titles). The reason is that I am really fond of the exactness, cleanness, simplicity, and ease-of-use that version brought to the HTML5 specification.

    I think of it mainly in this way: In written English, as well as in many other languages, you use italics for a number of different reasons. In hypertext documents, you can embed information about the precise reason why a phrase is in italics:

    * <em> for stress emphasis, possibly the most common case. Often alters (or clarifies) the meaning of the text.
    * <cite> for titles (books, articles, films, …).
    * <dfn> for the defining instance of a term.
    * <i> (with class attribute) for phrases in languages different from the language of the surrounding text.
    * <i> (with class attribute) for Latin names of species.
    * a few other cases.

    I think this is wonderful.

    In this case, it is extremely easy to use <cite> right — it’s used for titles, nothing else. It also makes perfect sense to have a specific element for this purpose, since you usually want to display titles in a different way visually (typically in italics), so you certainly need some markup here. And why not make it semantic, and make the wonderful list above come true?

    As noted in the article (to which this is a comment), HTML 2.0 gives exactly this meaning to the <cite> element. Unfortunately, the text in other HTML specifications and proposals is more diffuse: In HTML+, an academic citation is given as the only example. In HTML 3.0, the text talks about ‘citations’ and ‘italics’, and the only example is a book title. In HTML 3.2, ‘citations or references’ is used to describe the element. The same applies to HTML 4.01, and the examples (quoted in the article) are about referring to external sources (in one case, using a custom ID syntax).

    Personally, I feel that the use of <cite> in “my” way (the HTML 2.0 and the original HTML5 way) is semantically different from the one about giving ‘references’ to external sources. (It also happens to be two different things typographically: titles are usually in italics, while ‘references’ might be given in various formats.) So I don’t really like the idea to allow both of these uses: it would make the element, essentially, meaningless from a semantic point of view. (And even if we ignore the part about semantics, it would not be painless to use the element in actual web pages when it comes to the default formatting, since book titles should be in italics while names of people shouldn’t, generally.)

    Marking up titles is important, generally. For instance, the text

    “Violence is good.”

    doesn’t mean the same thing as

    “<cite>Violence</cite> (=a book or an article, perhaps?) is good.”

    It’s a feature of written human language, essentially, that titles should be marked up in some way, in the same vein as stress emphasis and defining instances. Giving references in an academic paper, or referring to a person or URL, doesn’t feel like the same thing (and generally requires different formatting). At least not exactly.

    Additionally, the current HTML5 text apparently considers it to be a ‘creative work’ if a person mutters something to himself.

    I guess my main points are the following:

    The restrictive version of the definition

    * is much easier to apply right;
    * has a very precise semantics; and
    * is very useful in practice, since you do need some markup to affect the formatting.

    The new version of the definition

    * is used for many different cases, and it is not as easy to tell if a borderline application is appropriate or not;
    * can mean a number of similar, but not identical, things; and
    * can be useful in practice, but you might need different classes to make only some of the instances be rendered in italics.

    Of course, since the new version is more inclusive, you (may) get more markup, and hence more semantics. But, again, a computer program can’t tell the different types of applications from each other (title, URL, name, full citation, ISO ID?). Well, they are all references, in some sense, so I certainly agree you do gain something, but we will “never” be able to mark up everything anyway. Do we really need an element for marking up every possible ‘citation’ or ‘reference’ in a broad sense? Well, certainly not for class-less formatting, at least.

    Sorry my post became so long…

  • Hi Andreas,

    I think I prefer the old meaning of the element (the one only permitting titles). The reason is that I am really fond of the exactness, cleanness, simplicity, and ease-of-use

    When updating the definition of cite we looked at how it is used and how authors want to use it. I understand the allure of a definition that restricts cite to titles of works, but it has not been, and is not being, used for this purpose in the majority of instances. The theoretical purity of the restriction does not translate into usage in the real world, so it is of little use to potential consumers of the semantics.

    you usually want to display titles in a different way visually (typically in italics), so you certainly need some markup here.

    The visual presentation should not affect the semantics of the element. If you look at how search engines such as Google (used for URLs in search results, not italicized) or Bing (used for URLs in search results, not italicized) use cite, they override the default visual presentation. Does this mean it’s no longer a citation?

    * is used for many different cases, and it is not as easy to tell if a borderline application is appropriate or not;
    * can mean a number of similar, but not identical, things; and
    * can be useful in practice, but you might need different classes to make only some of the instances be rendered in italics.

    It is used for one case:

    The cite element represents a reference to a creative work
    http://www.w3.org/html/wg/drafts/html/master/text-level-semantics.html#the-cite-element

    The information that reference consists of is broader, but its broadness means that it actually reflects usage in the real world.
    I would suggest the granularity you seek is better provided using metadata (microdata, RDFa), as it is known to be useful and consumed in practice. Also, the use of classes to provide such granularity is fine and encouraged if it’s useful for the author:

    Authors can use the class attribute to extend elements, effectively creating their own elements, while using the most applicable existing “real” HTML element, so that browsers and other tools that don’t know of the extension can still support it somewhat well. This is the tack used by microformats, for example.
    HTML 5.1 – 2.2.3 Extensibility

  • Thank you again for your time, Steve.

    I do understand your arguments and they certainly make sense. I suppose, to some extent, this is a choice one has to make: either you make the standard theoretically simple, beautiful, and logical, or you make the best you can without ‘breaking’ millions of existing hypertext documents.

    If I understand this correctly (yes, for some reason I do find the new text harder to understand), the new version is strictly a ‘superset’ of the old one, in the sense that every valid use of <cite> according to the last version is also valid according to the new version? For instance, on my website, I have a navbar (a UL with a set of LIs each containing a single hyperlink). One of the links is <cite>Ändlös längtan</cite>; this link will take you to the homepage of my (Swedish-language) book Ändlös längtan. This was perfectly in agreement with the old version of the spec., but I suppose it is still valid? Although the navbar doesn’t contain a quote from the book, this LI certainly is “a reference to a creative work [that, in addition,] include[s] the title of the work”.

    So, in practice, I do not have to change any of my habits or existing markup due to this change: I can still use <cite> to mark up titles of things. I don’t have to use it to mark up names next to quotes, etc. (although I certainly could start doing that now).

    I just recalled an earlier version of the HTML5 text from 2008. Here’s yet another version of the meaning of <cite>:

    The cite element represents a citation: the source, or reference, for a quote or statement made in the document.

    […]

    This is […] wrong, because the title and the name are not references or citations:

    <p>My favourite book is <cite>The Reality Dysfunction</cite>
    by <cite>Peter F. Hamilton</cite>.</p>

    According to that text, my habit of marking up titles would be wrong. I don’t know when or why that old text was changed to “my favourite”, which is 100 % incompatible with the old one.

    In terms of valid types of applications, it seems like the new version (maybe we should call it “the third one”?) includes both the old one (the one from 2008: “the source, or reference, for a quote or statement made in the document”) and my favourite (titles of stuff), as well as other kinds of ‘references’ to ‘creative works’ (including sounds someone mutters to himself). Is it correct to say that the new version is the most inclusive one ever in an HTML specification?

  • Hi Andreas,

    (yes, for some reason I do find the new text harder to understand), the new version is strictly a ‘superset’ of the old one, in the sense that every valid use of according to the last version is also valid according to the new version?

    Correct. If the text is unclear, please do file a bug on the HTML spec; it’s an editor’s draft and is there for people to comment on and help improve.

    So, in practice, I do not have to change any of my habits or existing markup due to this change: I can still use to mark up titles of things. I don’t have to use it to mark up names next to quotes, etc. (although I certainly could start doing that now).

    Correct.

    There is an example in the spec:

    <p>Who is your favorite doctor (in <cite>Doctor Who</cite>)?</p>

    Is it correct to say that the new version is the most inclusive one ever in an HTML specification?

    Well, I would like to think its inclusiveness is better defined and explained than previous versions, which were somewhat murky.

  • Heydon Pickering says:

    Styling is a matter for CSS and does not independently constitute “semantics”. However, it is advisable to style HTML elements according to their semantic differentiation within the document. Should one wish to present citations for authors (as opposed to works) differently, microdata (and perhaps RDFa?) has itemprop=author. This is equivalent to the rel=author link relation but not restricted to hyperlinks.

    The CSS attribute selector is as follows:

    [itemprop="author"] {}

    Or, for citations that include links to author pages:
    cite [rel="author"] {}

    As Steve demonstrates, uses of the cite element are many and varied. This is in accordance with its broad remit in the English Language, which includes works and authors.

    <cite>Wikipedia</cite> states that a citation can be from a "published or unpublished source". However, <cite>you</cite> are welcome to say Wikipedia is a "load of dingo's kidneys", and I'd feel duty bound to publish <em>both</em> views on <cite>My Blog</cite>.

  • @Heydon Pickering: I’d use However, <em>you</em> are welcome...

    It’s stress emphasis. I don’t know what ‘creative work’ “you” is referring to.

  • Bart says:

    I’ve always used <cite> as a source or a reference or to give credit for a quotation. Never simply to mark up a title unless it was a source. So to me this, “<cite>Aladdin</cite> is a great movie, even after 73 viewings. Aren’t kids great?,” makes no sense. If <cite> was strictly for titles, why not use <title> instead? Isn’t that more semantic? Also, would have been nice if there were some useful attributes on <cite> like <cite source="author" href="" …>, where source could be title, book, movie, play, url, article, post, etc.

    Sorry about the re-post.

  • Bart says:

    For the <blockquote>, why not redefine <blockquote> to be the container and <q> the actual quote, followed by a <cite>? For example,

    <blockquote>
    <q>
    <p>This is a longer block quote.</p>
    <p>It uses paragraph elements.</p>
    </q>
    <cite>HTML5 Doctor</cite>
    </blockquote>

  • Rich says:

    @Bart

    That’s surely unnecessary markup?

  • Alohci says:

    @Rich – Why is it “surely” unnecessary? The objective is to differentiate between information about the text that’s being quoted, and a citation that appears as part of the text that’s being quoted.

    Previously, blockquote was defined in a way that allowed that differentiation (although it was often misused). HTML 5.1 changes the blockquote definition in such a way that it is no longer even possible.

  • @Alohci

    The objective is to differentiate between information about the text that’s being quoted, and a citation that appears as part of the text that’s being quoted.

    do you have any data as to the frequency of this occurring?

    I suggest that the change in the HTML spec provides a much greater benefit as it covers a very frequent (57% of the time) pattern of citation information being included in a blockquote.

  • Alohci says:

    @Steve – No I don’t. It’s just a principle. My guess is that it’s pretty rare.

    I guess I don’t really understand what the problem is that the change is trying to solve. If it’s just for the sake of paving a cow path then I doubt its utility, but I’m not particularly opposed to it.

    I suspect (again, no data) that the blockquote indentation rule tends to encourage authors to put the cite inside the blockquote, so I wonder whether an HTML5 endorsed way of permitting that might be sufficient to effect a change in authoring practice. Using the footer element inside the blockquote to contain metadata about the quote, including its citation seems both natural and pretty harmless since there should be no need to use footer in the quote itself. Accordingly, my preference would be that the blockquote change was limited to permitting that.

  • Lelala says:

    Wow, didn’t even know there is a *that* simple option to put beautiful quotes in the content, yes! :-)
    Regards

  • Ginlan says:

    Hey there, I was wondering if this is correct use for a list of blockquotes (as on a homepage testimonials slideshow) :

    <ul>
    <li>
    <blockquote>
    <p>The text</p>
    <footer>The name</footer>
    </blockquote>
    </li>
    <li>
    <blockquote>
    <p>The text</p>
    <footer>The name</footer>
    </blockquote>
    </li>
    <!-- … and so on -->
    </ul>

    >> Note to moderators, please erase previous comment :)

  • Ginlan says:

    Thanks Dr. Steve

  • Gugone says:

    Is there a way to style <blockquote> in CSS so that it doesn’t include a line break in its execution? I don’t want extra lines between main text and blockquote.

  • Patanjali says:

    Regarding the use of a reference inside a blockquote element, I think contextual information should be outside, preferably as part of the introduction, such as in ‘Regarding the blockquote element, the W3C HTML5 specification states:’ before the blockquote for your first example.

    I think this context-after approach is part of the overall inside-out approach to information presentation that has been inherited from print, when it consisted of large tracts of text, interspersed with pictures and tables placed in convenient places for layout, but not necessarily near the text to which it relates, and often not even on the same page.

    To me, the more sensible way of presenting figures, tables and lists is to introduce them, giving their context and, optionally, what the reader should look for in them. This is because if the figure or table is a part of the narrative, then it should be presented right at the appropriate context point.

    It would also keep the reader focussed on why the figure or table is being presented. A picture may be worth a thousand words, but unless you guide a reader in what to look for, they may well pick a thousand words that don’t fit with your purpose for the element.

    Also, people don’t want to waste their time, so they make very quick decisions about whether what is being presented needs to be read/viewed or not, so introducing its context helps them to make that decision.

    Some examples of introductions are:
    – ‘A young red pine, with visible roots due to soil erosion:’
    – ‘To identify yourself, bring any two of:’
    – ‘The damage to the building, as viewed from the south-east corner, is:’
    – ‘The sales for last five years, with year-on-year downturns shown in red, are:’.

    I know we have been trained to look first, then read the caption, but I think introducing context BEFORE a mass of visual information serves readers better. The reader will then be more aware of the context that text after the element is focused upon.

    The other consequence of the more granular ‘chunkification’ of information in web content these days, is that there is no reason why a figure or table number is required, when there is a heading closely above it. How many web sites provide a table of figures or tables to use that number?
