The definitions of the blockquote and cite elements in the HTML specification have recently been updated. This article explains what the changes mean for developers.
Russian translation: Сite и blockquote: перезагрузка
blockquote definition updated
The blockquote element represents content that is quoted from another source, optionally with a citation which must be within a footer or cite element, and optionally with in-line changes such as annotations and abbreviations.
Content inside a blockquote other than citations and in-line changes must be quoted from another source, whose address, if it has one, may be cited in the cite attribute. [emphasis mine]
– 4.4.4 the Blockquote element, Faulkner et al. 2016
Note the added emphasis, indicated via an inline note in square brackets: “[emphasis mine]”.
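As a rough illustration of the definition quoted above, the sketch below combines the cite attribute on blockquote with a visible citation inside a footer. The URL, the quoted sentence, the title and the author name are all placeholders invented for this example; none of them come from the specification.

```html
<!-- Hypothetical example: URL, quotation, title and author are placeholders -->
<blockquote cite="https://example.com/essays/on-markup">
  <p>Markup should describe what content is, not how it looks.</p>
  <footer>
    <cite><a href="https://example.com/essays/on-markup">On Markup</a>, A. N. Author</cite>
  </footer>
</blockquote>
```

The cite attribute carries the address of the source for any software that reads it, while the linked cite in the footer gives readers a reference they can actually see.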
What the changes to blockquote mean for developers
Previously in HTML5 it was not conforming to include citations within blockquote elements. Now it is, as long as the citation content is within a footer element. Citations inside blockquote elements are a common markup pattern (data indicates that approximately 60% of blockquote elements include citations). The change to the HTML spec acknowledges this and provides semantic mechanisms to differentiate quoted content from citations.
footer and cite used inside a blockquote:
<blockquote>
  <p>As my fellow HTML5 Doctor, Oli Studholme has showed, people seldom quote exactly – so sacrosanctity of the quoted text isn’t a useful ideal – and in print etc, citations almost always appear as part of the quotation – it’s highly conventional.</p>
  <footer>
    — <cite><a href="http://www.brucelawson.co.uk/2013/on-citing-quotations-again/">Bruce Lawson</a></cite>
  </footer>
</blockquote>
The above example is indicative of what authors are doing anyway. Rather than maintain a theoretical purity that serves no one, the definition has been modified to solve a real problem using existing HTML features instead of re-inventing the wheel.
An edge case
One of the arguments against allowing the use of footer and cite in blockquote to identify citation content is that the quoted content itself may contain these elements. The simple reason for dismissing this argument is that in the vast majority of quotations in blockquote elements this will not be the case. Denying the utility of footer for an edge case is another example of theoretical purity that serves minimal practical purpose.
If the case does arise for you, one method the HTML spec currently suggests is that you comment out the elements from the source (this is still an open issue and the advice may change).
(added 20/11/13) In response to feedback, the suggestion in the HTML spec is to use a class attribute (a defined extensibility mechanism) on the cite element to identify it as an inclusion from the quoted source:
<blockquote>
  <p>My favorite book is <cite class="from-quote">At Swim-Two-Birds</cite></p>
  <footer>- <cite>Mike[tm]Smith</cite></footer>
</blockquote>
cite definition updated
The cite element represents a reference to a creative work. It must include the title of the work or the name of the author (person, people or organization) or an URL reference, which may be in an abbreviated form as per the conventions used for the addition of citation metadata. [emphasis mine]
– 4.51 the Cite element, Berjon et al. 2013
What the changes to cite mean for developers
Previously in HTML5 it was not conforming to cite an author by name or include other reference information for a creative work in a cite element; the use of cite was reserved (theoretically) for identifying the title of a creative work. This was an attempt to rip up a cow path that authors had created over the last 14 years of the cite element’s existence.
Authors railed against the change in definition:
Join me in a campaign of civil disobedience against the unnecessarily restrictive, backwards-incompatible change to the cite element. Start using HTML5 but start using it sensibly. Let’s ensure that bad advice remains fictitious.
They also provided use cases and real world examples of citation usage. As a result of research, data analysis and discussion, authors can now, again, use the cite element to identify a wider range of references to a creative work: the title of the work or the name of the author (person, people or organization) or an URL reference. What we lose in theoretical purity we gain in utility:
part of the reason why <cite> can now be used with an URL (not to mention @bing‘s identical use. http://t.co/h3qp3ujfzX
— Steve Faulkner (@stevefaulkner) October 22, 2013
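The three kinds of reference named in the updated definition could be marked up along the following lines. This is only a minimal sketch: the example.com address is a placeholder, the surrounding sentences are invented, and the book title and author name are borrowed from the examples earlier in this article.

```html
<!-- Title of a work -->
<p>My favorite book is <cite>At Swim-Two-Birds</cite>.</p>

<!-- Name of an author -->
<p>This point is made by <cite>Bruce Lawson</cite> in his post on citing quotations.</p>

<!-- URL reference (placeholder address) -->
<p>Source: <cite><a href="https://example.com/article">https://example.com/article</a></cite></p>
```

In each case the cite element still marks a reference to a creative work; only the form of the reference differs.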
What do you think?
Please read the definitions in HTML 5.1; your feedback, as always, is welcome!
Big up! to Doctor Oli, whose research helped mold the changes made to the cite and blockquote elements. And to Doctor Bruce for insisting that it is his right to cite his mum.
36 Responses on the article “cite and blockquote – reloaded”
Thanks Steve. So just to clarify, you can use cite inside a blockquote without the footer element containing it right? Ala…
Here is an example from the HTML spec:
What is the recommended use of the cite attribute on the blockquote tag?
Hi Ted, I suggest that providing a standard link to the quoted content is better for everybody, as the cite attribute content is hidden from most. If you have software that actually makes use of the cite attribute content, you can include it as well.
It is not the first time that I read Steve Faulkner in this blog not acknowledging the fact that two slightly different HTML specifications are currently being maintained, W3’s and WHATWG’s, nor that on occasion some changes in one of them are explicitly not adopted by the other (say, hgroup being dropped).
I understand Steve is directly affiliated with the W3 and not the WHATWG, and that disagreements between both groups may carry some feelings of confrontation. But taking into account the didactic nature of this blog, and considering that there is no agreed upon reasoning of why someone not Steve should consider either group’s specification the true single reference, I think obviating altogether the fact that the other specification exists, and what is then their stance on the updates being adopted by the other, actually diminishes the value of this blog. Even if a personal bias is explicitly expressed, doing so gives your readers a better position to reflect upon the values of your writings. Obviating it makes it look as if that is motivated by a personal agenda.
Why is there no comment whatsoever on why developers.whatwg.org still explicitly states that the cite element cannot be used for names? Is it that the new meaning was also accepted by WHATWG but they haven’t updated it yet in the streamlined specification for web developers, or is it that this change was exclusively bred in the W3 and the WHATWG’s stance has not changed (explicitly declaring so or by omission), and thus there are now two different notions of what cite can be used for?
I talked about the relationship between the W3C HTML and WHATWG specs in this HTML5 doctor interview. These changes are not about W3C/WHATWG politics; they are about improving the definitions of the elements based on use cases, data and research. If you have some technical feedback, please provide it.
I don’t get the commenting out bit. Are you saying that the quoted text of a blockquote should never contain cite elements, or only when there’s a cite element that identifies the source of the quoted text? Either way, where’s the semantic value in such a rule? A reader cannot tell whether the cite refers to the quoted text or is part of the quoted text.
… I mean, in the latter case a reader can’t tell.
If cite is banned from appearing in quoted text, and web authors comply, then a reader can tell, but at the cost of not citing content that would in any other circumstance be worthy of being cited.
There is no normative requirement in this regard, so nothing is prohibited. The spec states:
“They also Provided use cases”
Like @Alohci, I’m also confuzled by the commented-out cite. User Agents should and do ignore all comments. They’re notes to the developer, and offer nothing to readers.
I understand nobody wants a Cite Inception, and just because something’s referenced within a blockquote doesn’t mean it needs a cite too, but the comment-thing seems… completely pointless from the pov of user agents, and therefore users.
Glad we can mention names/authors again. Almost all e-commerce and marketing pages have quoted “reviews” from users and partners, almost always marked up in blockquotes (it is, indeed, a block quote), with attribution to the user and/or company and sometimes date, NOT some non-existent title. I’m sure someone would disagree and claim a user review is a creative work… but then still without a title, so still useless to cite under the old rules.
This commenting out of cite elements is a real head-scratcher.
The issue of provenance of markup in a blockquote is a thorny one. For simple-minded me, I would consider in the vast majority of cases the quoted content to be the quoted text, not the markup, and that for purposes of working out provenance it should always be considered that the markup is from the current author, not the origin of the quote. Others disagree, so there has been discussion around this issue and there is an open bug. If you or anyone can provide improved advice/text, please do.
@bertilo see my response to @stommepoes, suggestions on list, on bug or here welcome.
It seems to me that the wording in the spec should be:
“It is suggested that if the ‘footer’ or ‘cite’ elements are included and these elements are also being used within a ‘blockquote’ to identify citations, ‘i’ elements should be substituted for the ‘cite’ elements from the quoted source, and ‘div’ elements substituted for the ‘footer’ elements from the quoted source. The original mark-up (using ‘cite’ and/or ‘footer’) can be retained in comments.”
The part about “div” for “footer” is my own conjecture.
Have captured your feedback on bug.
@alohci, @stommepoes, @bertilo please have a look at the update in the article (added 6/11/13) in regards to identifying a cite from a quoted source.
class="from-source" is much better, but still not a good solution in my opinion. Such extensions are informal (defined by authors according to the specs), and can’t reliably be used by e.g. Google to distinguish between various kinds of cite elements (who knows what class names will be used for this?).
I would strongly prefer that citation of the source always be inside a footer element. I have not seen any good arguments against that solution. If ever a footer is part of the actual quote, then perhaps class="from-source" can be used on such extremely rare footers, or (probably better) such footer elements must be replaced with class="source-footer" or something similar.
I leave it up to the powers that be to decide if footer should be outlawed completely, or just strongly discouraged.
I suggest you have a look at the Citation microformat efforts for citation standardization using the class attribute.
The question is: does Google or any search engine, or indeed any software, make use of cite element semantics? I haven’t seen any public information about this. What I would suggest, though, is that if it is the case, then identifying this sort of granularity (different types of cite element) would be best formalised in RDFa or microdata or microformats (via the class attribute).
Many academic papers use inline citations, have a look at the MLA guidelines for example.
Inline presentation can be handled by CSS.
How or why would you ask an author to move inline citations out of context when quoted, only to place them back in context visually? For example, this text from A systematic review and meta-analysis of the effectiveness of ICT on literacy learning in English contains a number of in-line references.
That’s much improved. Ideally, it would be better if the semantics for from-quote didn’t favour microformats over microdata or RDFa, or any one of those over the others, but it is workable as is.
cool :-) I think the method and format will develop according to the needs of authors and the software that consumes the info. Will make a note that it could also be achieved through the use of other metadata formats.
One possibility is to use inline quote tags, where they get their own nested cites whose scope is limited by their parent quotes. This leaves the unnested cite as belonging to the main blockquote.
…but this requires a lot of junk: q’s for other than specifically quoted speech, q’s inside blockquotes, q’s holding cites inside as children… I don’t see it happening. Instead people will just use microformats and its requisite gazillion spans. That’s probably the only sane solution outside creating whole new elements. Or web components.
Pretty much any other solution is better than HTML comments, since HTML comments means “you spent time typing but it’s not actually there”.
Just curious if there is any recommended way to handle blockquotes and a related cite, when the cite is outside (e.g., immediately before or immediately after) the blockquote. For example, “XYZ, provides:” then the blockquote or when the citation for the blockquote is at the beginning of the next paragraph. I see this in legal content all the time and while it seems strange to not tag cites when they exist, it also seems pretty useless to tag them since there is nothing establishing the relationship between the blockquote and its cite.
Any thoughts on what should be done with this type of material?
If I read the changes to the cite element correctly, it must include the author or title of the work. So, are using cite with shortened citation forms, such as “See Ibid 35”, “See also, id at Chap. 10”, invalid? What about shortened forms in which only a part of the author’s name appears, as in “See, Miller, supra, p. 2”?
In highly cited works in which many repeated cites to the same work appear, it is common to give the full cite to a work initially and use these shortened forms when subsequently citing specific parts of the work. These shortened forms will often be the cite closest to content in a q or blockquote element, so it seems odd they wouldn’t have a valid way to tag them.
The HTML spec says:
This is meant to convey that abbreviated forms such as your examples above are fine. If you think that it needs further clarification, please file a bug.
Would an endorsement-style quote — one that isn’t quoting from a published source — also fit with the markup patterns discussed in this article?
Specifically, would the byline portion still get wrapped in a cite element?
– Name (Byline)
And is there a reason why the Jeremy Keith quote in the article uses a blockquote with two paragraphs rather than cite, footer, etc.?
@ S. Faulkner
I guess my confusion stems from the fact that the spec requires a title, author, or URL reference (and says these may be abbreviated) but doesn’t seem to specifically address abbreviated citation forms, and especially not abbreviated citations which do not contain a title, author, or URL. (e.g., “See also, id at Chap. 10” where “id” refers to a work previously cited).
I expect that the intent was to cover these forms but I did think some further clarification was needed, so I did file a bug as you suggested. Thanks.
saw bug will process.
Is there any value in including the [footer] tags around the citation? The example provided merely includes the em dash, which seems more style than substance.
I’m marking up some testimonials at the moment and I’ve come across two possible structures:
What are your thoughts on the figure version?
@Nick, the non figure markup is simpler. Use of figure/figcaption in your example adds nothing but unnecessary verbosity/complexity.
The value in adding in the footer tags is so one can easily add in the citations. However, it is actually unnecessary for the overall design.
What do you think about enclosing cite inside a heading?
I have a photoblog and all of the posts (galleries) only include photos without any text and are titled as the work title itself followed by photographer’s name, like so:
GQ Editorial: “Down the River” photo by Bruce Weber
Is this ok?
GQ Editorial: “Down the River” photo by Bruce Weber
Just came across this: “Although previous versions of HTML implied that the cite element can be used to mark up the name of a person, that usage is no longer considered conforming.”
So I take it we can’t use it for author anymore as mentioned in the examples?
@Bartezz, the document you cited is stale and unmaintained. Check the links to the definitions of cite and blockquote in this article for explanations of how these elements can be used.