The i, b, em, & strong elements

by Oli Studholme.

While many HTML4 elements have been brought into HTML5 essentially unchanged, several historically presentational ones have been given semantic meanings.

Let’s look at <i> and <b> and compare them to the semantic stalwarts <em> and <strong>. In summary:

  • <i> — was italic, now for text in an “alternate voice”, such as transliterated foreign words, technical terms, and typographically italicized text (W3C:Markup, WHATWG)
  • <b> — was bold, now for “stylistically offset” text, such as keywords and typographically emboldened text (W3C:Markup, WHATWG)
  • <em> — was emphasis, now for stress emphasis, i.e., something you’d pronounce differently (W3C:Markup, WHATWG)
  • <strong> — was for stronger emphasis, now for strong importance, basically the same thing (stronger emphasis or importance is now indicated by nesting) (W3C:Markup, WHATWG)

Giving presentational elements new semantic meanings

<i> and <b> were HTML4 font style elements and are still used presentationally where appropriate to follow typographic conventions. They now have semantic meaning, however, and their style can be changed via CSS, meaning they’re not only presentational — <b>, for example, doesn’t have to be bold. Because of this, it’s recommended to use classes to indicate meaning to make it easy to change the style later.

The <i> element

The i element represents a span of text in an alternate voice or mood, or otherwise offset from the normal prose in a manner indicating a different quality of text, such as a taxonomic designation, a technical term, an idiomatic phrase from another language, a thought, or a ship name in Western texts.

Other things that are typically italicised include transliterated foreign words (using the attribute lang=""), inline stage directions in a script, some musical notation, and when representing hand-written text inline:

  1. Deckard: Move! Get out of the way!
  2. Deckard fires. Kills Zhora in dramatic slow motion scene.
  3. Deckard: The report would be routine retirement of a replicant which didn’t make me feel any better about shooting a woman in the back. There it was again. Feeling, in myself. For her, for Rachael.
  4. Deckard: Deckard. B-263-54.
Using <i class="voiceover"> to indicate a voiceover (alternate mood)
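The markup behind this script excerpt is not shown above, so here is a minimal sketch of how it could be written. The `voiceover` class comes from the caption and the `character` class from the `<b>` section of this article; the `<p>` wrappers and the `direction` class are assumptions made for illustration.

```html
<!-- Sketch only: the <p> wrappers and the "direction" class are hypothetical -->
<p><b class="character">Deckard</b>: Move! Get out of the way!</p>

<!-- A stand-alone stage direction is a block, so it is styled with CSS via a class -->
<p class="direction">Deckard fires. Kills Zhora in dramatic slow motion scene.</p>

<!-- The voiceover is in an alternate mood, so it is wrapped in <i> -->
<p><b class="character">Deckard</b>: <i class="voiceover">The report would be routine retirement of a replicant which didn’t make me feel any better about shooting a woman in the back. There it was again. Feeling, in myself. For her, for Rachael.</i></p>

<p><b class="character">Deckard</b>: Deckard. B-263-54.</p>
```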

We ate unagi, aburi-zake, and tako sushi last night, but the toro sushi was all fished out.

Using <i lang="ja-latn"> to indicate a transliterated word from a foreign language (with lang="ja-latn" indicating transliterated Japanese). To check character sets for lang="" values you can use the IANA Language Subtag Registry (ouch), or the excellent Language Subtag Lookup tool by Richard Ishida, W3C.
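As a rough sketch, the sentence above could be marked up like this. Which words were wrapped in the original example is an assumption; here each transliterated Japanese word gets its own `<i>`, while "sushi" is treated as an ordinary English word.

```html
<!-- Sketch only: the choice of wrapped words is an assumption -->
<p>We ate <i lang="ja-latn">unagi</i>, <i lang="ja-latn">aburi-zake</i>,
and <i lang="ja-latn">tako</i> sushi last night, but the
<i lang="ja-latn">toro</i> sushi was all fished out.</p>
```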

Nanotyrannus (“dwarf tyrant”) is a genus of tyrannosaurid dinosaur, and is possibly a juvenile specimen of Tyrannosaurus. It is based on CMN 7541, a skull collected in 1942 and described in 1946 by Charles W. Gilmore, who gave it the new species Gorgosaurus lancensis.

Using <i class="taxonomy"> for taxonomic names
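A possible way to mark up the taxonomic names in that passage is sketched below. The `taxonomy` class is the one named in the caption; the shortened second sentence is only there to keep the example brief.

```html
<!-- Sketch only: each genus or species name is wrapped in <i class="taxonomy"> -->
<p><i class="taxonomy">Nanotyrannus</i> (“dwarf tyrant”) is a genus of
tyrannosaurid dinosaur, and is possibly a juvenile specimen of
<i class="taxonomy">Tyrannosaurus</i>. Charles W. Gilmore gave it the new
species <i class="taxonomy">Gorgosaurus lancensis</i>.</p>
```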

Only use <i> when nothing more suitable is available — e.g., <em> for text with stress emphasis, <strong> for text with semantic importance, <cite> for titles in a citation or bibliography, <dfn> for the defining instance of a word, and <var> for mathematical variables. Use CSS instead for italicizing blocks of text, such as asides, verse, and (as used here for W3C specification quotes) block quotations. Remember to use the class attribute to identify why the element is being used, making it easy to restyle a particular use. You can target lang in CSS using the attribute selector (e.g., [lang="ja-latn"]). Full sentences of foreign prose should generally be set in quotes in their own paragraph (or blockquote), and should not use <i> (add the lang attribute to the containing element).
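The following sketch illustrates two of the points above: targeting `lang` with an attribute selector, and putting `lang` on the container for a full sentence of foreign prose. The `verse` class and the French sentence are assumptions chosen for illustration.

```html
<style>
  /* Attribute selector: matches elements carrying lang="ja-latn" */
  [lang="ja-latn"] { font-style: italic; }

  /* Blocks of text are italicized with CSS rather than <i>
     ("verse" is a hypothetical class name) */
  blockquote.verse { font-style: italic; }
</style>

<p>The <i lang="ja-latn">toro</i> sushi was all fished out.</p>

<!-- A full sentence of foreign prose: lang goes on the container, no <i> -->
<blockquote lang="fr">
  <p>« Je pense, donc je suis. »</p>
</blockquote>
```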

The <b> element

The b element represents a span of text to which attention is being drawn for utilitarian purposes without conveying any extra importance and with no implication of an alternate voice or mood, such as key words in a document abstract, product names in a review, actionable words in interactive text-driven software, or an article lede.

For <b> text that should merely look different, there is no requirement to use font-weight: bold; — other styling could include a round-cornered background, larger font size, different color, or formatting such as small caps. For instance, in the script example above, <b class="character"> is used to indicate who’s speaking or narrating.
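As a hedged sketch of non-bold `<b>` styling: the `character` class is the one mentioned above, while the `keyword` class, the sample sentence, and the specific values are assumptions.

```html
<style>
  /* <b> does not have to be bold: reset the weight, then style by class */
  b.character {
    font-weight: normal;
    font-variant: small-caps;
  }

  /* "keyword" is a hypothetical class showing a round-cornered background */
  b.keyword {
    font-weight: normal;
    padding: 0 0.3em;
    border-radius: 0.3em;
    background: #eee;
  }
</style>

<p><b class="character">Deckard</b>: Move! Get out of the way!</p>
<p>Search results often mark each matching <b class="keyword">keyword</b> this way.</p>
```

Because the meaning lives in the class name, either rule can later be changed without touching the markup.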

Text that is bold by typographic convention (and not because it’s more important) could include names in a Hollywood gossip column or the initial text on a complex or traditionally designed page:

Opening versal (drop cap), plus styling of first phrase using b
Connecting the versal (drop cap) with the text using <b class="opening-phrase">. The pseudo-element selector :first-letter is used to create the versal. In this case, the opening phrase is bold only for stylistic reasons, but if it was semantically important, <strong> or some other element would be more appropriate. Note that :first-letter only applies to block-level elements, so the versal “I” is not inside <b>.
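The versal example could be built roughly as follows. The `opening-phrase` class is from the caption; the `opening` class on the paragraph and the exact drop-cap values are assumptions. Following the caption, the "I" sits outside `<b>`.

```html
<style>
  /* :first-letter is applied to the block-level <p>, creating the versal */
  p.opening:first-letter {
    float: left;
    font-size: 3em;
    line-height: 1;
    margin-right: 0.1em;
  }

  /* Bold here is purely typographic, not a sign of importance */
  b.opening-phrase {
    font-weight: bold;
  }
</style>

<p class="opening">I<b class="opening-phrase">t should come as no surprise</b> that …</p>
```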
Using :first-line instead of the b element
While we can use <b> to apply a traditional typographic style like small-caps to the first word, phrase or sentence, the CSS pseudo-element selector :first-line is more appropriate in this case. For the first paragraph of HTML5Doctor.com articles we use the nifty :first-of-type CSS3 pseudo-class selector.
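A minimal sketch of the CSS-only alternative, assuming each article's paragraphs are direct children of an `<article>` element (the actual HTML5Doctor.com stylesheet is not shown here and may differ):

```html
<style>
  /* First line of the first <p> among its siblings inside an <article> */
  article p:first-of-type:first-line {
    font-variant: small-caps;
  }
</style>

<article>
  <p>While many HTML4 elements have been brought into HTML5 essentially
  unchanged, several historically presentational ones have been given
  semantic meanings.</p>
  <p>Later paragraphs keep the normal style.</p>
</article>
```

No extra element is needed in the markup, and the small-caps run adjusts itself when the line wraps at a different width.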

Only use <b> when there are no other more suitable elements — e.g., <strong> for text with semantic importance, <em> for emphasized text (text with “stress emphasis”), <h1>–<h6> for titles, and <mark> for highlighted or marked text. Use classes on list items for a tag cloud. To recreate traditional typographic effects, use CSS pseudo-element selectors like :first-line and :first-letter where appropriate. Again, remember to use the class attribute to identify why the element is being used, making it easy to restyle a particular use.

…and for comparison, the <em> and <strong> elements

While <em> and <strong> have remained pretty much the same, there has been a slight realignment in their meanings. In HTML4 they meant ‘emphasis’ and ‘strong emphasis’. Now their meanings have been differentiated into <em> representing stress emphasis (i.e., something you’d pronounce differently), and <strong> representing importance.

The <em> element

The em element represents stress emphasis of its contents.

The ‘stress’ being referred to is linguistic. If spoken, this stress would be emphasised pronunciation on a word that can change the nuance of a sentence. For example, “Call a <em>doctor</em> now!” emphasises the importance of calling a doctor, perhaps in reply to someone asking “Should I get a nurse?” In contrast, “Call a doctor <em>now</em>!” emphasises the importance of calling immediately.

Use <strong> instead to indicate importance and <i> when you want italics without implying emphasis. The level of nesting represents the level of emphasis.
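The difference that the position of `<em>` makes, and how nesting raises the level of emphasis, can be sketched like this (the third line is an illustrative assumption, not taken from the spec):

```html
<!-- Stress on "doctor": a doctor, not a nurse -->
<p>Call a <em>doctor</em> now!</p>

<!-- Stress on "now": call immediately -->
<p>Call a doctor <em>now</em>!</p>

<!-- Nested <em>: the whole sentence is stressed, and "now" even more so -->
<p><em>Call a doctor <em>now</em>!</em></p>
```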

The <strong> element

The strong element represents strong importance for its contents.

Not much more to say really — it’s the <strong> we all know so well. Indicate relative importance by nesting <strong> elements, and use <em> for text with stress emphasis, or <b> for text that is “stylistically offset” or bold without being more important.
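A short sketch of relative importance through nesting, next to a `<b>` that is offset without being important. The sentences and the `product` class are made up for illustration.

```html
<!-- The whole warning is important; the nested part is more important still -->
<p><strong>Warning: do not open the casing.
<strong>Unplug the device first.</strong></strong></p>

<!-- A product name is stylistically offset, not important, so <b> fits -->
<p>We tested the <b class="product">Tyrell 3000</b> for a week.</p>
```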

In summation…

A final thing to note: these elements (and almost all HTML5 elements) have also been made explicitly media-independent, meaning their semantics are not tied to how they look in a visual browser.

So there you have it — two stray dogs of presentational HTML4 have been transformed into meaningful HTML5 elements, ready to be adopted into your coding once again. Can you resist their semantically shiny puppy-dog eyes? Let us know!

Changes

  1. I’ve updated mentions of using <i> for foreign words to be specifically transliterated foreign words (what I was meaning), based on feedback in the comments. I’ve also updated spec quotes.

82 Responses on the article “The i, b, em, & strong elements”

  • mwiik says:

    Argh, this seems a bad idea. We deal with lots of (usu. MS Word) content with bold and italicized text, and neither the provider nor me is likely to spend time figuring out if b and i tags conform to these semantics. For those page elements where we have control, we happily use strong and em (with their html 4.01 semantics) where such styling is appropriate, but we stick with b and i for provided content since the implied semantics may differ or there may not be any implied semantics at all.

  • Hi mwiik,
    Thanks for your comment. Luckily in your case using <b> and <i> as-is (with no classes) means you’re just using them for typographic effect — no semantics at all. This is the same as HTML4. I think of HTML as a best-effort game, and it sounds like you’re using the appropriate elements when you can.
    peace – oli

  • jacobian says:

    Very interesting info about HTML5. I’ll learn and apply it then. :-)

  • Vladislav says:

    What a relief! I was puzzled how to write “R-isomer” or in vito in html5. Now it is clear. Thank you.

  • Daniel Baird says:

    Nice article; welcome to my RSS feed :)

    In the paragraph “Use <strong> instead to indicate importance and <i> when you want italics…” do you perhaps mean to say “use em to indicate emphasis and i when …”?

  • Jimmy says:

    I think this change is going to need a hell of a lot of publicity if it’s going to be even remotely successful, nor do I think it’s a great idea in the first place. The new meanings of <i> and <b> are not intuitive at all. There’s nothing about the letter I that implies “alternative voice” and nothing about the letter B that implies “stylistically offset” to me. Add on years and years of history using those same tags to mean something entirely different and it’s going to be difficult to get people to remember this change, let alone use it as described. The benefit of semantic elements to me is that it’s pretty clear from looking at code what content is what. A lot of that value comes from the intuitive naming of elements. Hence, the new meanings of <i> and <b> cause confusion more than they help mark up content in a clear and useful way.

  • Andrew Vit says:

    While it seems like this is just meant to clear up and refine what the true meanings of these tags should really be, I worry that it’s way over-specified now, even though most people probably found it confusing already.

    When’s the last time you cared about whether or why something is “strong” or “emphasized”? Bold and italic are usually pretty clear in what they mean in their context, and when you need to specify further, there’s always the class attribute. I’m starting to wonder why the separate semantic elements? Are they really justified when we use classes on them anyway? Why not just <b class="important"> when such specificity is needed?

    I think this is getting too academic for 90% of HTML authors, when class seems perfectly usable as an enhanced specifier. <b> or <i> provide the fallback user-agent rendering, and as for semantics, are importance or emphasis actually semantic things?

    There are other elements that could use more spec love, I wonder why something this basic needs to be so baroque!

  • Andrew Vit says:

    A few more gripes:

    <dfn> and <i class="taxonomy"> also seem redundant. Or aren’t they? What’s the difference?

    Why is :first-letter only applicable to block elements? Another arbitrary restriction?

    these elements (and almost all HTML5 elements) have also been made explicitly media-independent, meaning their semantics are not tied to how they look in a visual browser.

    Can you elaborate on what that implies? I’m reading this to mean that user agents are free to render <strong>, <b>, <i> and <em> as they choose, so we should start specifying these elements in our stylesheets?

  • Alohci says:

    @Oli

    Luckily in your case using <b> and <i> as-is (with no classes) means you’re just using them for typographic effect — no semantics at all

    I’m sorry but I can see nothing in the spec to justify that assertion. It seems clear to me that valid use of <i> does not permit typographic effect on the whim of the author, only where the typical typographic effect is italics. It’s true that the spec encourages further differentiation by use of class name, but class names, in the absence of microformats, are private semantics, not public ones, and therefore are of limited use. In particular, class names are likely to be written in the language of the author.

    In general, it’s hard to work out to what use the new semantics can be put, since HTML5 documents are officially indistinguishable from HTML4 ones (i.e. HTML5 does not specify any versioning mechanism). So any processor must assume that for any given document, either HTML4 or HTML5 semantics may apply, and that therefore for each element, the meaning can only be resolved to the union of the HTML4 and HTML5 meanings. Since for the elements of this article, the HTML5 semantics seem to be a narrowing of the HTML4 ones, the union of the two is the same as the HTML4 semantics. Note that this does not apply to the <cite> element, where HTML5 permits usage that HTML4 did not.

  • Thank you all for your comments!

    @Vladislav — thank you!

    @Daniel Baird — Nope :) Those are notes on when elements other than <em> would be more appropriate.

    @Jimmy — The “typographically italicized text” and “typographically emboldened text” meanings have not changed at all compared to HTML4. The additional semantics give us some preset ways of marking up content, and in general map to how <i> and <b> are used. The somewhat confusing terminology is part of making these media independent (what does “italic” sound like in a speech reader?).

    Finally remember that intuitive naming only applies if you speak English — <b> has zero connection to 太字 (futoji, the Japanese equivalent). You’ll be relieved to know Japanese web developers are not proposing to replace <b> with the <f> element ;-)

    @Andrew Vit — over-specified? [blink] Well I guess it does take all the fun out of semantic debates ;-) In reality many of the new HTML5 elements are like this, eg <div class="nav"> vs <nav>.

    The benefits of using specific elements over classes are greater uniformity (class is freeform and generally depends on the author’s language, validators will catch typos on elements but ignore classes), and because we can do things based on agreed semantics. It’s arguable whether we really need <strong> in addition to <b>, but that’s what history has given us.

    By “there are other elements that could use more spec love” do you mean at HTML5Doctor? If so please let us know what you’d like us to cover! If you mean in the spec, I’d say everything has received a lot of love ;-)

    <dfn> indicates the defining instance of a term — but none of the taxonomy terms are being defined (there’s no title on nanotyrannus) as I’m no palaeontologist.

    Re: :first-line, I’m terribly sorry to mislead you but I mistook a browser bug for bad coding on my part (see below). I’ve removed that text from the article.

    Finally by media-independent I mean they’re not just defined based on how they look, so that they still have meaning in non-visual user agents.

  • @Alohci I’d say that text that has been italicised in MS Word is by default prose whose typographic presentation is italicised. I agree there are cases another element may be more appropriate, but it’s representing italic text.

    I agree with you that the new semantics aren’t particularly useful for user agents. However the additional meaning provided by class names will help e.g. in restyling and in identifying all elements of a particular class (for someone inheriting the code etc). It would also be possible to pull data out of a site if this was used consistently (ref: @mpilgrim’s use of <cite> back in the day)

  • Andrew Vit says:

    @oli, thanks for the response. Au contraire ←(lang="fr"), I’m not trying to dismiss the semantic debate, I’m all for it: I just think it’s very muddled in this area. Also, I’m not questioning your interpretation of the HTML5 spec — your article is outstanding — I’m really questioning the reasoning behind the spec itself.

    I don’t think “over-specified” is quite the word I’m looking for, but it’s like these overlapping tags are begging to justify their existence by adding more specs to prop them up.

    Here we have 4 generic tags that essentially mean bold or italic on the surface, but can mean any number of things underneath. Since these styles are conventionally applied to such a variety of content, the spec has to be burdened with arcane rules to explain when to use which one, and the author is burdened with unnecessary (academic or arbitrary) choices. Note that the “author” can be software, so these rules can never be adequate. It’s impossible to nail down something that’s wide open for interpretation when all we really need to distinguish is bold/italic, and whatever semantic meaning it has can be done by other means.

    I understand and embrace the motives behind semantic elements. I just can’t imagine a scenario where your “agreed semantics” for a generic bold/strong or italic/emphasized element would have any useful meaning by itself.

    Let’s say you want to collect the japanese sushi names, or taxonomies from a science article. You can’t rely on <i> alone, you need those class/lang attributes for additional meaning. <em> and <strong> are no different in this regard.

    As for the semantic meaning of <em> and <strong>, is there any practical boundary between them, independent of how they’re rendered? Whether you call something “important” or “emphasized” seems like hair-splitting. (Note how the spec authors themselves are just swapping words: “stronger emphasis” is now “strong importance”!?) The only practical consideration here is to choose the one that renders logically, whether in a visual or aural user agent, because otherwise they seem equivalent.

    (In aural user agents, is there a difference between “alternate voice” and “stress emphasis”, which are the touted differences between <i> and <em>? Italics cover both of these in print, wouldn’t the result be the same thing in speech as well?)

    Also, any text, or block element, or group of elements could be considered “important”. An inline element to mark importance doesn’t really fit the pattern for anything except text. (A paragraph, a list, or a form element could be considered important; headings are block elements with implicit importance.) I would derive that importance is really an inherent attribute, not a separate element itself. And similar to the point above, is generic importance semantically meaningful by itself, outside of any scope? Why not just have a class for it instead?

    many of the new HTML5 elements are like this, eg <div class="nav"> vs <nav>

    Why not <voiceover>, <taxo title="common house fly">, <idiom lang="jp"> in that case? Obviously, because there would be hundreds of these…

    Yes, more of these elements are wonderful when there’s an obvious need, and I appreciate them. But for bold/italic passages, the uses are so varied and generic that the argument is against em/strong as truly “semantic” tags.

  • John Faulds says:

    I’ve already been using <i> in the way that it’s now been redefined by HTML5 so I’m happy that the usage has now been formalised. I’d never really found a good use for <b> though so I think what’s being proposed in the spec is a good thing and will lead to me finding more uses for it in future.

  • UltraBob says:

    Great article, but a question unrelated to the core content:

    :first-letter is restricted to being applied to block elements, but surely with a p:first-letter {} style and HTML like

    <p><b>It should come as no surprise</b> that …<p>

    ‘I’ would still be the first letter in the paragraph.

  • UltraBob says:

    Obviously the second p tag should have a /. Typing HTML entities on an iPhone sucks.

  • HrvojeKC says:

    I have a problem with this:
    The level of nesting represents the level of emphasis.

    Does this mean that for very strong emphasis you would write something like this?




    Help!


    If so, then it’s not a good specification.

  • HrvojeKC says:

    damn, I forgot to change the code, here it is:

    <strong>

    <strong>

    <strong>Help!</strong>

    </strong>

    </strong>

  • @Andrew Vit — I agree we probably would have been fine with just <b> and <i>, but history has given HTML5 four elements to define. The new definitions make their use more clear-cut than HTML4, and I personally see a difference between verbally stressing something (which I’ve done a lot in my comments) and marking it as important (which I’ve done in the article). I’d advise checking the WHATWG spec which goes into a little more detail, but I think this is just a byproduct of not being able to make a clean break, both in having four elements and in using elements over say role="".

    Most authors can (and probably will) ignore these new definitions, but that’s fine too — they’re backwards-compatible with HTML4 and modeled on common use.

    @John Faulds — <b> as a keyword was pretty hard to do examples for ;-) Google cache search results were using <b><mark>term</mark></b> (<b><b>term</b></b> for me now), but I can’t really use those as examples.

    @UltraBob — thank you for that! I’m embarrassed to say when building the sample page I came across this and never did an alternate browser check. I thought it was a mistake I’d made, but it’s a bug in Google Chrome instead. I’ve updated the article, regret the error, and will submit a bug report for my penance.

    @HrvojeKC — as it’s relative importance you’d only need that nesting if you actually had three levels of importance ;-) Now that tag clouds are explicitly defined as lists I think it’s unlikely you’d need more than two levels in practice.

  • zcorpan says:

    The glossary should have prognosis links to this article.

  • Simon, you’re right, they should be there now.

    Rich

  • […] The i, b, em, & strong elements | HTML5 Doctor […]

  • Mike Wyatt says:

    but acronym and abbr were too confusing ..

  • @Mike Wyatt — umm, you’ve lost me there :) For the record <acronym> has been cut and only <abbr> remains in HTML5.

  • Mike Wyatt says:

    that’s exactly the joke

    the W3C threw semantics out the window when they removed the acronym tag from the spec, now this?

  • @Mike Wyatt — surely they threw semantics out the window way earlier when they added all these presentational elements to the spec? ;-) AFAIK there was no difference in rendering or AT use of the two elements (despite the intent), so basically they were duplicates. I hope we get a way to distinguish initialisms and acronyms sometime though.

  • Steven Black says:

    I predict that no end of confusion will come of this.

    What’s worse: there’s no way out. Once semantics are re-defined you’re stuck. The ensuing confusion is permanent.

  • Great article, thanks.

    I have a travel blog and as such I was using <span lang="nl" title="something">iets</span> for words and phrases in another language, but now I will start using <i> for it, although it’s going to take me a while I think, I’m very used to thinking of <i> and <b> as bad practices tags.

  • Joe says:

    I see this might be more of an accessibility issue. I’m thinking screen readers… <i>, <b>, etc can all help readers present content, verbally, to users with disabilities. You might now want <em> to be spoken differently, such as for an album title or a song title, but emphasis on a phrase or expression, <i> would work wonderfully.

  • Joe says:

    I fail at the internet.

  • @Steve Black — so business as usual then? :D <b> and <i> had no semantics, and <em> and <strong> are more realigned than redefined. I don’t think this will break the interwebs.

    @Joe — oh no, you win the internet!

  • […] The i, b, em, strong elements, written by Oli Studholme at HTML5Doctor.com (2010). […]

  • The unstoppable force that is Richard Ishida has written an article on <b> and <i>. He notes that one reason it’s important to use classes is for localisation, as foreign languages (such as Japanese) may not use bold and italic styling. If you’ve used <b> and <i> (and other elements) semantically, with meaning-based class names, it will be much easier when localising to restyle these phrases appropriately.

  • Geeves says:

    Thank goodness someone wrote this article. It annoys the crap out of me, especially when WYSIWYGs, show ‘B’ buttons in their UIs and insert a <strong> tag. I’m so happy these are given appropriate semantic definitions now.

  • Sasha says:

    I look forward to learning about how different user agents, including screen readers, render these elements in practice over the next couple years. That’ll have a much greater impact on how I use them than these borderline-impenetrable definitions.
    I’m also wondering at what point HTML5 might begin to feel cluttered with arcane semantic elements that are rarely used in practice, and if this might result in needlessly bloating storage and memory requirements for user agents.

  • […] if you have good markup: clearly differentiating between a <b> and a <strong> element (one is semantic while the other denotes importance) and structuring the document so that an <h1> comes first and it ends in paragraphs […]

  • […] no longer valid attribute, must have href. address – new rule when address is in article tag b – stylistically offset em – stress emphasis hr – paragraph-level thematic break i […]

  • […] These tags have all been given semantic meaning, since their use is so prevalent. Thank goodness! The HTML Doctor has a great overview on intended usage. […]

  • Francesco says:

    I may be coming a little late, but I’ve got a question about when to use b and when to use strong.

    In a blog post (typically) to show some words are more important than others should I use strong or b? Because the importance of the words makes me choose strong, but after all those important words may also be considered keywords… and the spec says that b is for keywords (too). So I’m a bit confused about that.

    Thanks.

  • @Sasha — Browsers will do the same as they’ve always done: e.g. for <strong> and <b> they’ll use font-weight: bold; for both. Not sure what you mean about screen readers rendering things ;p Also I think you mean “arcane semantic elements that I rarely use in practice”, no? ;p

    @Francesco — <b> is for keywords that are not important, so in your case just use <strong>

  • […] example, now might be a good time to address use of the lang attribute. HTML5 Doctor provides an excellent article on the same topics I’m covering […]

  • […] b and i, given semantic meanings to make them media-independent. You can find out some-more here: The i, b, em, and strong elements and The small and hr […]

  • Evert says:

    b — was bold, now for “stylistically offset” text, such as keywords and typographically emboldened text

    Oh lol, the semantic web is getting semantic. “stylistically offset” seems just another word for bold in this case. What is the semantic value of “stylistic offset”? Really, this seems a bit stretching it to me.

  • @Evert —

    “stylistically offset” seems just another word for bold in this case

    which case are you referring to? Not, perhaps, the second example where the text is not bold? I mean, did you even read the article? <_<

  • Evert says:

    @Oli
    I was referring to the <b> element as that is what is defined as ‘stylistically offset’ (in the spec no less). The fact that the example is not bold doesn’t mean that it isn’t just an example of style (bold, italics, caps, etc.) rather than semantics.
    To use b when no other elements are appropriate just makes it a substitute for span.
    I also find your last remark impolite.

  • @Evert — Regarding just another word for bold in this case, the spec for <b> only mentions the word “bold” once (to say <b> does not necessarily mean font-weight: bold;), and none of the spec’s examples display emboldened text. Because of this I leapt to the conclusion that you were referring to this article, which reiterates <b> != font-weight: bold; several times. I’m still unsure where you’re getting the definition of “stylistically offset” being font-weight: bold; from in the spec.

    Your statement To use <b> when no other elements are appropriate just makes it a substitute for span is also incorrect. The same logic applies to most other phrasing elements. As you read in the article, in HTML5 there is some semantic meaning, in addition to presentational meaning. For example, a text-to-speech browser might emphasise <b> differently to <span> with font-weight: bold; associated with it. To reiterate, Only use <b> when there are no other more suitable elements. If you’re faking a ransom note with @font-face, then <span> with font-weight: bold; might indeed be more appropriate.

  • Evert says:

    @Oli: The whole point just became moot as the spec seems to have been changed since I read it last (a month ago?). The b-element now reads:
    “The b element represents a span of text to which attention is being drawn for utilitarian purposes without conveying any extra importance and with no implication of an alternate voice or mood, such as key words in a document abstract, product names in a review, actionable words in interactive text-driven software, or an article lede.”

    Nowhere can I find the words “stylistic offset” any more in that section of the spec.
    But to respond to your comment. I guess I didn’t make myself clear by using the word “bold” where I should have used (generic) “style”. I was just trying to make the point that that *main* description for an element made it non-semantic, despite the fact that examples gave it possible semantic use cases. In other words, even though the element could be used semantically, it was described as non-semantic. But now it is. :-)

  • @Evert — damn that Hixie and his relentless improvements eh :D Also thank you for your explanation. I’d like to change that “stylistically offset” quote you found troublesome to make it clearer (especially as it’s no longer in the spec), but it’s hard to sum up the current definition in a phrase huh. I’ll have a think about it. I agree with you that <b> is not terribly semantic, but I still think keeping its new semantics in mind will help us use it well when appropriate.

  • :O says:

    <i> <b> <em>

    IBM?

    :O

  • We need an article about the recently reintroduced “u” element (and its fuzzy usage…). It’s not even in the HTML5 Doctor’s “Element Index”!

  • Shaun Moss says:

    As a teacher of HTML, I think that the only valid reason for retaining the <b>, <i>, <u> and <s> tags is backwards compatibility. To a new student, it’s quite obvious that <b> means "bold", <i> means "italic", <u> means "underline" and <s> means "strike-through". It’s confusing to try and suggest otherwise with complex bandaid definitions.

    Attaching long and complex diatribes about semantics as an attempt to fit past design choices to current requirements really detracts from the simplicity of HTML – and it is meant to be simple, because the web is meant to be as accessible as possible (and that includes web page authoring). In many ways HTML5 is a vast improvement, and an impressive achievement – but these attempts to shoe-horn complex semantic rules into old style tags are a definite mistake.

    The real solution is simple – just say that <b>, <i>, <u> and <s> are style tags, not dissimilar from <font>, and therefore obsolete since the advent of CSS – however, they are retained for backwards compatibility, since they are so widely used. Certain semantic tags are rendered italic by default, such as <em>, <address>, <var>, <cite>, etc., but you can create other kinds of italic elements using CSS classes (so, use instead of <i>, which is semantically correct). Same situation applies for bold, underline and strike-through.

    In the case of <u> and <s>, they should have done the same thing as with <b> and <i>, i.e. introduced tags to represent the most common semantic meaning of the tag (e.g. instead of ), and deprecated the old style tags. This would have been more consistent than the current spec.

  • […] importance. More on this (and other similar changes) within the HTML5 spec can be found on the HTML5 Doctor […]

  • Mark says:

    What?

    I’ve read this article a couple times and I still don’t understand.

    If <i> is for non-semantic stylistic or typographic uses, what the hell is <i class="taxonomy"> and <i lang="ja-latn">? That’s exceedingly semantic to me. Wouldn’t <em> be better in those cases?

    (I wish you had a preview or edit on comments.)

  • @Shaun Moss — these elements are indeed still around mainly for backwards compatibility reasons. Given all the content that uses them, browsers must support them, even if the spec labelled them obsolete. However I disagree that <em> can replace <i lang="ja-latn">. A screen reader user probably wouldn’t want every Japanese word emphasised. It’s also not obvious to students that “<b> means bold” etc if they don’t speak English, in the same way that “<太字> means bold” is (I’m guessing) not obvious to you ;)

    While the definitions may seem more complex, by being detailed they give us a lot of guidance on when (and when not) to use these elements. This is a good thing IMO.

    @Mark — The class value “taxonomy” in <i class="taxonomy"> has no inherent semantic meaning. The class is only to help you style all instances the same. As the article tried to convey, the semantics associated with <i> are “text in an alternate voice or mood, or otherwise offset from the normal prose” (but not emphasised). The <em> element is for emphasising text, so unless you want your taxonomic references or romanised Japanese words etc emphasised (e.g. read in a different voice by a screen reader) <em> wouldn’t be appropriate.

  • Bert says:

    @Shaun & @Oli:
    Backwards compatibility on a spec is silly. As long as a browser can render an element it can be used and is therefore backwards compatible.
    To say the i, b,etc are there to make HTML5 backwards compatible with HTML4.1 simply is not true. For example; frames have been removed from the spec but browsers can still render them. In fact there was a discussion about that on the list a year ago I think where it was put that just because the spec no longer supports them this doesn’t mean you can no longer use them. Especially not since the version has been removed from the doctype.
    The new doctype just indicates standards-mode and using deprecated stuff (like frames) together with new stuff (like canvas) on the same page does NOT malform the page nor does it invoke quirksmode. The only thing that will happen is that it may no longer validate, but is that really such a big deal? A lot of valid things one can do to a page can make it invalid (using some inline elements as a block for example). As far as I can tell this is intended behavior and is even described in the spec (intended or not) by making sure “errors” in documents do not interfere with rendering.

    Anyway, backwards compatibility of element names in the spec is simply not an issue, neither paper nor browsers care what you name an element, people do. And as such the spec has assumed (wrongly in my opinion) that certain element names should be retained out of historic point of view. Personally I think we just should have been given a generic semantic element (SEM for example) and the ability to make our own elements as needed. Future specs could then have derived new elements from the use of that. The rest of the i, b, u, etc. should just have been deprecated.

  • @Bert — re: “To say the i, b,etc are there to make HTML5 backwards compatible with HTML4.1 simply is not true”, I agree. Their backward compatibility is a side-product. The goal was finding actual semantic use cases for what were solely presentational elements. The reason for doing this is, as you mentioned, they’ll be supported in browsers regardless given legacy content, so we may as well find a use for them if we can.

    I disagree with your suggestion of a <sem> element, as we already have two suitable elements: <div> and <span>. While you’re not alone in believing HTML5 should have been more open-ended, there are already several ways to extend HTML5: classes, microdata, data-*, and RDFa. Common coding patterns have directly led to a bunch of the new HTML5 elements, and will in the future too.
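
    As a rough illustration of those four extension mechanisms, here is a minimal sketch. The class name, the data-rank attribute, and the schema.org vocabulary with its placeholder values are all assumptions chosen for the example, not anything prescribed by the spec or the article:

    ```html
    <!-- class: a hook for styling and scripting, with no inherent semantics -->
    <i class="taxonomy">Anaxyrus</i>

    <!-- data-*: private custom data for the page's own scripts (attribute name is hypothetical) -->
    <i class="taxonomy" data-rank="genus">Anaxyrus</i>

    <!-- microdata: itemscope / itemtype / itemprop (vocabulary chosen for illustration) -->
    <p itemscope itemtype="https://schema.org/Person">
      <span itemprop="name">Example Person</span>
    </p>

    <!-- RDFa: vocab / typeof / property (same illustrative vocabulary) -->
    <p vocab="https://schema.org/" typeof="Person">
      <span property="name">Example Person</span>
    </p>
    ```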

  • […] any semantically meaningful elements in HTML at all if that’s the case. HTML version 5 is redefining some elements to have better semantic meaning because HTML is the language of authors, and to authors and consumers meaning […]

  • Missy says:

    Let’s not forget that these elements have more meaning to assistive technologies than they do for regular visual browsers.

    The differences between i/b and em/strong are quite significant, particularly to users who, by nature of their disability, can only hear the web.

    To deprecate the elements in favour of a more customisable single semantic element would (I think) undermine accessibility.

  • Chris H says:

    Personally, I hate using i and b, and I hate it when I see them used incorrectly in someone else’s source. The HTML5 spec isn’t very unambiguous, so I’m always debating with myself which to use. One week I think b is appropriate, then I think back on it and decide i is better. They should just get rid of them completely and use spans for stylistic offset.

  • Alice Wonder says:

    I like this.
    Currently I use


    <span class="taxonomy">Anaxyrus</span> is from the Greek <span class="greek">άναξ</span> and means sovereign or king.

    My taxonomy class uses blue color with italicized text. My greek class specifies Palatino Linotype followed by DejaVu Serif if Palatino Linotype is not available (Palatino Linotype has the most attractive polytonic Greek IMHO).

    It looks like now I can replace those spans with i’s – just making sure to update the greek class so it uses a normal face and not italic.

    The difference is largely academic, but it does give a nice unarticulated non-textual annotation. Wait, maybe I should be using u …

    OK – seriously, I think this is silly. It seems to me (as in the case of u) that semantic meanings are being made up to justify the continued presence of these strictly presentation tags.

    No offense intended to Ian, I’m just calling it like I see it. I will continue to use span, because visual markup is exactly what I’m specifying and visual markup belongs in the style sheet.

  • Shaun Moss says:

    “It seems to me (as in the case of u) that semantic meanings are being made up to justify the continued presence of these strictly presentation tags.”

    That’s exactly the case! HTML5 seeks to be backwards compatible (at least partly) so that the spec doesn’t break existing websites. However, it also tries to fit these old tags into the semantic-tags-only paradigm – the result being utterly ridiculous definitions.

    When I teach HTML I don’t use HTML5 definitions. I say b is bold, i is italic, u is underline and s is strikethrough. Seems somehow simpler, yes?

  • @Alice Wonder — Actually <i> would be an inappropriate element for the Greek in your example. It’d be better if you denoted Greek text using a lang attribute. For modern Greek that’d be lang="el". So your code example should read:

    <i class="taxonomy">Anaxyrus</i> is from the Greek <span class="greek" lang="el">άναξ</span> and means sovereign or king.

    I also changed your taxonomy <span> to <i>. Of course it’s no problem if you keep using <span> there, and declare .taxonomy {font-style: italic;} in your CSS.
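
    A minimal sketch of how the styles for that markup might look. The colour and font stack come from the description earlier in this thread; the exact declarations are assumptions for illustration only:

    ```html
    <style>
      /* Taxonomic names: blue and italic, whether marked up with <i> or <span> */
      .taxonomy { font-style: italic; color: blue; }
      /* Greek text: upright face, using the font stack described above */
      .greek { font-style: normal; font-family: "Palatino Linotype", "DejaVu Serif", serif; }
    </style>

    <p><i class="taxonomy">Anaxyrus</i> is from the Greek <span class="greek" lang="el">άναξ</span> and means sovereign or king.</p>
    ```

    Because both rules key on the class rather than the element name, swapping <i> for <span> (or back) would not change the rendering.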

    @Alice & Shaun —

    It seems to me (as in the case of u) that semantic meanings are being made up to justify the continued presence of these strictly presentation tags.

    While you may regard them as strictly presentational (and if so it’s fine to use something more semantic instead), there are semantic uses that some people are using these elements for, as I covered in But do we need <u>?.

    @Shaun — so I’m guessing you haven’t taught any blind people about HTML yet then? ;)

  • Francesco says:

    @Oli, why would <i> be inappropriate for the Greek text? Can’t it be used for “foreign words”? The lang attribute is, of course, a good thing, but couldn’t it be applied to <i>?

  • Bert says:

    Besides using the lang attribute for foreign languages, I think in this case you should also use a dfn? Because in essence you are defining (explaining) the meaning of the word.

  • Przemek says:

    @Francesco

    I think Oli meant that some typographers say that if you have a word in a different alphabet (such as Cyrillic or Greek) then you don’t have to use extra alteration, as it looks different already.

  • Alice Wonder says:

    Right, when it is a different alphabet it is standard to put them in italic – that’s written presentation though and is media dependent.

    If i really is semantically used for foreign words, then it shouldn’t matter what the character set is, it’s a foreign word.

    So if you are going to justify continuing to use i for foreign words for semantic reasons, then set up your style sheet to not italicize it when it is using a different character set.

    But it looks like the retention of i is based on all kinds of different semantic implications, which to me means it really is not semantic at all.

    em has a clear semantic meaning. I see em, and I know it means emphasis. I see i, what is the semantic meaning? There are several different possibilities, no? Same with u and b. They are presentation tags, they are not semantic tags.

    With respect to the dfn tag, I’m not sure explaining the etymology of a scientific name needs the tag. I’m not sure what value it would add to the content or how it would enrich the content in any media.

  • @Francesco — interesting. I’ve always perceived using <i> for foreign languages to only be appropriate when the language is transliterated. For example:

    <!-- transliterated Japanese (rōmaji): <i> ok -->
    We ate <i lang="ja-latn">unagi</i>…
    <!-- Japanese (hiragana): <i> not ok -->
    We ate <span lang="ja">うなぎ</span>…

    This is in part based on the spec’s “(content whose typical typographic presentation is italicized)”, which I note has been removed since I wrote this (quote updated). You’re definitely right on using lang, but I disagree with using i for all foreign words (including those in non-alphabetic scripts) — it’s not current web usage or traditional print usage, would be a CSS maintenance nightmare, and most importantly is not covered by the definition of <i>. For that use case we already have the lang global attribute.

    The print tradition of italicising transliterated words is a hint to the reader that they’re not expected to be familiar with it and that it’s not a typo. For example the Japanese loan-word idoru (アイドル) is from the English word “idol”, but has a subtly different, culturally specific meaning. Without italicising it a reader might just think it was a typo. As Przemek points out, using a language in a different writing system is enough to “offset” something from normal prose. Non-transliterated instances of foreign languages in English prose should not use <i>.

    The spec specifically says “idiomatic phrases”, meaning phrases like de facto. I’ve filed a bug (“‘an idiomatic phrase from another language’ doesn’t cover non-idiomatic transliterated foreign words”) to clarify the use for transliterated foreign words and phrases. While I’m not backed up by the spec (at least not yet ;), I’ve amended the article so it doesn’t appear to apply to all foreign words (which was never my intention).

  • @Bert — <dfn><i class="taxonomy">Anaxyrus</i></dfn> may be more appropriate if this was the defining instance of this term in the article.

  • Max West says:

    As a web developer I have one thing to say when it comes to the confusing proliferation of text tags above: Not going to waste time on learning them.

  • Trukfit says:

    @Dr. Oli Studholme I agree with you.

  • bogdan says:

    I like the title :D “i, b, em”.

  • Sinterbear says:

    As I see it, i, b, and other glyph variants are intended to amplify the meaning of the words they adorn. If you look at their history in printing, I think you can say that, in part, they compensate for the lack of gesture and posture in written communication. As such, their use is highly personal and what they convey is almost completely subjective.

    Although their use has been standardized for very narrow purposes in very narrow domains (Chicago Manual of Style, etc.), I see no advantage whatever to attempting to attach meaning to them at this late stage in the evolution of markup.

    Indeed, doing so is likely to create confusion and misunderstanding when even a carefully crafted style sheet is used by an audience that has no understanding of its origins or intentions. This is typical on the Internet.

    Still, the machinery introduced by HTML5 will have its value as clever people learn how to exploit it to solve technical problems. Personally, I have doubts about the value of device independence. It tends to distract from the larger problem of audience independence, which is far more difficult to achieve even in trivial cases.

  • So basically, still does what does and still does what does – and vice versa.

    To me, b stands for bold and i stands for italic, it’s simple, easy and logical, and takes less time to type over a period of years(!) – so that’s what I’ll continue to use until I find a good reason not to, that is, it breaks!

  • libmult says:

    Great article with useful examples. The new meaning of i was clear to me, but I was wondering what would be a reason for b to exist, as syntactically different from strong. Your example with the paragraph’s first line in bold or in small-caps is probably the best use for a stylistically different text (that can be bold as in the HTML4 spec, or not).

    In my opinion, this level of precision between elements is more useful in big stories with dialogues etc. than in small Web applications, because no one knows how it will be stylized or read in the future: imagine a powerful voice synthesizer for blind people that would handle alternate moods: a text with the correct tags would sound way more natural.

    To those who don’t understand these differences between emphasis, voice difference, alternate mood and so on, I recommend reading or writing theatre scripts or dramatic plays; these kinds of elements are plentiful there.

  • Mattia says:

    Thanks for this wonderful article. I wrote something similar for us Italians; you can check it on my personal blog! http://www.mattiafrigeri.it/articoli/web/html5-corretto-uso-tag-i-b-em-strong/

  • […] HTML5Doctor, The i, b, em, & strong elements […]

  • Joseph Ocena says:

    Why use element compared to font-weight:bold? Is there any significant reason?

  • C. May says:

    Knock, knock: Real world, here.

    Could someone explain to me how any of this hair-splitting, evolving, tired procedure-over-substance semantics is of any real use to anyone?

    Little of any of this appears to me to be advancing the field. Rather, it smacks of inefficiency. It is akin to sand thrown upon the gears of progress; it even displaces true advancement.

    Personally, I will try my best to ignore it, while seeking to hasten its demise. There are real world issues to solve in the industry, true improvements to be made instead of wasting time of justifying the existence of backward compatible relics. Or does that just require too much assiduity and intellectual discipline?

    Let’s get back to work, shall we?

  • Ibrahim says:

    The button element is great, it is fantastic with javascript. It doesn’t quite work in a form though, because Internet Explorer chooses not to follow specification and sends the button’s inner HTML instead of its value, which is a bit annoying. My solution to make it work in forms is not to have it send anything, and to put the information to be sent into a hidden input field. That doesn’t solve the problem if you wanted conditional submit buttons though. So if you want multiple submit buttons with different values, you’d have to stick with input buttons.

  • gotofritz says:

    using <b> and <i> as-is (with no classes) means you’re just using them for typographic effect — no semantics at all.

    This is quoted in an answer by @Dr. Oli Studholme above, but where is that stated in the specs? I don’t see it anywhere. All I see in the specs is that they list the difference between HTML 4 and HTML5 – including, indeed, em, strong, i, b (see https://www.w3.org/TR/html5-diff/#changed-elements ). Someone already asked this but Dr Studholme didn’t answer that question.

    Not only that, but in the specs (although in a non-normative section) it says explicitly

    The majority of presentational features from previous versions of HTML are no longer allowed.[…]It is also worth noting that some elements that were previously presentational have been redefined in this specification to be media-independent: b, i

    That seems to contradict what this article says (as far as the specs go; in the real world we all know it’s different, see icons in the popular Bootstrap framework etc)

  • […] For further reading on this subject, check out HTML5Doctor’s article, The i, b, em & strong elements. […]
