Computer says NO to HTML5 document outline


What a brilliant idea!

[Image: Patrick Lauke with red lips and styled hair]

[Image: pink unicorn unleashes H1's for everyone]

For the longest time HTML5 specified, and advised developers, that the number (1 to 6) in a heading element no longer mattered (when used in conjunction with sectioning elements). What mattered was the nesting level of the h1-h6 in sectioning elements, just like the X<H>TML promised land, but better as it recycled current heading elements. This concept was embraced by many a web standards aficionado and has been spread far and wide by web standards evangelists, in speeches, articles and books.

How the outline should work: using nested section and h1 elements

<body>
 <h1>top level heading (parent sectioning element is body)</h1>
 <section>
 <h1>2nd level heading (nested within one sectioning element)</h1> 
  <section>
  <h1>3rd level heading (nested within 2 sectioning elements)</h1> 
  </section> 
 </section>
</body>

document outline:

 → top level heading
 → → 2nd level heading
 → → → 3rd level heading
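The intended mapping from nesting to outline level can be sketched in a few lines of Python. This is a simplification for illustration, not the HTML5 outline algorithm itself: it assumes every heading's level is just one more than the number of sectioning elements around it, and it ignores subsections implied by lower-ranked headings. The names `NestingOutline` and `nesting_outline` are made up for this sketch.

```python
from html.parser import HTMLParser

SECTIONING = {"section", "article", "aside", "nav"}
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class NestingOutline(HTMLParser):
    """Records (level, text) per heading, where level = sectioning depth + 1."""

    def __init__(self):
        super().__init__()
        self.depth = 0          # number of open sectioning elements
        self.outline = []
        self._in_heading = False
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in SECTIONING:
            self.depth += 1
        elif tag in HEADINGS:
            self._in_heading = True
            self._text = []

    def handle_endtag(self, tag):
        if tag in SECTIONING:
            self.depth -= 1
        elif tag in HEADINGS and self._in_heading:
            self.outline.append((self.depth + 1, "".join(self._text).strip()))
            self._in_heading = False

    def handle_data(self, data):
        if self._in_heading:
            self._text.append(data)


def nesting_outline(html):
    parser = NestingOutline()
    parser.feed(html)
    parser.close()
    return parser.outline


html = ("<body><h1>top</h1><section><h1>second</h1>"
        "<section><h1>third</h1></section></section></body>")
print(nesting_outline(html))  # [(1, 'top'), (2, 'second'), (3, 'third')]
```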

Trouble in outline nerdvana

Document outline semantics exposed by browsers and assistive technology:

→ top level heading
→ top level heading
→ top level heading

Brilliant as it is, this idea as specified has not been taken up by user agents. So after 7 years or more we have a concept without interoperable implementations (super sad face).
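To make the gap concrete, here is a rough Python sketch of the heading-level view, assuming that what gets exposed is simply the number in the tag name with sectioning elements ignored. The class and function names are hypothetical, and real browsers and accessibility APIs are of course not implemented this way.

```python
from html.parser import HTMLParser


class HeadingLevelOutline(HTMLParser):
    """Records (level, text) using only the number in the h1-h6 tag name."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None      # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._text = []

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)


def heading_level_outline(html):
    parser = HeadingLevelOutline()
    parser.feed(html)
    parser.close()
    return parser.outline


html = ("<body><h1>top</h1><section><h1>second</h1>"
        "<section><h1>third</h1></section></section></body>")
print(heading_level_outline(html))  # [(1, 'top'), (1, 'second'), (1, 'third')]
```

Every nested h1 comes out at level 1, which is the flat outline shown above.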

For the last few years, the HTML5 specification has included a warning about the lack of implementations and has suggested that the document outline algorithm not be relied upon to convey heading semantics to users. Recently this has been taken a step further. Now the HTML 5.1 specification requires developers to use h1-h6 to convey document structure. The simple reason for this change is that the HTML5 document outline is not implemented and, despite efforts to get it implemented, the general response from user agent developers has not been enthusiastic. You can read the updated advice and requirements in the HTML 5.1 specification.

Comments or questions? Bring ’em on!

PS: If you find any bugs in the HTML 5.1 spec you can open an issue or send a pull request.

Update 21/06/16 – Heading-level outline view

You can now check the heading-level outline of a page using the W3C HTML checker or the W3C markup validation service (same output, different UI), with thanks to Mike[TM]Smith. It is provided alongside the structural outline, so you can compare semantic reality and theory.

outline view screenshot

Example of heading-level outline and structural outline displayed by the Nu HTML Checker

32 Responses on the article “Computer says NO to HTML5 document outline”

  • Rob says:

    Sad face. I loved the section headings cause it fell in line with the idea of reusable content. It makes me question what the purpose of sectioning is for at all. It also makes me wonder if I should just use divs and spans for everything.

    • @Rob, section and article still have meaning and their meaning is conveyed to assistive technology users:

      Assistive Technology may convey the semantics of the article to users. This information can provide a hint to users as to the type of content. For example the role of the element, which in this case matches the element name “article”, can be announced by screen reader software when a user navigates to an article element. User Agents may also provide methods to navigate to article elements.

      http://w3c.github.io/html/sections.html#the-article-element

      Likewise for the section element.

    • Patanjali says:

      Sectioning is still important for semantics, as it indicates the contextual scope of a particular block of content, and which headers apply to it.

      While browsers may not support them very deeply, you can use them within your own framework for a site to provide functional semantic support. As tag names, they make for more readable css and javascript querySelector statements than classes.

      Even using them loosely still allows search engines to make some reasonable stabs at using structure as part of the 100s of criteria for ranking pages.

      This is separate from non-semantic constructions that would use divs, such as for showing/hiding blocks of alternate content of a section, such as AJAX or plain form versions, depending upon facilities like whether javascript is enabled (forget the extremely limited noscript element).

      Divs and spans provide absolutely nothing on which even rudimentary structure detection tools in current and future browsers could discern content extents and hierarchy.

      This will probably be the biggest legacy of HTML5, though I wish they had standardised on pure XML with graceful browser interpretation of lesser rigours, rather than taking a step back.

      I also wish we had XML/Xpath instead of css, but that is on a hope list. It would certainly make for more flexible selections, especially filtering.

  • Alohci says:

    It’s a great sadness that the browser makers wouldn’t implement it. Had the computed outline levels been exposed through a CSS pseudo-class and a DOM4 element property it would have been of great use. Without those capabilities it was doomed to failure, since the result would have been too unreliable to have only been used by accessibility tools.

    • @Alohci, it is unfortunate. Note that the outline algorithm is still in the spec, as in itself it does not have any user agent implementation requirements (maybe why it wasn’t taken seriously), and it is implemented in some browser extensions and experimental outlining tools. What I have done is to modify author advice and requirements to reflect what works rather than what we wish would work.

  • Alohci says:

    @Steve – the outline algorithm should go too. As I understand it, the rationale for retaining it is that if any HTML consumer wants to create an outline, that’s the algorithm it should use. But it’s a fiction. In practice, pretty much every implementation produces a different result in some scenario. And it’s hardly surprising. A few years ago, I tried implementing it and found it more or less incomprehensible. How the average web author is expected to predict what outline their markup will produce is beyond me.

    And browser makers will simply, wilfully ignore it should they want to implement an outline and discover that they get better results by using some other algorithm. Until there are extant web pages that rely on the outline algorithm to work correctly, the browser makers will not feel bound by what the spec says.

  • Ron Waldon says:

    With a little effort, it’s probably possible to still just use H1 tags everywhere doing your development / writing, but have a publish / render process that switches them to the tiered H1, H2, H3, etc before the HTML is delivered to a user agent.
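A minimal sketch of such a publish step in Python, under some loud assumptions: the markup is well formed, authors use only h1, none of the matched tags appear inside comments or scripts, and a regular expression is good enough (a real build step would more likely use a proper HTML parser). The function name `tier_headings` is hypothetical.

```python
import re

# Matches opening/closing sectioning tags and h1 tags, keeping any attributes.
TAG = re.compile(r"<(/?)(section|article|aside|nav|h1)\b([^>]*)>", re.IGNORECASE)


def tier_headings(html):
    """Rewrite each h1 to h(depth + 1), capped at h6, based on sectioning depth."""
    depth = 0

    def replace(match):
        nonlocal depth
        closing, name, rest = match.group(1), match.group(2).lower(), match.group(3)
        if name == "h1":
            level = min(depth + 1, 6)
            return f"<{closing}h{level}{rest}>"
        # Sectioning element: adjust the depth, leave the tag untouched.
        depth += -1 if closing else 1
        return match.group(0)

    return TAG.sub(replace, html)


source = ("<body><h1>a</h1><section><h1>b</h1>"
          "<section><h1>c</h1></section></section></body>")
print(tier_headings(source))
# <body><h1>a</h1><section><h2>b</h2><section><h3>c</h3></section></section></body>
```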

  • Ge Ricci says:

    I feel more frustration than sadness. I just can’t understand why something so logical and simple wouldn’t be adopted by web user agents. As others pointed out, becomes meaningless, as does the ability to “extract” and reuse content from it.

  • Nico says:

    This is very sad and just confirms once again where the logic of mere profit is leading us. So we have 3d shaders but not a way to handle sidenotes or text selection.
    Congrats once again to browser vendors for wasting all of our efforts to produce more meaningful and interoperable documents.

  • R.R. Calbick says:

    So whenever I start writing my HTML, I go through and build the various elements with relevant headings. I then run that code through the following site:

    HTML5 Outliner

    So it has been my understanding that this outline is useful for both assistive technology and SEO purposes. Am I now to understand that outlining is no longer relevant? Is it not still better than nothing at all as far as structure, and definitely as part of a larger, more comprehensive SEO effort?

    • So it has been my understanding that this outline is useful for both assistive technology and SEO purposes.

      Unfortunately your understanding is incorrect. It could be useful if implemented. The outliner tool is just an experimental implementation of how the outline algorithm could work. I am unaware of any search engine that makes use of the HTML5 document outline algorithm, and no assistive tech makes use of it as it is not exposed via browser accessibility APIs.

      Creating a document outline is useful for users, it requires using h1-h6 to do so.

    • R.R. Calbick says:

      Well this definitely seems unfortunate. Google recommends using them to assist users, as you mentioned, and to make your content more structured, illustrating key points and sections if used appropriately. So we will continue using them as we have, in conjunction with everything else (proper use of HTML tags, schema.org, etc.).

      But in all honesty, I’m somewhat hazy on how sectioning and/or heading tags should be used instead, as has been implied or suggested…could you clarify this please? Maybe I’m simply not fully understanding the implications of what you’ve stated in the article.

      • This WCAG 2.0 technique may be helpful: Using h1-h6 to identify headings. If you want to use h1-h6 in conjunction with sectioning content, ensure that the rank of the heading matches the sectioning element nesting level:

        <body>
         <h1>top level heading</h1>
         <section><h2>2nd level heading</h2>
          <section><h3>3rd level heading</h3>
           <section><h4>4th level heading</h4>
            <section><h5>5th level heading</h5>
             <section><h6>6th level heading</h6>
             </section>
            </section>
           </section>
          </section>
         </section>
        </body>
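One way to check markup against this advice is a small script like the Python sketch below. It takes a deliberately strict reading, which is an assumption of the sketch rather than a rule from the spec: every heading is expected to be exactly h(nesting depth + 1), capped at h6, so an h2 sitting directly in body would also be reported. The names `RankChecker` and `rank_mismatches` are hypothetical.

```python
from html.parser import HTMLParser

SECTIONING = {"section", "article", "aside", "nav"}


class RankChecker(HTMLParser):
    """Collects (tag, expected_rank) for headings whose rank differs from depth + 1."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.mismatches = []

    def handle_starttag(self, tag, attrs):
        if tag in SECTIONING:
            self.depth += 1
        elif len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            expected = min(self.depth + 1, 6)
            if int(tag[1]) != expected:
                self.mismatches.append((tag, expected))

    def handle_endtag(self, tag):
        if tag in SECTIONING:
            self.depth -= 1


def rank_mismatches(html):
    checker = RankChecker()
    checker.feed(html)
    checker.close()
    return checker.mismatches


good = "<body><h1>a</h1><section><h2>b</h2><section><h3>c</h3></section></section></body>"
bad = "<body><h1>a</h1><section><h1>b</h1></section></body>"
print(rank_mismatches(good))  # []
print(rank_mismatches(bad))   # [('h1', 2)]
```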
        
  • Kraig Walker says:

    It’s kinda funny how implementers basically managed to push back on something agreed to be a “standard,” almost as if the UN member states chose to veto a motion by failing to turn up to the meeting…

    But another thing I’ve been aware of has been that recycling header levels in a document can make things a bit more challenging for screen reader and keyboard users, so I still stick to some loose rules around using just one h1 on the document for the beginning of the most important stuff.

    • What happened in this case was that no normative requirements made it into HTML5 as user agent implementers were not interested in implementing those requirements. The outline algorithm was in HTML5 (and is still in HTML 5.1), but did/does not include any normative implementation requirements, so implementers are not vetoing anything in the “standard”.

  • Ahmad Alfy says:

    This is really sad :( I got used to the new sectioning elements; they saved me a lot of confusion and were easier to understand. I teach Front-end development at a university and my students picked it up quickly. Abandoning document outline is really terrible.

  • Christian says:

    I had pointed out the problems with the “nested section heading” scenario a long time ago. The whole thing only came about because of people not knowing how to use regular HTML headings in the first place. Either they didn’t know how to count (e.g., jumping from H1 to H4 with no H2 or H3 between them), or they were desperately hoping to be allowed by some web standards entity something or other out there to be able to make EVERY heading in their page an H1 element so that it would “increase their Google juice” or whatever. So when developers were told a few years back to “just put a new H1 every time you have a new SECTION” (and were also told at one point to start a new SECTION every time you had a new H*) a lot of developers cried Hallelujah! But then strangely people didn’t implement it a bunch. So… hmm… strange situation.

    “Now the HTML 5.1 specification requires developers to use h1-h6 to convey document structure.” This is how things were always supposed to be… from the very inception of HTML. Going back to the simple purity of HTML will solve a lot of problems. I dream of a day when developers will know to use an H2 and then an H3 before using an H4. We’re supposed to be intellectual, smart computer geeks, but for some reason a lot of us can’t even do simple counting.

  • Jochem says:

    Hi,

    Maybe it’s too early in the morning and things haven’t sunk in yet properly, but if I skim through the specs, sectioned headings are still valid in the new spec, no?

    Example 32 shows that sectioned headings should still create the same outline as what we’re used to with nested headings.

    For SEO purposes it’s still recommended and reading this part:

    user agents are also encouraged to offer a mode that navigates the page using heading content alone.

    Makes it still important to use sectioned headings so the navigation happens according to the webmaster’s intention.

  • Bill says:

    Seems like an alternate implementation could be sectionheader or articleheader.

  • I’m a journalist turned product owner who’s trying to learn programming, markup and other interesting things, and for the last couple of weeks I have been devouring everything I’ve been able to come across regarding HTML5 and CSS3.

    I adopted the concept of document outline and the nesting level in sectioning elements relatively quickly. The reason for that is quite simple:

    It makes sense.

    (Super sad Puss in Boots face)

    Well oh well. Hopefully I can become a Markup Master anyway, somehow.

  • F*** them — I’m doing it anyway. When you have reusable components with no understanding of its parent structure, outline semantics are a necessity.

    • Well Steve, what we do know is that anything nested in a sectioning element is going to represent at least a h2 as only headings scoped to body are equivalent to h1, so if you are advocating fucking users at least fuck them with a h2.

  • Mark Simon says:

    This is a seriously missed opportunity.

    The whole point of the outlining concept, as far as I can discern, is to allow, for example, an article element to nestle inside a section element without having to re-number the heading level.

    I think the real problem is that consensus may not have been as clear as it seemed from the outside. After all, it’s not as if the browser manufacturers were left out of the picture. Many of the actual specifications are a bit fuzzy, and appear to be the result of compromise rather than consensus.

    However, I thought that some browsers were outline-aware. Firefox, for example, will show article>h1 elements in a smaller font, indicating that it knows about the outline.

    The point is, is it no longer correct to begin an article (or aside etc) with an h1 element?

    • This is a seriously missed opportunity.

      I think we all pretty much agree that in theory the outline algorithm is a good thing. But it has been damaging in practice as it has been evangelized before its time (if there ever will be a time).

      However, I thought that some browsers were outline-aware. Firefox, for example, will show article>h1 elements in a smaller font, indicating that it knows about the outline.

      Some browsers implement CSS rendering of nested h1s as per the algorithm, but that’s it. It’s brittle. In this demo the outline algorithm dictates that the 2 headings (h1 and h6) in the article elements both have a rank of 2 (as they are nested 1 deep) and therefore should be displayed with the same size to reflect this. But they aren’t… Furthermore, size does not convey the semantics robustly at all.

      The point is, is it no longer correct to begin an article (or aside etc) with an h1 element?

      A heading in an article element on a page that has an h1 scoped to body has a rank of 2, so it should be an h2. The only caveat to this is when you have one or more article elements on a page that has no h1 scoped to body (body>article).

      • Mark Simon says:

        I may have misunderstood the intent of outlining.

        I thought that while a nested h1 would be ranked lower, there is no suggestion that a nested h6 would be ranked higher in the absence of higher level headings. The observed behaviour of the major browsers (Firefox, Chrome and Safari – I haven’t tested this on IE or Edge) would be as expected.

        The same would apply even without using sectioning elements. Beginning your page with an h4 and going down from there was tolerated but never recommended. You should always start with h1 and work downwards from there.

        The notion of the HTML5 outline applied the same logic to so-called section elements, and you would be expected to start with h1 again. Starting lower would be creating orphans.

        I do think, however, exposing the outline, say as a JavaScript object, would have been helpful.

        • Alohci says:

          “there is no suggestion that a nested h6 would be ranked higher in the absence of higher level headings.”

          And yet that is what the outline algorithm, rightly or wrongly, requires. Which is why the half-arsed styling applied by the user-agent style sheet for h1 elements inside article elements is so misleading and ultimately unhelpful.

        • I may have misunderstood the intent of outlining.

          Yes you have, like many other people.

          A simple illustration of how the outline algorithm works is to run the demo page I provided through the W3C HTML checker, which provides 2 outline views. If you look at the second outline view (the HTML5 outline algorithm implementation) you will see that the 2 headings are on the same level.

  • Mark Simon says:

    OK, I’ve put a lot of thought into this, and I think that the original concept can be rescued.

    To my mind, there are two approaches to implementing a document outline:

    The approach taken by word processors is to have a linear document, but to imply the outline from the heading levels. This is also the approach taken in HTML using the h1 … h6 elements, though HTML has not traditionally done anything else with them.

    A more structured approach is to explicitly construct the outline using nested structures. This is the approach that HTML5 seeks to introduce.

    The problem arises when both are being used. In principle, you could imply an outline using the heading levels, or create one using nested sections and only h1 headings (creating virtually level-less headings). However, if you start nesting heading levels as well, there is a conflict in interpretation.

    One approach is to disallow heading levels, but, at this point, I think that is an unnatural approach — many people create their document linearly rather than from an outline.

    I think a better approach is to require that all nested sections begin their heading levels from h1 and work downwards. To get to the next level, you either have a lower level heading, or nest a new section element and start again.

    A simple outlining algorithm could be calculated as follows:

    level = heading level + nesting depth

    This would imply that h2 in the body is at the same level as h1 in a nested section, and I think that is intuitive.
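The rule proposed in this comment is easy to try out. The Python sketch below assumes that nesting depth counts open section, article, aside and nav elements, with body at depth 0, which is what makes an h2 in body and an h1 in a nested section both come out at level 2. It illustrates this commenter's proposal only, not the HTML5 outline algorithm, and the names `CombinedLevels` and `combined_levels` are hypothetical.

```python
from html.parser import HTMLParser

SECTIONING = {"section", "article", "aside", "nav"}


class CombinedLevels(HTMLParser):
    """Records (tag, level) where level = heading level + nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # body counts as depth 0
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if tag in SECTIONING:
            self.depth += 1
        elif len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append((tag, int(tag[1]) + self.depth))

    def handle_endtag(self, tag):
        if tag in SECTIONING:
            self.depth -= 1


def combined_levels(html):
    parser = CombinedLevels()
    parser.feed(html)
    parser.close()
    return parser.levels


html = ("<body><h1>a</h1><h2>b</h2>"
        "<section><h1>c</h1><h2>d</h2></section></body>")
print(combined_levels(html))
# [('h1', 1), ('h2', 2), ('h1', 2), ('h2', 3)]
```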

    What about starting nested sections with a lower level? Browsers have always forgiven sloppy coding, and I think that this should be an example of such. I don’t think they should be expected to re-number level headings.

    I know this is not currently advised, but I think it would be easy to understand and implement.
