Your Questions #13

The clinic is getting busy with more HTML5 ailments. This week, we’ll cover server-side validation, immutable images with <canvas>, retrieving drawn objects from a <canvas>, creating custom tags, the role attribute, and the effects of <hgroup> on SEO.

Server-side validation

Brian asked:

We currently use PHP-Tidy to validate the HTML mark-up of content regions of our site. Our content regions have images that represent modules in our system that get translated into code for things like processing video, forms, etc. I’ve been making the move to upgrade these modules to use new HTML5 elements, but I’m finding that Tidy is stripping out these elements and I’m doing a lot of dancing around these issues right now. Are you aware of any server-side processing scripts that have been updated to work with HTML5 and perhaps with ARIA support as well?

The current contender is html5lib. The PHP version is v0.1, so YMMV.

It seems that HTML5 support won’t be coming to Tidy anytime soon.

Good luck! Peace – Oli

Immutable image with <canvas>

Mike asked:

We have a medical app and we want to use <canvas> to let docs draw over a still picture. The only thing I cannot do is keep the picture unchanged when they use an “eraser” (basically a white pen). I haven’t found any example on the web, so maybe it’s impossible. It seems that the background image, whether it is part of the canvas or of a div containing the canvas, becomes part of the canvas itself. If it’s possible, what’s the secret? Thanks

This is possible in a variety of ways. Here’s a quick demonstration of one of the solutions. Open the demo up in Firefox 3.6 or later, drop an image in the box, draw over it, and click the “Drawn image as PNG” button to retrieve what you drew (without the background image).

Here’s what’s happening: the canvas is sitting inside of a <div> containing my still image, but I’m drawing on the nested <canvas> (I think you had it the other way around, which was causing your problems).
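As a rough sketch of that layering (not the demo’s actual code), assume a wrapper <div> that shows the photo as its background and a transparent <canvas> of the same size nested inside it. The ids, the file name, and the function names below are hypothetical. The pen draws onto the canvas, and the eraser clears canvas pixels back to transparent rather than painting white, so the picture underneath is never touched.

```javascript
// Assumed markup (hypothetical ids and file name):
//   <div id="photo" style="background: url(xray.png)">
//     <canvas id="sketch" width="400" height="300"></canvas>
//   </div>

// Draw one segment of a pen stroke on the overlay canvas.
function drawStroke(ctx, from, to, color, width) {
  ctx.strokeStyle = color;
  ctx.lineWidth = width;
  ctx.lineCap = "round";
  ctx.beginPath();
  ctx.moveTo(from.x, from.y);
  ctx.lineTo(to.x, to.y);
  ctx.stroke();
}

// Erase a square centred on (x, y) by making those pixels transparent again.
// The image lives in the <div> behind the canvas, so it shows through unchanged.
function eraseAt(ctx, x, y, size) {
  const half = size / 2;
  ctx.clearRect(x - half, y - half, size, size);
}

// Browser usage (not run here):
//   const ctx = document.getElementById("sketch").getContext("2d");
//   drawStroke(ctx, { x: 10, y: 10 }, { x: 80, y: 40 }, "red", 3);
//   eraseAt(ctx, 40, 25, 20);
```

Because the canvas only ever holds the drawing, exporting it with toDataURL() should give you the strokes without the background image.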

Hope that helps, Remy

Retrieving objects from <canvas>

DJ asked:

I wanted to know if there is any way in which the drawn objects say rectangle, circle, line, … within canvas can be identified based on the selection at later point in time (after they are drawn).

If there are no direct APIs, how could we achieve it? Do we have to store the coordinates of each created object and identify the object based on the mouse cursor? Thanks

Unfortunately, there’s no way to retrieve these objects unless you write your own system to handle it. There’s no native support for this in <canvas>.
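If you do roll your own system, one possible shape for it is sketched below: record every object as you draw it, then test the mouse position against that list. All the names here (shapes, addRect, addCircle, shapeAt) are made up for illustration, and only rectangles and circles are handled.

```javascript
// A tiny scene list: every shape drawn on the canvas is also recorded here.
const shapes = [];

function addRect(id, x, y, width, height) {
  shapes.push({ id, type: "rect", x, y, width, height });
}

function addCircle(id, cx, cy, radius) {
  shapes.push({ id, type: "circle", cx, cy, radius });
}

// True if the point (px, py) lies inside the given shape.
function contains(shape, px, py) {
  if (shape.type === "rect") {
    return px >= shape.x && px <= shape.x + shape.width &&
           py >= shape.y && py <= shape.y + shape.height;
  }
  if (shape.type === "circle") {
    const dx = px - shape.cx;
    const dy = py - shape.cy;
    return dx * dx + dy * dy <= shape.radius * shape.radius;
  }
  return false;
}

// Return the topmost (most recently added) shape under the point, or null.
function shapeAt(px, py) {
  for (let i = shapes.length - 1; i >= 0; i--) {
    if (contains(shapes[i], px, py)) return shapes[i];
  }
  return null;
}

addRect("box", 10, 10, 100, 50);
addCircle("dot", 100, 50, 20);
```

In a browser you would convert the mouse event’s clientX/clientY into canvas coordinates (for example by subtracting the values from canvas.getBoundingClientRect()) before calling shapeAt().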

It sounds like you actually need to use SVG, which will allow you to hook event listeners and query the DOM tree that’s created. If you need some convincing that SVG is the right tool for the job, have a look at the Raphaël JavaScript Library. It draws with SVG (falling back to VML in older versions of IE) and is able to create some very impressive drawings and animations.
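To give a feel for the difference, here is a small sketch using the plain DOM API rather than Raphaël. The helper name addClickableRect and the id "drawing" are assumptions for illustration. Each shape is a real element, so it can carry its own listener and be found again later with the usual DOM queries.

```javascript
const SVG_NS = "http://www.w3.org/2000/svg";

// Create an SVG <rect>, attach a click handler, and append it to the given <svg> element.
// `doc` is the document object; it is passed in so the helper has no hidden globals.
function addClickableRect(doc, svg, attrs, onClick) {
  const rect = doc.createElementNS(SVG_NS, "rect");
  for (const [name, value] of Object.entries(attrs)) {
    rect.setAttribute(name, String(value));
  }
  rect.addEventListener("click", onClick);
  svg.appendChild(rect);
  return rect;
}

// Browser usage (not run here), assuming an <svg id="drawing"> in the page:
//   const svg = document.getElementById("drawing");
//   addClickableRect(document, svg,
//     { id: "box", x: 10, y: 10, width: 100, height: 50 },
//     (event) => console.log("clicked", event.target.id));
```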

Cheers, Remy

Custom tags

Mike asked:

So by using some JavaScript I can insert unfamiliar tags into IE, and using CSS I can format them. My question is why stop with HTML5 tags like section and nav? What are the pros and cons of custom tags like <content>, <story>, <comment>, <blog> or even <mike>, <was>, <here>?

Custom elements go against having a standard like HTML5. Standards map out the set of elements, attributes, and APIs that the browsers need to implement so web developers can use them, and they provide those developers with a common approach to marking up a web page.

If custom elements were allowed, we would have an infinite number of ways to mark up content, many of which would share a common goal but require different implementations. As an example, here are a number of different elements I can dream up for some primary content: <article>, <blog>, <entry>, <post>, <page>, <main>, <primary>, <content>, <document>, <doc>, <blogpost>, <publication>, <thenameofthearticlewithnospacesorpunctuation>, <item>, <block>, <blob>, <text>, <txt>, and <paper>. Many of them are bad ideas, but hopefully you see my point. This doesn’t even account for the more predictable <contentone>, <contenttwo>, <articlefifty> that would likely also be used.

This sort of markup would make HTML a nightmare to maintain. A developer coming into an existing site would have to learn which elements have been used and what their purpose is. And it’s not just painful for developers. Browser vendors would have to find ways to parse these elements and define how they should be used. Is this element supposed to be block-level? Is it interactive? Should it impact the document outline? And what about search engines? How do they know that <myobscureelement> defines the most important content on the page, the content that should really be indexed?

Standards narrow the possibilities and ensure developers, browsers, and machines (search engines and the like) are all speaking the same language. Many people spend a great deal of time debating the specification, trying to reach consensus on which proposals should be standardized and how they should be implemented.

So stick to the standards! They exist for everyone’s benefit. As browsers continue to implement the specification correctly (even IE is catching up), our jobs will be made easier and we can spend more time creating really cool things ;)

John Allsopp’s fantastic article Semantics in HTML5 goes into more detail about this issue. You can also see where some of the element names came from by looking at the work Hixie did researching class names in Google’s index.

Regards, Mike

The role attribute, SEO, and <hgroup>

Robson asked:

What about the role attribute? Was it dropped from specification? What will be the “role” of the role attribute in HTML5?

Today, only the home page should have the name of the site in an H1 element. On other pages, the H1 should be used for the title of the article. How do search engines interpret HGROUP and multiple HEADER and H1 elements today? How can we implement HGROUP and HEADER today without impacting SEO? Thanks, guys!

To answer your first three questions, role is in. You can use it belt-and-suspenders style until assistive technology catches up with HTML5. Just be careful: “Authors may use the ARIA role and aria-* attributes on HTML elements, in accordance with the requirements described in the ARIA specifications, except where these conflict with the strong native semantics described below”.
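Most people simply write the role in the markup, e.g. <nav role="navigation">. If you would rather add the redundant roles with script, a sketch might look like the following. The function name and the choice of elements are assumptions; the three mappings match each element’s native semantics, and anything that already has a role is left alone.

```javascript
// Landmark/document roles that match the native semantics of these HTML5 elements.
const FALLBACK_ROLES = new Map([
  ["nav", "navigation"],
  ["aside", "complementary"],
  ["article", "article"],
]);

// Add the matching role to each element unless the author already set one.
function addFallbackRoles(elements) {
  for (const el of elements) {
    const role = FALLBACK_ROLES.get(el.tagName.toLowerCase());
    if (role && !el.hasAttribute("role")) {
      el.setAttribute("role", role);
    }
  }
}

// Browser usage (not run here):
//   addFallbackRoles(document.querySelectorAll("nav, aside, article"));
```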

For the second part of your question, that’s not true. You can use more than one <h1> in HTML 4/XHTML 1. It’s not advised to make every heading an <h1> in HTML 4/XHTML 1 (because historically some people did that to spam), but it may be appropriate in some cases (e.g., site name and page title). Using two <h1>s on a single page has no effect on SEO.

With regard to <hgroup> and <header>, you’re asking the wrong question. Search engines care about high-quality, relevant content. Search engines penalise spamming, but writing markup according to the specification is not spamming. The HTML5 editor works at Google, so they’re well aware of the spec. html5doctor.com has implemented <hgroup> and <header>, and it hasn’t hurt our search engine rankings any ;-)

For more, see our articles on the header element and the hgroup element.

You probably don’t want to use <h1> everywhere anyhow, as CSS selectors are not that smart. If you wrap every <h1> to <h6> in a sectioning element (<section>, <article>, <nav>, <aside>), you don’t have to worry about keeping a logical order for your headings. Doing this means you don’t need to overwrite CSS as much. The old style, however, with the requirement to keep a logical order for your headings, still works.

Again, you’re concerned with the wrong thing. Good SEO = good content. Worrying about placement or what search engines think is a waste of time — worry about good content.

Peace – Oli

Got a question for us?

That wraps up this round of questions. If you’ve got a query about the HTML5 spec or how to implement it, you can get in touch with us and we’ll do our best to help.

7 Responses on the article “Your Questions #13”

  • Shelley says:

    One of the real problems with using “custom” HTML elements, without the use of namespaces, is that there could be name collision at some point in time. If a person uses a reasonable element name such as “content” today, and “content” is added in HTML6, then their use of content could conflict with the new standard use of the element “content”.

  • Felix says:

    I think the problem with the “Immutable image” is because Mike wants to erase a part of the canvas, but instead he just draws with white color. White!=transparent! You maybe want something like clearRect()

  • Brian LePore says:

    Wow, I asked that a while ago …

    I just wanted to follow this up by bringing up that you can add HTML5 elements to Tidy using the new-blocklevel-tags and new-empty-tags configuration settings. They do not enforce things like the source element only appearing inside of video and audio tags, but if your output is already valid it will keep them from being stripped and keep the markup as-is.

  • @Shelley, good point on custom elements.

    @Felix thanks for that.

    @Brian, as you can imagine we’ve got quite a backlog! Thanks for the update.

  • Regarding H1-H6, my rule of thumb is only one h1 on a page… usually the same with H2 if there’s a sub-header… ex: h1 = page title, h2 = site title… then h3 for section headings, and h4-h6 for subsection headings. It’s just cleaner imho. Though I’ve also been limiting my use to [article] [hgroup] [h3] section [/h3] [/hgroup] [details] … [/details] [/article] as a general pattern… Not a rule perse but a reasonable organizational pattern for content.

  • Brad says:

    Hi,
    Great stuff on this site with regards to HTML5 :)

    The problem I am having is the following:
    I use a video and a canvas tag and capture a frame from the video tag and place it into the canvas tag, but when I create a reference of the canvas in javascript so that I may return the dataURL it bombs out.
    Here is the code I use to place the frame from the video:
    _________________________________________________________________
    var video = document.getElementById("video");

    var canvasDraw = document.getElementById('imageView');
    var w = canvasDraw.width;
    var h = canvasDraw.height;
    var ctxDraw = canvasDraw.getContext('2d');

    ctxDraw.clearRect(0, 0, w, h);
    ctxDraw.drawImage(video, 0, 0, w, h);
    _________________________________________________________________
    The above works perfectly.
    Below is the code to get the dataURL:
    _________________________________________________________________
    function getURIformcanvas() {
      var ImageURItoShow = "";
      var canvasFromVideo = document.getElementById("imageView");
      if (canvasFromVideo.getContext) {
        var ctx = canvasFromVideo.getContext("2d"); // Get the context for the canvas.

        ImageURItoShow = canvasFromVideo.toDataURL("image/png"); // <-- It fails on this line.
      }
      var doc = document.getElementById("txtUriShow");
      doc.value = ImageURItoShow;
    }
    _______________________________________________________________
    It always fails on the line:
    var ImageURItoShow = canvasFromVideo.toDataURL("image/png");

    Any thought on what might be the problem. If I load a normal image into the canvas it works fine, but as soon as I load the image from video into the canvas that line fails.

    Any ideas?
