Html5 Tutorial : Structure Of A Web Page

3. Structure Of A Web Page

  • Before starting to talk about different HTML5 features, let us start with the basics.
  • In this chapter we are going to discuss:

    • How are HTML5 pages written?
    • What has changed compared to HTML4?
    • What are the new tags and the new attributes?

3.1. HTML5 DOCTYPE

  • The release of IE5 for Mac led to a major problem

    • A lot of web content relied on non-standard authoring and no longer rendered correctly in IE5 for Mac, because this browser followed the standards so closely
    • Microsoft therefore decided to use the DOCTYPE declaration to let web developers choose how a web page should be rendered (by activating a browser mode: quirks mode or standards mode)
    • Pages with no DOCTYPE continued to be rendered the same way as before (quirks mode)
  • An HTML page starts with the DOCTYPE declaration

    • This declaration should be the very first thing in an HTML page (in some older browsers, content placed before it, such as a comment, can prevent it from taking effect)
  • There is only one DOCTYPE in HTML5:

    <!DOCTYPE html>
    • This triggers standards mode in the browser
    • It is only 15 characters long, which is much easier than writing something like this:

      <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

3.2. Page Encoding

  • Character encoding is the mapping between what is stored on disk (sequences of 0s and 1s) and the characters you see in the browser
  • Character encoding can be specified in two places:

    • At the server level, in the HTTP headers (the Content-Type header)
    • In the <head> element of a web page using the <meta> tag

      [Note]Note

      The character encoding declared at the server level in the HTTP Content-Type header takes precedence over the one declared in the <head> using the <meta> tag

  • Not specifying a character encoding can lead to serious security issues: http://code.google.com/p/doctype/wiki/ArticleUtf7
  • There is now a short way to declare character encoding:

    ...
    <head>
      ...
      <meta charset="utf-8">
      ...
    </head>
    ...
    [Note]Note

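    The character encoding declaration must appear within the first 1024 bytes of the document (early HTML5 drafts required the first 512 bytes), so place it as early as possible in the <head>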
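
  • As a minimal sketch of the two sections above, here is a complete page that puts the DOCTYPE on the very first line and declares the encoding as the first child of <head>. The title and paragraph text are made up for illustration. The server-level equivalent would be an HTTP response header such as Content-Type: text/html; charset=utf-8.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <!-- Declared first in <head> so that it appears early in the file -->
    <meta charset="utf-8">
    <title>A minimal HTML5 page</title>
  </head>
  <body>
    <!-- Renders correctly only if the file itself is saved as UTF-8 -->
    <p>Accented characters such as é and ü are displayed as expected.</p>
  </body>
</html>
```

  • The declared encoding has to match the encoding the file is actually saved in; the <meta> tag describes the bytes, it does not convert them.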

3.3. HTML5 Markup

  • Among all HTML5 features, improving the language itself was the top motivation to create HTML5
  • HTML today:

    • HTML has structural AND presentational tags. HTML should only care about structure and not presentation

      [Note]Note

      There are also attributes that deal with presentation. We will see later what the HTML5 specification recommends about them.

    • As it is today, HTML semantics are fixed, not extensible

      • Overuse of id and class attributes ⇒ Pseudo-semantic markup
      [Note]Note

      For instance, writing <div id="header"> is as useful as writing <div id="en_tete"> ("en tete" means "header" in French). How can a browser really recognize a header in a website?

  • Overall, the goal of HTML5 is to reflect the web as it is used nowadays

3.3.1. New And Updated Elements

  • HTML5 introduces 28 new elements: <section>, <article>, <aside>, <hgroup>, <header>, <footer>, <nav>, <figure>, <figcaption>, <video>, <audio>, <source>, <embed>, <mark>, <progress>, <meter>, <time>, <ruby>, <rt>, <rp>, <wbr>, <canvas>, <command>, <details>, <summary>, <datalist>, <keygen> and <output>
  • HTML5 also updates some existing elements to better reflect how they are used on the Web, or to make them more useful. For example:

    • The <a> element can now also contain flow content instead of just phrasing content
    • The <hr> element now represents a paragraph-level thematic break
    • The <cite> element now only represents the title of a work
    • The <strong> element now represents importance rather than strong emphasis
    • etc…
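
  • The updated meanings listed above can be sketched in a small page. This is only an illustration: the URL, the post title and the book title are hypothetical.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Updated elements (illustrative)</title>
  </head>
  <body>
    <!-- <a> wrapping flow content: the heading and the paragraph form one link -->
    <a href="/posts/hypothetical-post">
      <h2>A hypothetical post title</h2>
      <p>A short teaser paragraph for the post.</p>
    </a>

    <!-- <hr> as a thematic break between two topics -->
    <hr>

    <!-- <cite> marks the title of a work, <strong> marks importance -->
    <p>As explained in <cite>A Hypothetical Book Title</cite>,
       <strong>always declare a character encoding</strong>.</p>
  </body>
</html>
```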

3.3.2. Structural Elements

  • We previously discussed that there is currently an overuse of id or class attributes combined with the <div> element to describe the structure of a page.
  • Among all HTML5 new elements, let us discuss the following: <header>, <hgroup>, <nav>, <section>, <article>, <aside> and <footer>.

    • <header>: Represents a group of introductory or navigational aids. The <header> can also contain a table of contents, a search form, logos, and navigation blocks (<nav> element).

      [Note]Note

      You can have more than one <header> element per page. This element does not only represent the header of the page; you can also use it in <nav>, <section>, <article> and <aside> elements

    • <hgroup>: Represents the heading of a "section". It is used to group a set of <h1>...<h6> elements when the heading has multiple levels, such as sub-headings, alternative titles, or taglines. For the purpose of document summaries or outlines, the text of the <hgroup> element is the text of its highest ranked child element (<h1>...<h6>)

      [Note]Note
      • The <hgroup> element is not an alternative to the <header> element, since the <header> element does not only contain headings
      • We are going to use an HTML5 outliner (http://gsnedders.html5.org/outliner/) with the following HTML structure:

        <!doctype html>
        ...
        <header>
          <h1>The HTML5 blog</h1> <!-- 1 -->
          <img src="logo.png" alt="logo"/> <!-- 2 -->
        </header>

        1

        This is the title of our header. This is the one to be displayed when outlining our HTML document

        2

        This is for example the logo contained in our header
      • Now let us say we want to add a tagline to our header such as this

        <!doctype html>
        ...
        <header>
          <h1>The HTML5 blog</h1>
          <h2>The coolest HTML5 in the world</h2> <!-- 1 -->
          <img src="logo.png" alt="logo"/>
        </header>

        1

        This is the tagline we have just added
      • Now let us see what happens when producing the outline of the HTML document

        [Figure: outline of the HTML document without <hgroup>]
      • We do not want the tagline to appear in the outline so let us use an <hgroup> element to wrap those headings

        <!doctype html>
        ...
        <header>
          <hgroup> <!-- 1 -->
            <h1>The HTML5 blog</h1>
            <h2>The coolest HTML5 in the world</h2>
          </hgroup>
          <img src="logo.png" alt="logo"/>
        </header>

        1

        Thanks to the <hgroup> element, the only visible heading in the outline will be the top level heading: <h1>The HTML5 blog</h1>
      [Note]Note

      In HTML5, each sectioning element can have its own <h1> element. The nesting of the sectioning elements determines the heading level of each <h1>

    • <nav>: Represents a major navigation block. It groups links to other pages or to parts of the current page. <nav> does not have to be used everywhere links appear. For instance, footers often contain links to the terms of service, the copyright page and so on; the <footer> element alone is sufficient in that case
    • <section>: Represents a generic section of a page. It is a thematic grouping of content, generally with a heading. A web site can be split into major sections, such as the news section, the contact section, etc… <section> elements can contain <article> elements
    • <article>: Think of an <article> element as a part of the page that can be independently distributed or reused. This could be a forum post, a magazine or newspaper article, a blog entry, etc…
    • <aside>: Represents content that is tangentially related to the content around the aside element, and which could be considered separate from that content. For instance, if the <aside> element is nested in an <article> element, it could represent a glossary. If the <aside> element is used outside the <article> element, its content should be related to the whole website, such as a blogroll or advertising
    • <footer>: Represents a footer for its nearest ancestor sectioning content or sectioning root element. Like the <header> element, it can be nested in <nav>, <section>, <article> and <aside> elements. For instance, when nested inside <article> elements, it could contain links to other related blog posts, who wrote the article, number of viewers, etc…
    [Note]Note

    The <div> element can still be used. However, use it with caution, since overusing it can lead to poor accessibility and poor maintainability. Use it only when no other element is suitable for the job, typically for styling purposes or as a convenience for scripting.
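
  • Putting the structural elements described above together, a page could be sketched as follows. This is one possible arrangement, not the only valid one; the link targets, headings and texts are hypothetical.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>The HTML5 blog</title>
  </head>
  <body>
    <!-- Page-level header with the major navigation block -->
    <header>
      <h1>The HTML5 blog</h1>
      <nav>
        <ul>
          <li><a href="/news">News</a></li>
          <li><a href="/contact">Contact</a></li>
        </ul>
      </nav>
    </header>

    <!-- A thematic section containing an independently reusable article -->
    <section>
      <h1>News</h1>
      <article>
        <header>
          <h1>A hypothetical blog entry</h1>
        </header>
        <p>The body of the entry.</p>
        <!-- Nested in the article: related to this entry only -->
        <aside>
          <p>Glossary: terms used in this entry.</p>
        </aside>
        <footer>
          <p>Written by a hypothetical author</p>
        </footer>
      </article>
    </section>

    <!-- Outside any article: related to the whole site -->
    <aside>
      <p>Blogroll</p>
    </aside>

    <!-- Footer links do not need their own <nav> -->
    <footer>
      <p><a href="/terms">Terms of service</a></p>
    </footer>
  </body>
</html>
```

  • Note that no <div> is needed here: every block has an element that describes its role.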

3.3.3. New Attributes

  • HTML5 also comes with a lot of new attributes.
  • Most of them will be discussed in the Forms chapter, in the Audio and video chapter and in the Offline application chapter.
  • The following attributes from HTML4 now apply to all elements in HTML5: class, dir, id, lang, style, tabindex and title
  • HTML5 also comes with these new global attributes:

    • contenteditable indicates that an element is an editable area.
    • contextmenu points to a context menu provided by the author.
    • data-* for author-defined attributes.
    • draggable specifies a draggable element in the drag & drop API.
    • hidden to indicate that an element is not relevant. Browsers should not render elements that are hidden.
    • role and aria-* to allow assistive technologies to deliver the right information to people with disabilities.
    • spellcheck to indicate if content can be checked for spelling or not.
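
  • Several of the global attributes above can be sketched on ordinary elements. The attribute values (product id, price, texts) are made up for illustration, and contextmenu is left out of this sketch.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Global attributes (illustrative)</title>
  </head>
  <body>
    <!-- contenteditable + spellcheck: an editable area with spell checking requested -->
    <p contenteditable="true" spellcheck="true">Click here and edit this text.</p>

    <!-- data-*: author-defined data; draggable: can take part in drag & drop -->
    <ul>
      <li data-product-id="42" data-price="9.99" draggable="true">A draggable item</li>
    </ul>

    <!-- hidden: not relevant for now, so the browser should not render it -->
    <p hidden>This message is not relevant yet.</p>

    <!-- role and aria-*: hints for assistive technologies -->
    <div role="alert" aria-live="assertive">Your changes were saved.</div>
  </body>
</html>
```

  • From a script, the data-* values are available through the element's dataset property, for example dataset.productId for data-product-id.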

3.3.4. Deprecated Elements And Attributes

  • Some elements and attributes are deprecated by HTML5.

    • It is not recommended for web developers (authors) to use these elements/attributes.
    • It is mandatory for user agents to continue to support these elements/attributes in order to support existing content.
  • Elements that are deprecated in HTML5 are: <basefont>, <big>, <center>, <font>, <strike>, <tt>, <frame>, <frameset>, <noframes>, <acronym>, <applet>, <isindex> and <dir> (<u> was dropped in early drafts but remains valid in the final specification, with a redefined meaning)

    [Note]Note

    You can see that most of the deprecated elements, such as <big>, <center>, or <font>, are purely presentational. HTML is about structure, not presentation; CSS is a better fit for presentation.

    [Note]Note

    Most of the attributes that are now deprecated are also only presentational and are better handled by CSS such as align, background, bgcolor, border, etc…
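
  • As a sketch of what "better handled by CSS" means in practice, the page below reproduces with style rules what would previously have been written with <center>, <font>, <big> and the bgcolor attribute. The class names and colors are arbitrary choices made for this example.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Presentation moved to CSS (illustrative)</title>
    <style>
      /* Replaces the bgcolor attribute on <body> */
      body { background-color: #f8f8f8; }
      /* Replaces <center> and the align attribute */
      .intro { text-align: center; }
      /* Replaces <font color="..."> and <big> */
      .highlight { color: #c00; font-size: 1.25em; }
    </style>
  </head>
  <body>
    <p class="intro">
      Welcome to <span class="highlight">the HTML5 blog</span>
    </p>
  </body>
</html>
```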

3.4. HTML5 And CSS3

  • CSS3 is modular and composed of sub-specifications.
  • CSS3 is built upon its previous version but also comes with loads of new features.
  • Among these new exciting features: new selectors, rounded corners, box and text shadows, transitions, animations, transformations, etc…
  • HTML5 is still in development

    • Most browsers are fine with elements they do not recognize (Internet Explorer is the exception, discussed at the end of this chapter), but by default they style such elements as display: inline
    • To be on the safe side until the standard is implemented everywhere, we need to specify the following:

      article, aside, footer, header, hgroup, nav, section {
        display: block;
      }

3.5. Browser Support

  • Good news: the new structural elements we have seen so far work in all browsers except Internet Explorer (version 8 and earlier).

    [Note]Note

    Internet Explorer has a problem when it encounters unknown elements: the DOM it builds is very different from the one we expect from an HTML5 page. Moreover, Internet Explorer does not allow CSS styling on unknown elements.

  • A workaround to this problem is to use Remy Sharp’s HTML5-enabling script: http://remysharp.com/2009/01/07/html5-enabling-script/
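
  • The idea commonly described for this kind of workaround is that creating an element once with document.createElement makes old versions of Internet Explorer recognize it afterwards. The sketch below illustrates that idea only; it is not the content of the script linked above, and the variable names are hypothetical.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Enabling HTML5 elements in old IE (illustrative)</title>
    <!-- Conditional comment: only Internet Explorer before version 9 runs this script -->
    <!--[if lt IE 9]>
    <script>
      // Creating each element once lets old IE parse and style it afterwards.
      var names = ['article', 'aside', 'footer', 'header', 'hgroup', 'nav', 'section'];
      for (var i = 0; i < names.length; i++) {
        document.createElement(names[i]);
      }
    </script>
    <![endif]-->
    <style>
      /* Same rule as in the HTML5 And CSS3 section */
      article, aside, footer, header, hgroup, nav, section { display: block; }
    </style>
  </head>
  <body>
    <header>
      <h1>The HTML5 blog</h1>
    </header>
  </body>
</html>
```

  • The script has to run in the <head>, before the browser parses the elements in the <body>.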

3.6. Quiz

  1. What is the effect of the following declaration?

    <!DOCTYPE html>
  2. In which situations should you consider using the <div> element?
  3. Why should you always specify a character encoding?
  4. The HTML5 specification says that you should have no more than one header and one footer per page:

    • True
    • False
  5. What is the default CSS rule for unknown elements?
  6. Do you have any remarks about the following code snippet?

    ...
    <div id="world">
      <article>
        ...
      </article>
    </div>
    <div id="north_america">
      <article>
        ...
      </article>
    </div>
    <div id="europe">
      <article>
        ...
      </article>
    </div>
    <div id="asia">
      <article>
        ...
      </article>
    </div>
    <div id="africa">
      <article>
        ...
      </article>
    </div>
    ...
  7. Do the two following code snippets have the same outline? If yes, what would be that outline?

    <header>
      <hgroup>
        <h1>Marakana Blog</h1>
        <h2>Learn about open source technologies</h2>
      </hgroup>
    </header>
    <section>
      <h1>Programming</h1>
      <section>
        <h1>Android development</h1>
      </section>
      <section>
        <h1>Web development</h1>
      </section>
      ...
    </section>
    <section>
      <h1>Server Administration</h1>
      <section>
        <h1>Apache & Tomcat</h1>
      </section>
      <section>
        <h1>JBoss</h1>
      </section>
      ...
    </section>

    And

    <header>
      <hgroup>
        <h1>Marakana Blog</h1>
        <h2>Learn about open source technologies</h2>
      </hgroup>
    </header>
    <section>
      <h3>Programming</h3>
      <section>
        <h4>Android development</h4>
      </section>
      <section>
        <h4>Web development</h4>
      </section>
      ...
    </section>
    <section>
      <h3>Server Administration</h3>
      <section>
        <h4>Apache & Tomcat</h4>
      </section>
      <section>
        <h4>JBoss</h4>
      </section>
      ...
    </section>