Semantic HTML - Introduction

Chapter 18 25 mins

Learning outcomes:

  1. What is semantic HTML
  2. Benefits of using semantic HTML
  3. Semantic elements in HTML5

Introduction

Right from the very first version of HTML, the markup language has had semantic elements in it to convey the meaning of what they mark up.

However, with HTML5 — in particular, with its introduction of a host of new semantic elements — the stress on semantics in HTML became more evident. We now talk about semantic HTML.

In this unit, we shall explore what exactly is meant by semantic HTML, also referred to as semantic markup; why there's so much fuss about it; and finally, throughout the remaining chapters, a collection of nine semantic elements introduced in HTML5.

What is semantic HTML?

Before we unravel the meaning of semantic HTML, it's helpful to first understand the meaning of the word 'semantic'.

According to Merriam-Webster:

[Semantic means] of or relating to meaning in language.

In simple words, when we speak of semantic HTML, we are talking about HTML in terms of the meaning it conveys.

The word 'semantic' is not unique to HTML speak; it's prevalent throughout programming languages and even natural languages. Regardless, whenever the term arises, it implies the same thing, i.e. something related to meaning.

As simple as that.

So we now know that semantic HTML talks about the meaning of HTML. But what exactly does this mean? Well, let's get into some discussion...

Recall the <h1> element. What does it represent? Or let's put it this way: What does it mean? The <h1> element represents a top-level heading of a document, that is, the main heading. Alright.

Now recall the <p> element. What does it mean? Well, <p> means a paragraph.

Now recall the <ol> and <ul> elements. What does each of them mean? <ol> represents an ordered list while <ul> represents an unordered list.

So what is this all?

In each of these examples, we looked at a few semantic elements in HTML. Each of the given elements carries some meaning with it, which gives context to its usage in a document.

For example, consider a document made up of an <h1>, followed by a <p>, followed by an <ol>. Using the known semantics of these elements, we can immediately reason about the code. That is, we can tell what the code is meant to represent: a document with a main heading, followed by a short paragraph, followed by a list of items.
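As a minimal sketch of the kind of document described above (the text content is placeholder, assumed purely for illustration), the markup might look like this:

```html
<!-- A main heading, then a short paragraph, then an ordered list of items -->
<h1>Learning HTML</h1>
<p>Here are the steps we'll follow in this guide.</p>
<ol>
   <li>Set up a text editor</li>
   <li>Write the markup</li>
   <li>Open the page in a browser</li>
</ol>
```

Even without rendering it, the element names alone tell us the role of each piece of content.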

This is essentially semantic HTML.

Semantic HTML isn't a completely new idea in HTML; it has existed from day one. It just wasn't emphasized the way it is now, especially since the advent of HTML5.

However, this is NOT to say that every single element in HTML is semantic in nature. Not at all.

Some HTML elements, such as <b> and <i>, were originally meant for purely presentational concerns and carried no semantic value (HTML5 has since given them loose meanings, but they remain far less descriptive than, say, a heading). Similarly, some elements are purely generic, not tied to any particular kind of content; specifically, these are the elements <div> and <span>.

Both <div> and <span> do NOT really classify as semantic HTML elements, in the sense in which we've discussed semantic HTML above, because they don't tell us what they're trying to represent.

For a reader of the code, giving a class to a <div> or <span> might help in understanding the element's intended semantics for the underlying document, but the element itself still doesn't convey that information.

The <div> and <span> elements do have a meaning associated with them: a <div> represents a generic block of content, while a <span> represents a generic, inline piece of content. It's just that this meaning isn't truly semantic, since it doesn't convey what specific kind of content is at hand (e.g. a heading, a list, etc.).

Benefits of semantic HTML

At this stage, we have a sound idea of what exactly semantic HTML is. But we still don't know what exactly its benefits are.

Can you think of any?

Well, there are quite a few, as we discuss below.

Accessibility

A large part of advancements in web technologies and web development is dedicated to accessibility.

Accessibility is about creating content and experiences for all kinds of people, including those with disabilities (cognitive, motor, sensory, and so on).

Many people use websites with the help of screen readers. These are assistive technologies, typically software, that help people who are blind or have low vision (as well as some people with cognitive or learning disabilities) understand the content of websites by reading it out loud.

And it's more than just reading out content: screen readers, along with accessibility settings in browsers, can also describe what a given piece of content is (e.g. a heading, a paragraph, a link, etc.).

Semantic HTML is HTML that carries meaning with it, right? This meaning is used by browsers and screen readers in creating accurate and more meaningful experiences for people using them. If we just use <div>s and <span>s throughout our documents where other elements make more sense, this will hurt the accessibility of our websites.

Browsers don't label the non-semantic HTML elements <div> and <span> to mean anything. Likewise, when their content is read out by screen readers, listeners don't get a true sense of what that content actually represents in the HTML document.

For example, a heading denoted using a <div> might be styled with a large font size using CSS to look like a heading. However, it won't create an accessible experience for the end user, because screen readers would treat it as plain content and not as a heading.
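To make the contrast concrete, here is a small sketch. The class name page-title, the inline style values and the heading text are hypothetical, chosen only for illustration. Both snippets can be made to look alike on screen, but only the second one carries heading semantics.

```html
<!-- Looks like a heading (via CSS), but is just generic content to assistive tools -->
<div class="page-title" style="font-size: 2em; font-weight: bold;">Our services</div>

<!-- Exposed by the browser as a level-1 heading -->
<h1>Our services</h1>
```

Screen readers typically let users jump from heading to heading; in a sketch like this, the <div> version would simply not show up in that navigation.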

SEO

This one is really important to note and must be read very carefully in order to avoid confusion as to how precisely semantic HTML benefits SEO.

If you don't know about it, SEO stands for Search Engine Optimization, and it refers to the practice of improving the ranking of websites on search engines.

The field of SEO is pretty huge and complex, which follows from the complexity of search engines themselves.

There are numerous factors that affect the rankings of websites on search engines and even how the websites appear in search engine results pages, or SERPs for short (e.g. as special snippets or just as normal suggestions).

As we know by now, semantic HTML conveys meaning to anyone reading the HTML. Therefore, it makes perfect sense to think that it might somehow affect SEO as well.

So does semantic HTML really affect SEO?

The real answer is that it depends entirely on the search engine in question and how it treats semantic HTML (i.e. whether it lets a website with good content and semantic HTML move up in the rankings). We can't give a definitive answer to this.

Fortunately though, the biggest search engine, Google, has a lot to say about SEO and semantic HTML.

Semantic HTML and Google

In particular, for Google, semantic HTML does NOT really count as a direct ranking signal in evaluating the ranking at which a website must be shown.

However, semantic HTML might help the search engine's crawler better understand the content and its structure, which might indirectly lead to a higher ranking owing to better comprehension of the content. (Notice the emphasis on the word 'might' here; nothing definitive can be said about how search engines weigh semantic HTML.)

Semantic HTML can also lead to a better presentation of a website in the results when it does show up. For example, content that is clearly marked up with semantic HTML might be easier for Google to pick up and display as a special snippet in its search results, although rich results in particular mainly depend on structured data.

But this is only about Google; other search engines may well have their own rules and regulations regarding semantic HTML.

Nonetheless, it won't be wrong to claim that semantic HTML does affect SEO in a good way, not necessarily in the ranking sense but in the sense of leading to better evaluated and understood content, which is vital for any search engine.

In short, semantic HTML has its effects on SEO.

To learn more about the benefits of semantic HTML in terms of accessibility and SEO, consider the article Why Accessible HTML Matters - Momentic Marketing.

Readable code

Even if we set aside the above two benefits, using semantic HTML would still have one benefit left, and an equally important one: helping produce readable code.

As you start developing complex websites, you'll be writing a lot of HTML — like seriously, a lot of it. You'll be going through your own HTML from time to time and usually making changes to it every now and then when the needs of the underlying website or web app change.

And when this happens, the last thing you'll want is to spend a substantial amount of time figuring out the nature of content that's buried in generic HTML.

If you go with semantic HTML from day one, such a scenario won't pop up, at least not for HTML. If the HTML is marked up using semantic elements, you won't have a hard time understanding the nature of your content.

In addition to this, when the time comes to read someone else's code, which is a very common thing when you're working in teams and corporate environments, you'll thank that other person for using semantic HTML, if they did.

You'll immediately be able to realize that, let's say, a given element represents a main heading, another one represents a paragraph, another one represents a dialog box, one represents the document's header, and so on and so forth.

As a more concrete example, let's consider two HTML code snippets, one without semantic elements (only sticking to <div>s and <span>s) and one with them. Just try to make sense of what the HTML is trying to showcase, only by reading the code.

Also note that the following code snippets use dummy, placeholder lorem ipsum text for the sake of keeping the focus on the meaning of the content by virtue of its containing element, not its text.

Here's the document without any semantic elements:

<div>
   <div>Lorem ipsum</div>
   <div>Tenetur dolorem quam, eum <span>excepturi reprehenderit</span> repellat, natus est explicabo quas impedit.</div>
</div>

<div>
   <div>Lorem ipsum dolor sit</div>
   <div>Aliquid cumque harum provident sunt laboriosam porro, quas, quia, explicabo doloribus perferendis earum reprehenderit. Deleniti eius esse blanditiis accusantium?</div>

   <div>Nihil repellat vel quae</div>
   <div>Aliquid cumque harum provident sunt laboriosam porro, quas, quia, explicabo doloribus perferendis earum reprehenderit. Deleniti eius esse blanditiis accusantium?</div>
</div>

<div>
   <div>Fugiat reiciendis totam ipsam tenetur eveniet.</div>
</div>

So what can you make of this code?

What does the first <div> represent? What does the last <div> container represent? What does the inner <span> in the third <div> represent?

Can't tell, right? Neither could we!

Here's the same document albeit with semantic elements:

<header>
   <h1>Lorem ipsum</h1>
   <p>Tenetur dolorem quam, eum <strong>excepturi reprehenderit</strong> repellat, natus est explicabo quas impedit.</p>
</header>

<article>
   <h2>Lorem ipsum dolor sit</h2>
   <p>Aliquid cumque harum provident sunt laboriosam porro, quas, quia, explicabo doloribus perferendis earum reprehenderit. Deleniti eius esse blanditiis accusantium?</p>

   <h2>Nihil repellat vel quae</h2>
   <p>Aliquid cumque harum provident sunt laboriosam porro, quas, quia, explicabo doloribus perferendis earum reprehenderit. Deleniti eius esse blanditiis accusantium?</p>
</article>

<footer>
   <p>Fugiat reiciendis totam ipsam tenetur eveniet.</p>
</footer>

Can you now tell what the document is meant to represent? Well, yes, why not?

We have a header, followed by an article, followed by a footer.

The header encapsulates the main heading (<h1>) of the document along with, possibly, a short description (given by the <p>). The footer possibly contains some disclaimer for the document, or some other closing information. The element in between possibly defines the main content of the document, containing headings and paragraphs.

Obviously, the code doesn't give the complete picture of the nature of the underlying document, but at least it's much better than the previous code without any semantics whatsoever.

We will be exploring the new elements here, i.e. <header>, <article> and <footer>, in detail in the upcoming chapters, but we really don't need to know much about them to understand the HTML. The chosen nomenclature for the elements is good enough and pretty self-explanatory in describing their underlying semantics.

Talking specifically about the aforementioned code, <header> represents the document's header, <article> represents the main content, and <footer> represents the footer.

Pretty basic, isn't it?

Now ask yourself the question: Which code was more readable? The second one, right? That's all because of the semantic markup that it contains.

To boil it all down, semantic elements help produce meaningful HTML that:

  • leads to a more accessible experience of the underlying document,
  • can be better evaluated by search engines to potentially affect SEO in a positive sense, and
  • is highly readable and self-explaining to a reader of the code.

Semantic elements in HTML5

As we stated above, semantic elements were always a part of HTML, right from day one.

But once <div> and <span> were available as generic containers (block-level and inline, respectively), which was the case by HTML 4.0, developers started to use them for all kinds of different purposes, e.g. to denote headers, footers, sections, and addresses.

Even though these <div>s, together with the class attribute and with the help of CSS, could be styled and laid out in all kinds of ways to represent different sections, HTML somewhat lost its trait of meaningfully marking up content.

This meant one thing: more elements could be added to HTML to help prevent such kinds of usages of <div> elements where, otherwise, semantic elements would really make more sense.
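As a rough sketch of this shift, compare the two styles below. The class names header, nav and footer are hypothetical, picked only to illustrate the common habit; the "..." stands for whatever content the container holds.

```html
<!-- The older habit: generic containers labelled only by class names -->
<div class="header">...</div>
<div class="nav">...</div>
<div class="footer">...</div>

<!-- The HTML5 way: dedicated elements that carry the meaning themselves -->
<header>...</header>
<nav>...</nav>
<footer>...</footer>
```

In the first style, the meaning lives only in names that the author happened to choose; in the second, it is part of HTML itself.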

And that's exactly what happened in HTML5.

HTML5 brought with it a surge of brand new, oven-fresh semantic elements, each representing a different kind of content in an HTML document.

Most of these elements were meant to provide, in addition to meaning, more structure to HTML documents by containing other elements.

For example, as we shall learn later on in this unit, the <header> element introduced in HTML5 is meant to contain other elements inside it that altogether represent the header of something; it itself doesn't denote any elementary piece of content in HTML like, let's say, an <h1>, <p> or <a>.
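A minimal sketch of this idea, with placeholder text assumed for illustration, might look as follows. The <header> adds no content of its own; it only groups the heading and the paragraph and marks them, together, as a header.

```html
<header>
   <!-- The elementary pieces of content live inside the container -->
   <h1>My website</h1>
   <p>Notes on building things for the web.</p>
</header>
```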

The term 'semantic HTML' really gained prominence following the HTML5 specification, thanks to its appreciation of the fact that the web needed more HTML elements to help convey the true meaning of content on it.

After HTML5's inception, the web gradually started to acknowledge and adopt these new semantic elements. Browsers began to support them, resources began to teach them, and developers began to use them.

The most notable of the semantic elements that we'll be covering in this unit are listed below. All of them were introduced in HTML5, except <address>, which predates it:

Element      Meaning
<header>     The header of the webpage or a given section.
<footer>     The footer of the webpage or a given section.
<address>    Contact details of an individual or company, including email addresses.
<article>    A self-contained piece of content.
<section>    A generic section of a webpage.
<aside>      Content not directly related to other content on the webpage.
<main>       The main content of a webpage.
<nav>        Navigational content, containing links to other webpages.
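To see how these elements might fit together, here is a sketch of a page skeleton that uses each of them once. The text, link targets and email address are placeholders assumed for illustration, and this is only one of many reasonable arrangements.

```html
<!DOCTYPE html>
<html lang="en">
<head>
   <meta charset="utf-8">
   <title>Example page</title>
</head>
<body>
   <!-- Header of the webpage, holding the main heading and the navigation -->
   <header>
      <h1>Example site</h1>
      <nav>
         <a href="/">Home</a>
         <a href="/about">About</a>
      </nav>
   </header>

   <!-- Main content of the webpage -->
   <main>
      <!-- A self-contained piece of content -->
      <article>
         <h2>An article title</h2>
         <section>
            <h3>A section within the article</h3>
            <p>Some placeholder text.</p>
         </section>
      </article>

      <!-- Content that sits beside the main flow -->
      <aside>
         <p>A side note or a list of related links.</p>
      </aside>
   </main>

   <!-- Footer of the webpage, holding contact details -->
   <footer>
      <address>
         Contact: <a href="mailto:someone@example.com">someone@example.com</a>
      </address>
   </footer>
</body>
</html>
```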

Note that this isn't an exhaustive list by any means, since if we did give one, it would only unnecessarily complicate things for now.

We'll see all semantic HTML elements — in fact, all HTML elements — in this course so there's no need to worry about missing any element.

Rest assured, you'll get to know them all!
