Semantic HTML is markup that conveys the meaningful relationships between document elements, rather than simply describing how they should look on screen.

The role of structured content

Google Assistant app on iPhone with the results of a “how do I pay a parking ticket in Boston” query, showing results only weakly related to the intended content.

None of the links provided in these Google Assistant results takes me directly to the “How to Pay a Parking Ticket” page, nor do the descriptions clearly let me know I’m on the right track. (I didn’t ask about requesting a hearing.) This is because the content on the City of Boston parking ticket page is designed to convey content relationships visually but isn’t structured semantically in a way that conveys those relationships to inquisitive algorithms.

Design practices that build bridges between user needs and technology requirements to meet business goals are critical to making this vision a reality. Information architects, content strategists, developers, and experience designers all have a role to play in designing and delivering structured content solutions.

The business case for structured content design

In order to tailor results to these specifically formulated queries, software agents have begun inferring intent and then using the linked data at their disposal to assemble a targeted, concise response. If I ask Google Assistant what time Dr. Ruhlman’s office closes, for instance, it responds, “Dr. Ruhlman’s office closes at 5 p.m.,” and displays this result:
You’ll also notice that although Dr. Stacey Donion is an exact match in all of the listed search results (which are plentiful enough to fill the first results page), we’re shown a “did you mean” link for a different doctor. MultiCare does provide semantic, linked data-rich profiles for its doctors.
The equivalent Google Assistant search, however, offers a more helpful result than we see with Boston. In this case, the Google Assistant result links directly to the “Pay My Ticket” page and also lists several ways I can pay my ticket: online, by mail, and in person.
Practitioners from the design community have shared a wealth of resources in recent years on creating content systems that work for humans and algorithms alike. To learn more about implementing a structured content approach for your organization, these books and articles are a great place to start:

These elements, when designed well, communicate information hierarchy and relationships visually to readers, and semantically to algorithms. This structure allows Google Assistant to reasonably surmise that the text in these <h2> headings represents payment options under the <h1> heading “Pay My Ticket.”
Such rapid interactions, however, are only one small part of a much larger issue: linked data is key to maintaining the integrity of content online. The organizations I’ve used as examples, such as the hospitals, government agencies, and colleges I’ve consulted with for years, don’t measure the success of their communications efforts in ad clicks or page views. Success for them means connecting patients, constituents, and community members with services and accurate information about the organization, wherever that information might be found. This definition of success applies to virtually any type of organization working to further its business goals on the web.

The City of Boston website's “How to Pay a Parking Ticket” page, showing a tabbed view of ways to pay and instructions for the first of those ways, paying online.
Google Assistant app on iPhone with the results of a “what time does dr. ruhlman office close” query. The results displayed include a card with “8:30AM–5:00PM” and the label, “Dr. Ruhlman Scott MD, Tuesday hours,” as well as links to call the office, search on Google, get directions, and visit a website. Additionally, there are four buttons labeled with the words “directions,” “phone number,” and “address,” and a thumbs-up emoji.

The prevalence of voice as a mode of access to information makes providing structured, machine-intelligible content all the more important. Voice and intelligent software agents are not just freeing users; they’re changing user behavior. According to LSA Insider, there are several critical differences between voice queries and typed queries. Voice queries tend to be:
In addition to finding and excerpting information, such as recipe steps or parking ticket payment options, search and software agent algorithms now also aggregate content from multiple sources by using linked data.
Structured content is already a mainstay of many types of information on the web. Recipe listings, for example, have been based on structured content for years. When I search, for example, “bouillabaisse recipe” on Google, I’m provided with a standard list of links to recipes, as well as an overview of recipe steps, an image, and a set of tags describing one example recipe:
There’s not enough evidence in this small sample to support a broad claim that algorithms have “cognitive” bias, but even when we allow for potentially confounding variables, we can see the compounding problems we risk by ignoring structured content. “Donlon,” for instance, may well be a more common name than “Donion” and is easily mistyped on a QWERTY keyboard. Regardless, the Kaiser Permanente result we are given above for Dr. Donion is for the wrong physician. Furthermore, in the Google Assistant voice search, the interaction format doesn’t confirm whether we meant Dr. Donlon; it just provides us with her facility’s contact information. In these cases, providing clear, machine-readable content can only work to our advantage.

In a structured content design process, the relationships between content chunks are explicitly defined and described. This makes both the content chunks and the relationships between them legible to algorithms. Algorithms can then interpret a content package as the “page” I’m looking for, or remix and adapt that same content to give me a list of instructions, the number of stars on a review, the amount of time left until an office closes, and any number of other concise answers to specific questions.

HTML structured in this way is both presentational and semantic because people know what headings and lists look like and mean, and algorithms can recognize them as elements with defined, interpretable relationships.
The City of Seattle’s “Pay My Ticket” page, though it lacks the polished visual design of Boston’s site, also communicates parking ticket payment options clearly to human visitors:
This “featured snippet” view is possible because the content publisher has broken this recipe into the smallest meaningful chunks appropriate for this subject matter and audience, and then expressed information about those chunks and the relationships between them in a machine-readable way. In this example, the publisher has used both semantic HTML and linked data to make this content not merely a page, but also legible, accessible data that can be accurately interpreted, adapted, and remixed by algorithms and smart agents. Let’s look at each of these elements in turn to see how they work together across indexing, aggregation, and inference contexts.
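To make that distinction concrete, here is a minimal sketch in Python (standard library only) of how a program might read the same content marked up two ways. The markup, the recipe steps, and the names `StepExtractor` and `extract` are all hypothetical, chosen for illustration; no real search engine or assistant is claimed to work this way.

```python
from html.parser import HTMLParser

# Hypothetical markup: the same content written semantically...
SEMANTIC = """
<h1>Bouillabaisse</h1>
<ol>
  <li>Simmer the stock.</li>
  <li>Add the fish.</li>
</ol>
"""

# ...and written so that it only *looks* like a heading and a list.
PRESENTATIONAL = """
<div class="big-bold">Bouillabaisse</div>
<div class="indent">1. Simmer the stock.</div>
<div class="indent">2. Add the fish.</div>
"""


class StepExtractor(HTMLParser):
    """Collects the <h1> title and the <li> items inside an <ol>."""

    def __init__(self):
        super().__init__()
        self.title = None
        self.steps = []
        self._current = None  # tag whose text is currently being captured
        self._in_ol = False

    def handle_starttag(self, tag, attrs):
        if tag == "ol":
            self._in_ol = True
        elif tag == "h1" or (tag == "li" and self._in_ol):
            self._current = tag

    def handle_endtag(self, tag):
        if tag == "ol":
            self._in_ol = False
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        text = data.strip()
        if not text or self._current is None:
            return
        if self._current == "h1":
            self.title = text
        else:
            self.steps.append(text)


def extract(html):
    """Return (title, steps) as found through semantic elements only."""
    parser = StepExtractor()
    parser.feed(html)
    parser.close()
    return parser.title, parser.steps


print(extract(SEMANTIC))
print(extract(PRESENTATIONAL))
```

The semantic version yields a title and an ordered list of steps; the presentational version yields nothing this reader can use, even though both would look much the same in a browser.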

Software agent search and semantic HTML

In late 2016, Gartner predicted that 30 percent of web browsing sessions would be done without a screen by 2020. Even though there’s recent evidence to suggest that the 2020 picture may be more complicated than these broad-strokes projections imply, we’re already seeing the impact that voice search, artificial intelligence, and smart software agents like Alexa and Google Assistant are making on the way information is found and consumed on the web.

    The City of Seattle website’s “Pay My Ticket” page, showing four methods to pay a parking ticket in a simple, all-text layout.

    By communicating clearly in a digital context that now includes aggregation and inference, organizations are more effectively able to speak to their users where users actually are, be it on a website, a search engine results page, or a digital assistant. They’re also able to maintain control over the accuracy of their messages by ensuring that the correct content can be found and communicated across contexts.
    The machine-readable values are shown in the pane on the right.

    The initial result, however, suggests that smart agents may be at least partially susceptible to the same availability heuristic that affects humans, wherein the information that is easiest to recall often seems the most correct.

    The search for the second recommendation, Stacey Donion, MD, provides a very different experience. Like the City of Boston site above, Dr. Donion’s profile on the Kaiser Permanente website is perfectly intelligible to a sighted human reader. However, because its markup is entirely presentational, its content is virtually invisible to software agents.

    As a human reading this page, I easily understand what my options are for paying: I can pay online, in person, by phone, or by mail. If I ask Google Assistant how to pay a parking ticket, however, things get a bit confusing:

    Google search results page for Scott Ruhlman, MD, showing a list of standard links and an info box with an image, a map, ratings, an address, and reviews information.

    Getting started: who and how

    MultiCare Neuroscience Center, you’ll recall, is where Dr. Donlon (the neuroscientist Google thinks I may be looking for, not the orthopedic surgeon I’m actually looking for) practices. Dr. Donlon’s profile page, much like Dr. Ruhlman’s, is semantically structured and marked up with linked data.

    Google search results page for Dr. Donion, showing a list of standard links for Dr. Donion, and a 'Did you mean: Dr Stacy Donlon MD' link at the top. There is a Google info box, as with the previous search results page example. But in this case the box does not display information about the doctor we searched for, Dr. Donion, but rather for 'Kaiser Permanente Orthopedics: Morris Joseph MD.'

    In addition to the indexing function that traditional search engines perform, smart agents and search algorithms are bringing two additional modes of accessing information: aggregation and inference. As a result, design efforts that focus on creating effective pages are no longer sufficient to ensure the integrity or accuracy of content published on the web. Instead, by focusing on providing access to information in a structured, systematic way that is legible to both humans and machines, content publishers can ensure that their content is both accessible and accurate in these new contexts, whether or not they’re producing chatbots or tapping into AI directly. In this article, we’ll look at the forms and impact of structured content, and we’ll close with a set of resources that can help you get started with a structured content approach to information design.
    In this example, Dr. Ruhlman’s profile is marked up with microdata based on the Schema.org vocabulary. Schema.org is a collaborative effort backed by Google, Yahoo, Bing, and Yandex that aims to create a common language for digital resources on the web. This foundation provides the semantic base on which aggregated results are built. The Knowledge Graph info box, for instance, includes Google reviews, which are not part of Dr. Ruhlman’s profile, but which have been aggregated into this overview. The overview also includes an interactive map, made possible because Dr. Ruhlman’s office location is machine-readable.
    In 2012, content strategist Karen McGrane wrote that “you don’t get to decide which platform or device your customers use to access your content: they do.”

    Google Assistant app on iPhone with the results of a “how do I pay a parking ticket in Seattle” query, showing nearly the same results as on the desktop web page referenced above.

    Say, for example, that I want to gather more information about two recommendations I’ve been given for orthopedic surgeons.

    The City of Seattle website’s “Pay My Ticket” page, with the HTML heading elements outlined and labeled for illustration.

    Voice queries and content inference

    Linked data and content aggregation

    Google Structured Data Testing tool, showing the markup for Dr. Ruhlman's profile page on the left half of the screen, and the structured data attributes and values for the structured content on that page on the right half of the screen.
  • Content Everywhere, Sara Wachter-Boettcher

These results are not only aggregated from disparate sources, but are interpreted and remixed to provide an answer to my specific question. Getting directions, placing a phone call, and accessing Dr. Ruhlman’s profile page are all at the tips of my fingers.
If we run Dr. Ruhlman’s Swedish Hospital profile page through Google’s Structured Data Testing Tool, we can see that content about him is structured as small, discrete elements, each of which is marked up with descriptive types and attributes that communicate both the meaning of those attributes’ values and the way they fit together as a whole, all in a machine-readable format.
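As a rough illustration of what “machine-readable” means here, the sketch below reads Schema.org microdata attributes (`itemscope`, `itemtype`, `itemprop`) out of a small HTML fragment using Python’s standard library. The fragment, the placeholder name and phone number, and the helper names `MicrodataReader` and `read_microdata` are assumptions made for illustration; this is not the markup of any real profile page, and it handles only a single, flat item.

```python
from html.parser import HTMLParser

# Hypothetical profile markup with placeholder values.
PROFILE = """
<div itemscope itemtype="https://schema.org/Physician">
  <h1 itemprop="name">Dr. Example Person</h1>
  <span itemprop="telephone">555-0100</span>
  <time itemprop="openingHours" datetime="Tu 08:30-17:00">Tuesday 8:30AM-5:00PM</time>
</div>
"""


class MicrodataReader(HTMLParser):
    """Reads one flat microdata item: its type and its property values."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.properties = {}
        self._prop = None  # property waiting for its text content

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs:
            self.item_type = attrs.get("itemtype")
        prop = attrs.get("itemprop")
        if prop is None:
            return
        if "datetime" in attrs:
            # For <time>, the machine-readable value lives in the attribute.
            self.properties[prop] = attrs["datetime"]
        else:
            self._prop = prop

    def handle_data(self, data):
        text = data.strip()
        if self._prop and text:
            self.properties[self._prop] = text
            self._prop = None


def read_microdata(html):
    """Return (item type, {property: value}) for a single flat item."""
    reader = MicrodataReader()
    reader.feed(html)
    reader.close()
    return reader.item_type, reader.properties


item_type, properties = read_microdata(PROFILE)
print(item_type)
print(properties)
```

Each value arrives already labeled with what it is (a name, a telephone number, opening hours), which is the kind of attribute-and-value listing a structured data testing tool displays.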
In its simplest form, linked data is “a set of best practices for connecting structured data on the web.” Linked data extends the basic capabilities of semantic HTML by describing not just what kind of thing a page element is (“Pay My Ticket” is an <h1>), but also the real-world concept that thing represents: this <h1> represents a “pay action,” which inherits the structural characteristics of “trade actions” (the exchange of goods and services for money) and “actions” (activities carried out by an agent upon an object). Linked data creates a more nuanced description, and it supplies the structural and conceptual information that algorithms need to bring data together.
HTML markup that focuses only on the presentational aspects of a “page” may look perfectly fine to a human reader but be completely illegible to an algorithm. Take, for example, the City of Boston website, redesigned a few years ago in collaboration with top-tier design and development partners. If I want to find information about how to pay a parking ticket, a link from the home page takes me directly to the “How to Pay a Parking Ticket” screen (scrolled to show detail):
A structured content design approach frames content resources (like recipes, articles, product descriptions, how-tos, profiles, etc.) not as pages to be found and read, but as packages composed of small chunks of content data that relate to one another in meaningful ways.
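The inheritance described above can be sketched in a few lines of Python. The `PARENT` table lists only the Schema.org types named in the text (plus `Thing`, the root of the vocabulary), and the `pay_my_ticket` object is a hypothetical JSON-LD description written for illustration; it is not claimed to be the markup any city actually publishes.

```python
import json

# Subset of the Schema.org type hierarchy: each type and its parent.
PARENT = {
    "PayAction": "TradeAction",
    "TradeAction": "Action",
    "Action": "Thing",
}


def ancestors(type_name):
    """Return the type followed by every type it inherits from."""
    chain = [type_name]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


# Hypothetical linked data describing what the heading represents.
pay_my_ticket = {
    "@context": "https://schema.org",
    "@type": "PayAction",
    "name": "Pay My Ticket",
    "recipient": {
        "@type": "GovernmentOrganization",
        "name": "City of Seattle",
    },
}

json_ld = json.dumps(pay_my_ticket, indent=2)

print(ancestors("PayAction"))
print(json_ld)
```

Because a pay action is also a trade action and an action, an algorithm that knows only how to handle generic actions can still make some sense of this item, which is what lets data from different publishers be brought together.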

Although each of these elements would look the same to a sighted human creating this page, the machine reads a difference. While WYSIWYG text entry fields can theoretically support semantic HTML, in practice they fall prey to the idiosyncrasies of even the most well-intentioned content authors. By making meaningful content structure a core element of a site’s content management system, organizations can produce semantically correct HTML for every element. This is also the foundation that makes it possible to capitalize on the rich relationship descriptions afforded by linked data.
In this case, we can see that Google is able to find plenty of links to Dr. Donion in its standard index results, but it isn’t able to “understand” the information about those sources well enough to present an aggregated result. Here, the Knowledge Graph knows Dr. Donion is a Kaiser Permanente physician, but it pulls in the wrong location and the wrong physician’s name in its attempt to build a Knowledge Graph display.

The City of Seattle website’s 'Pay My Ticket' page, with two HTML heading elements outlined and labeled for illustration, and an open inspector panel, where we can see that the headings look the same to viewers but are marked up differently in the code.
Google search results info box for Dr. Ruhlman, showing a photo; a map; ratings; an address; reviews; buttons to ask a question, leave a review, and add a photo; and other people searched for.
A combined HTML code editor and preview window showing markup and results for heading, ordered list, and list item HTML tags.

Although this use of semantic HTML offers distinct advantages over the “page display” styling we saw on the City of Boston’s site, the Seattle page also shows a weakness that is typical of manual approaches to semantic HTML. You’ll notice that, in the Google Assistant results, the “Pay by Phone” option we saw on the web page was not listed. If we look at the markup of this page, we can see that while the three options found by Google Assistant are each wrapped in two elements (an <h2> plus a second tag), “Pay by Phone” is marked up with only an <h2>. This irregularity in semantic structure may be what is causing Google Assistant to omit this option from its results.

The model of building pages and then expecting users to discover and parse those pages to answer questions, though workable in the pre-voice era, is becoming insufficient for effective communication. It precludes organizations from participating in emerging patterns of information seeking and discovery. And, as we saw in the case of searching for information about physicians, it may lead software agents to make inferences based on insufficient or erroneous information, potentially routing customers to competitors who communicate more effectively.
When I ask Google Assistant what time Dr. Donion’s office closes, the result is not only less helpful but actually points me in the wrong direction. Instead of a targeted selection of actions, I’m presented with the hours for MultiCare Neuroscience Center.

Google Assistant app on iPhone with the results of a “what time does Doc Dr Stacey donion office close” query. The results displayed include a card with “8AM–5PM” and the label “MultiCare Neuroscience Center, Monday hours,” as well as links to call the office, search on Google, get directions, or visit a website.

Despite the visual simplicity of the City of Seattle parking page, it more effectively ensures the integrity of its content across contexts because it is composed of semantically structured HTML. “Pay My Ticket” is a level-one heading (<h1>), and each of the options below it is a level-two heading (<h2>), which indicates that they are subordinate to the level-one element.
This statement was intended to help designers, strategists, and businesses prepare for the imminent rise of mobile. It continues to ring true for the era of linked data. With the growing prevalence of voice queries and smart assistants, an organization’s website is less and less likely to be a visitor’s first encounter with rich content. In many cases, such as finding location information, hours, phone numbers, and ratings, this engagement may be a user’s only interaction with an information source.
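The heading hierarchy described above is simple enough for a short program to recover. The following Python sketch builds an outline from <h1> and <h2> elements; the heading text in `PAGE` is assumed from the payment methods mentioned in this article rather than copied from the live page, and `OutlineBuilder` and `build_outline` are hypothetical names.

```python
from html.parser import HTMLParser

# Assumed heading text, for illustration only.
PAGE = """
<h1>Pay My Ticket</h1>
<h2>Pay Online</h2>
<h2>Pay by Mail</h2>
<h2>Pay in Person</h2>
<h2>Pay by Phone</h2>
"""


class OutlineBuilder(HTMLParser):
    """Groups each <h2> under the most recent <h1>."""

    def __init__(self):
        super().__init__()
        self.outline = {}
        self._level = None   # heading tag currently open, if any
        self._parent = None  # text of the most recent <h1>

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2"):
            self._level = tag

    def handle_endtag(self, tag):
        if tag == self._level:
            self._level = None

    def handle_data(self, data):
        text = data.strip()
        if not text or self._level is None:
            return
        if self._level == "h1":
            self._parent = text
            self.outline[text] = []
        elif self._parent is not None:
            self.outline[self._parent].append(text)


def build_outline(html):
    """Return {level-one heading: [level-two headings beneath it]}."""
    builder = OutlineBuilder()
    builder.feed(html)
    builder.close()
    return builder.outline


print(build_outline(PAGE))
```

Nothing about font size or color is involved: the subordinate relationship comes entirely from the element names, which is why it survives when the content is read aloud or excerpted into a results card.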