Understanding Headless CMS architecture from an engineering (rather than a user) perspective

Question

TLDR:

In a Headless CMS (or a Decoupled CMS), the content retrieved by the front end needs to be identifiable somehow. This is where I'm stuck. I can describe my guesses at how platform-agnostic content might be made identifiable (see my attempt below). But I can't find real-world confirmation of tried and tested approaches anywhere, detailing how the decoupled front end can request content from the content repository in a meaningful, identifiable manner.

Where can I find a straightforward description of this core aspect of the mechanics of Headless CMS (or Decoupled CMS) architecture?

Recently I've been intrigued by the term Headless CMS.

There seems to be no shortage of articles and blog posts explaining:

What is a Headless CMS?
What's the difference between a Headless CMS and a Traditional CMS?

but the explanations always seem to be pitched at the reader who is planning to start using a Headless CMS, not at the engineer who wants to try their hand at writing a Headless CMS.

Eventually, after trying repeatedly to read between the lines of the articles I was reading, I grasped the single, fundamental innovation built into the Headless CMS:

in Headless CMS architecture, content and structure are completely separated

A conventional approach in web development is to maintain substantial separation between:

Structure (HTML)
Presentation (CSS)
Behaviour (JS)

In this model, the written and media Content isn't listed as a (fourth) separate concern because it's implicitly interwoven throughout the Structure.

e.g.:

You can take the HTML markup: <button type="button">Launch Jaguar Slideshow</button>
and present it as a big red button with capitalised white text and a drop-shadow using CSS
and enable it to trigger the creation and drop-down of a console containing an animated slideshow using JavaScript

But what you can't conventionally do is separate the textual content:

Launch Jaguar Slideshow

from the markup structure:

<button type="button"> ... </button>

But - if I understand correctly - this is what a Headless CMS enables:

on one page <button type="button"> ... </button> may contain: Launch Jaguar Slideshow
on another page, <button type="button"> ... </button> may contain: Launch Leopard Slideshow
on a third page, <button type="button"> ... </button> may contain: Launch Tiger Slideshow etc.
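
The three cases above can be sketched in a few lines. This is only an illustration of keeping one piece of structure and swapping the content, assuming Python's standard `string.Template`; the names `BUTTON`, `pages` and `render_button` are hypothetical and not taken from any particular CMS.

```python
from html import escape
from string import Template

# One fixed piece of structure; the label is a placeholder filled in later.
BUTTON = Template('<button type="button">$label</button>')

# Hypothetical content store: page name -> button label.
pages = {
    "jaguar": "Launch Jaguar Slideshow",
    "leopard": "Launch Leopard Slideshow",
    "tiger": "Launch Tiger Slideshow",
}

def render_button(page: str) -> str:
    """Combine the shared markup with the (escaped) content for one page."""
    return BUTTON.substitute(label=escape(pages[page]))

for page in pages:
    print(render_button(page))
```

The markup is written once, and the text lives in a separate lookup keyed by page.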

I'm not even sure that I've understood everything correctly up to here - and I've never used a templating language - but, my first (incidental) question:

Does this, essentially, make a Headless CMS the conceptual child of:

a Traditional CMS; and
Templating Languages such as Mustache, Handlebars.js, HAML, Pug, Slim, Nunjucks etc.

with the addition that whereas the templating languages above tend to work exclusively with HTML, the kind of templating engine in a Headless CMS will insert Content into any structure (i.e. not just into HTML in a web document, but also into the XML of an RSS feed, or into a Social Media platform component, or into the UI structure of a Native App etc.)
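
As a rough sketch of "the same content, different structures", the snippet below renders one record both as an HTML fragment and as an RSS item. It says nothing about which component should do the rendering; the `content` record and both function names are hypothetical, and only standard-library escaping helpers are used.

```python
from html import escape as html_escape
from xml.sax.saxutils import escape as xml_escape

# Hypothetical content record, independent of any output format.
content = {"Title": "Apples", "ShortDesc": "Crisp fruit & more"}

def to_html(item: dict) -> str:
    """Render the record as an HTML fragment."""
    return (f"<article><h1>{html_escape(item['Title'])}</h1>"
            f"<p>{html_escape(item['ShortDesc'])}</p></article>")

def to_rss_item(item: dict) -> str:
    """Render the same record as an RSS <item> element."""
    return (f"<item><title>{xml_escape(item['Title'])}</title>"
            f"<description>{xml_escape(item['ShortDesc'])}</description></item>")

print(to_html(content))
print(to_rss_item(content))
```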

and, my second, main question:

How on earth might one approach decoupling Content from Structure?

My best guess is something like the following JSON, where I have tried to express the relationship of the content only to itself (so it remains structurally agnostic and can be queried and return data to slot into any structure).

 {
   "Summary":{
      "Title":"Apples",
      "Created":"[TIMESTAMP HERE]",
      "Last Modified":"[TIMESTAMP HERE]",
      "ShortDesc":"An 8-10 word description of Apples here",
      "LongDesc":"A 20-30 word intro to Apples here."
   },

   "Related":{
       "Parents":[
          "Woodland_Fruit"
       ],
       "Siblings":[
          "Blackberries",
          "Cherries",
          "Pears"
       ],
       "Children":[
          "Granny Smith",
          "Braeburn",
          "Gala",
          "Red Delicious"
       ]
   },

   "Media":{
      "Images":{
         "Hero_1":{
            "Sizes":[

            ],
            "URL":"[URL HERE]",
            "Title":"Title Here",
            "Alt":"Alternative text here",
            "Created":"[TIMESTAMP HERE]",
            "Credits":{
               "Photographer":""
            },
            "Licence":{
               "Type":"",
               "URL":"",
               "Holder":""
            }
         },
         "Primary_1":{
            "Sizes":[

            ],
            "URL":"[URL HERE]",
            "Title":"Title Here",
            "Alt":"Alternative text here",
            "Created":"[TIMESTAMP HERE]",
            "Credits":{
               "Photographer":""
            },
            "Licence":{
               "Type":"",
               "URL":"",
               "Holder":""
            }
         },
         "Associated_1":{
            "etc.":"etc."
         },
         "Associated_2":{
            "etc.":"etc."
         }
      }
   },

   "Editorial":{
      "Primary":{
         "Title":"Hesperides and Beyond",
         "Author":"Ann Onne",
         "Created":"[TIMESTAMP HERE]",
         "Last Modified":"[TIMESTAMP HERE]",
         "Sections":[
            {
               "Paragraphs" : [
                  {
                     "Paragraph": "[SECTION 1, PARAGRAPH 1 HERE]",
                     "Pull_Quotes": [
                         "PULLQUOTE HERE"
                     ]
                  },

                  {
                     "Paragraph": "[SECTION 1, PARAGRAPH 2 HERE]",
                     "Pull_Quotes": [
                        "PULLQUOTE HERE",
                        "PULLQUOTE HERE"
                     ]
                  }
               ]
            },

            {
               "Section_Heading" : "[SECTION HEADING HERE]",

               "Paragraphs" : [
                  {
                     "Paragraph": "[SECTION 2, PARAGRAPH 1 HERE]"
                  }
               ]
            }
         ]
      },

      "Secondary_1":{

         "Sections":[
            {
               "Section_Heading" : "[SECTION 1 HEADING HERE]",

               "Paragraphs" : [
                  {
                     "Paragraph": "[SECTION 1, PARAGRAPH 1 HERE]"
                  }
               ]
            },

            {
               "Section_Heading" : "[SECTION 2 HEADING HERE]",

               "Paragraphs" : [
                  {
                     "Paragraph": "[SECTION 2, PARAGRAPH 1 HERE]"
                  }
               ]
            },

            {
               "Section_Heading" : "[SECTION 3 HEADING HERE]",

               "Paragraphs" : [
                  {
                     "Paragraph": "[SECTION 3, PARAGRAPH 1 HERE]"
                  }
               ]
            }
         ]
      }
   }
}

(Hmmm. Does that begin to look like a pseudo-version of JSON-LD + Schema.org to you? Because it does to me...)

The JSON above describes the topic Apples:

It has 4 sections: Summary, Related, Media, Editorial
It indicates that the topic Apples has a parent topic, as well as sibling and children topics.
The topic has associated media (a hero image, a primary image and 2 more associated images)
The topic also has two Editorial "articles" - one is a proper article, one is supplementary information
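
One way a consumer could identify a piece of content inside a document like this is by its path through the keys. The sketch below is a guess at that mechanism, not a confirmed approach: `APPLES` is a trimmed excerpt of the JSON above, and `resolve` and its dot-separated path syntax are hypothetical.

```python
import json

# A trimmed excerpt of the Apples document above.
APPLES = json.loads("""
{
  "Summary": {"Title": "Apples", "ShortDesc": "An 8-10 word description of Apples here"},
  "Related": {"Children": ["Granny Smith", "Braeburn", "Gala", "Red Delicious"]},
  "Media": {"Images": {"Hero_1": {"Alt": "Alternative text here"}}}
}
""")

def resolve(document, path: str):
    """Follow a dot-separated path; a segment that lands on a list is used as an integer index."""
    node = document
    for key in path.split("."):
        node = node[int(key)] if isinstance(node, list) else node[key]
    return node

print(resolve(APPLES, "Summary.Title"))       # Apples
print(resolve(APPLES, "Related.Children.1"))  # Braeburn
```

A page template, an RSS generator or a native app could each ask for `Summary.Title` without knowing anything about the others' markup.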

So far, so good.

But most of this feels very much like guesswork.

I'd like to confirm that this sort of approach is on the right track.

Is this how I'm supposed to approach separating Content from Structure when building a Headless CMS?

Are you trying to build a headless CMS without knowing what it is? — Jacob Raihle, Apr 27 '20 at 12:36
@JacobRaihle - Yes. I am trying to understand what a **Headless CMS** is from a _Software Engineering_ perspective. Every time I try to look up how it differs from a _Traditional CMS_, I repeatedly find answers like: _"It's a content repository which delivers content through an API rather than delivering a structured UI."_ But I can never find a clear illustration of precisely _how_ **Content** might be separated from **Structure** (if that is what's happening). So my question, after I've spent a lot of time thinking about how this might be approached, is: _"Am I on the right track?"_ — Rounin, Apr 27 '20 at 13:19
Relevant to the question above: [The Overflow: _Is it time to give Drupal another look?_](https://stackoverflow.blog/2020/06/23/is-it-time-to-give-drupal-another-look/) — Rounin, Jun 26 '20 at 11:01
Also relevant to the question above: [Smashing Magazine: Strategies For Headless Projects With Structured Content Management Systems](https://www.smashingmagazine.com/2018/11/structured-content-done-right/) — Rounin, Jun 26 '20 at 11:05

score 1 · Answer 1 · answered Apr 27 '20 at 14:00


Content has structure, markup has structure, a website has structure. These are not necessarily related, but in a traditional CMS they tend to be intertwined. A headless CMS focuses on content only, but that content still has structure. Your JSON is a feasible representation of that content structure, for some types of content. Usually there will be a meta-structure which describes how content can be structured, and users of the CMS will be able to define their own content structures (or content models, content types, etc.) based on that.
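
A minimal sketch of what such a meta-structure could look like: a content type lists the fields an entry must have, and entries are checked against it. The `TOPIC_SUMMARY` type, its field names (borrowed from the question's JSON) and the `validate` function are assumptions for illustration, not how any specific CMS defines content models.

```python
# Hypothetical content type: field name -> expected Python type.
TOPIC_SUMMARY = {"Title": str, "ShortDesc": str, "LongDesc": str}

def validate(entry: dict, content_type: dict) -> list:
    """Return a list of problems; an empty list means the entry conforms."""
    problems = []
    for field, expected in content_type.items():
        if field not in entry:
            problems.append(f"missing field: {field}")
        elif not isinstance(entry[field], expected):
            problems.append(f"{field} should be {expected.__name__}")
    return problems

entry = {"Title": "Apples", "ShortDesc": "Short text", "LongDesc": "Longer text"}
print(validate(entry, TOPIC_SUMMARY))         # []
print(validate({"Title": 1}, TOPIC_SUMMARY))
```

Because every consumer can rely on the same field names, the field name itself is what makes a piece of content identifiable.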

It is not the headless CMS's job to insert content into your website or anything else. It simply provides the content, usually over an API, and doesn't need to know how it is used. Thus, in contrast to a traditional CMS, a headless CMS wouldn't have a templating language.
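
To make "provides the content over an API" concrete, here is a small sketch of a request handler that maps a path to a JSON body and knows nothing about how the result is displayed. The `/topics/<id>` URL scheme, the `REPOSITORY` dictionary and `handle_get` are all hypothetical, and no real HTTP server is started.

```python
import json

# Hypothetical in-memory repository: topic identifier -> content entry.
REPOSITORY = {
    "apples": {"Title": "Apples", "ShortDesc": "An 8-10 word description of Apples here"},
}

def handle_get(path: str):
    """Map a request path such as /topics/apples to a (status, JSON body) pair."""
    parts = path.strip("/").split("/")
    if len(parts) == 2 and parts[0] == "topics" and parts[1] in REPOSITORY:
        return 200, json.dumps(REPOSITORY[parts[1]])
    return 404, json.dumps({"error": "not found"})

status, body = handle_get("/topics/apples")
print(status, json.loads(body)["Title"])
```

The caller identifies content by the entry's identifier in the path and by field name in the response; turning `Title` into a heading, a feed title or a native label is left to the caller.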

Jacob Raihle

Many thanks for this, @JacobRaihle. I'm continuing to read and I now see that there is a term _Decoupled CMS_ (or _Hybrid Headless CMS_) which is perhaps closer to how I initially understood the concept of "Headless" than a pure _Headless CMS_. – Rounin Apr 27 '20 at 14:13
Your reference to a _meta-structure_ is helpful. Initially I thought, if there is a document heading: `<h1>Main Heading</h1>` and we strip out the markup... how can any script interpret `Main Heading` as anything at all? Unless it's labelled as something like `{ "Primary_Heading" : "Main Heading" }`... but then all we've done is replaced `<h1>...</h1>` with `Primary Heading` (?) Perhaps an `<h1>` is an atypical example because there tends to be only one of these per document and it has a very explicit role? – Rounin Apr 27 '20 at 14:13
Otherwise, I can't really get my head around how, if we were to strip out the markup (which implicitly labels the content) and then don't replace the now-removed labels with any other labels... how can the content be understood in any sense at all? – Rounin Apr 27 '20 at 14:15
@Rounin you need some application (website, mobile app, whatever) to present the content to users in an understandable way. All those applications need to agree (together with the headless CMS) on what the content structure is. – Jacob Raihle Apr 27 '20 at 14:28
Re: _"All those applications need to agree (together with the headless CMS) on what the content structure is."_ This is where I find the articles I've read start to get really vague. If an article has a paragraph which is marked up as `<p>[CONTENT HERE]</p>` in a web page, then the page template needs to fetch the `[CONTENT HERE]` from somewhere in the content repository. So that content can't simply be floating in the repository without any label at all (?) [1/2] – Rounin Apr 27 '20 at 19:47
The content needs to be identifiable. This is where I'm stuck. I can guess at what the identification process might be. I can imagine how an identification process _might_ work (see my question, above). But I can't confirm anywhere how a _Headless CMS_ or _Decoupled CMS_ actually identifies the content. [2/2] – Rounin Apr 27 '20 at 19:47
@Rounin you should try using a headless CMS and most of this should become clear very quickly. – Jacob Raihle Apr 28 '20 at 06:08
Heh. Thanks, @JacobRaihle. I'm just trying to find a written explanation confirming the mechanics involved. (Either here at SE Software Engineering, or elsewhere on the web in a blog post or article.) – Rounin Apr 28 '20 at 09:36
