jekyll-third-audience
A Jekyll plugin that generates clean Markdown copies of blog posts alongside their HTML output, making your content accessible to AI agents — the "third audience" of the web.
Here, "clean" means that includes, and anything else matching the regex patterns you provide, are removed.
Will this reduce AI traffic to your site? No. In fact, it will likely increase it, because bots have to read the HTML page to learn that the Markdown version exists. But I like Markdown, and I wish more sites provided Markdown versions for human readers as well as robots.
Inspired by Dries Buytaert's article on the 'third audience'.
What it does
- Generates .md files: After Jekyll builds your site, the plugin writes a clean Markdown version of each post next to its HTML file. The Markdown version has structured front matter and the raw body content with no HTML, no includes, and no Liquid tags.
- Adds <link> tags: The {% third_audience_meta %} Liquid tag outputs a <link rel="alternate" type="text/markdown"> tag in your HTML <head>, telling AI agents where to find the Markdown version.
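To illustrate the consuming side, here is a minimal Ruby sketch of how an agent might discover the Markdown version from a page's HTML. The helper name markdown_alternate_href is hypothetical, and the sketch assumes the tag carries the Markdown URL in an href attribute. The regex-based parsing is a shortcut for illustration, not a description of how any particular agent works.

```ruby
# Hypothetical helper (not part of the plugin): return the href of the first
# <link rel="alternate" type="text/markdown"> tag found in an HTML string,
# or nil if there is none. Assumes double-quoted, lowercase attributes;
# a real client would use an HTML parser instead of regexes.
def markdown_alternate_href(html)
  html.scan(/<link\b[^>]*>/i).each do |tag|
    # Collect name="value" pairs from this one tag into a hash.
    attrs = tag.scan(/([\w-]+)\s*=\s*"([^"]*)"/).to_h
    next unless attrs["rel"] == "alternate" && attrs["type"] == "text/markdown"
    return attrs["href"]
  end
  nil
end
```

An agent that gets a non-nil result could then fetch that URL instead of parsing the HTML page.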
Installation
Add to your Gemfile:
gem "jekyll-third-audience"
Add to your _config.yml:
plugins:
  - jekyll-third-audience
Add to your _includes/head.html (or equivalent):
{% third_audience_meta %}
Run bundle install.
Configuration
Add a third_audience block to _config.yml. All settings are optional — these are the defaults:
third_audience:
  layouts:
    - post
  front_matter:
    - title
    - date
    - author
    - description
    - tags
    - url
  strip_includes: []
  replace_includes: []
layouts
Which layouts get .md versions and <link> tags. Default: ["post"].
front_matter
Which fields to include in the Markdown file's YAML front matter. Default: ["title", "date", "author", "description", "tags", "url"].
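As a rough illustration of what selecting front matter fields could look like, the following Ruby sketch keeps only the configured fields from a post's data and serializes them as YAML. The function name build_front_matter and the plain-hash input are assumptions, and the plugin's actual quoting and formatting may differ from what Ruby's YAML library emits.

```ruby
require "yaml"

# Hypothetical helper (not the plugin's code): keep only the configured
# fields that the post actually defines, in the configured order, and
# wrap them in front matter delimiters.
def build_front_matter(post_data, fields)
  selected = fields.each_with_object({}) do |field, out|
    out[field] = post_data[field] if post_data.key?(field)
  end
  # Hash#to_yaml already emits the opening "---" line; add the closing one.
  selected.to_yaml + "---\n"
end
```

In this sketch, fields missing from the post are skipped rather than written as empty values; that is a design choice of the sketch, not documented plugin behavior.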
strip_includes
List of include filenames to remove entirely from the Markdown output:
third_audience:
  strip_includes:
    - email_subscribe.html
    - contact_form.html
This removes any {% include email_subscribe.html %} or {% include contact_form.html %} tags from the Markdown body.
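The effect of this setting can be sketched in a few lines of Ruby. This is an illustration under assumptions, not the plugin's implementation: the helper name is hypothetical, and treating include tags with parameters or whitespace-control dashes as matches is a guess.

```ruby
# Hypothetical helper: delete every {% include <name> ... %} tag whose
# filename is listed in `filenames`. Other include tags are left alone.
def strip_includes(body, filenames)
  filenames.reduce(body) do |text, name|
    # Regexp.escape keeps the "." in the filename literal; the lazy .*?
    # swallows any include parameters up to the closing %}.
    pattern = /\{%-?\s*include\s+#{Regexp.escape(name)}\b.*?-?%\}/m
    text.gsub(pattern, "")
  end
end
```

Called with the two filenames from the configuration above, the helper removes those tags and leaves any other include untouched.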
replace_includes
List of regex pattern/replacement rules for transforming includes:
third_audience:
  replace_includes:
    - pattern: "\\{%\\s*include\\s+emoji_break\\.html.*?%\\}"
      replacement: "\n---\n"
Example
Given a post at _posts/2024-01-15-my-post.md, after jekyll build you'll find:
- _site/2024/01/15/my-post.html: the normal HTML output
- _site/2024/01/15/my-post.md: clean Markdown with structured front matter
The Markdown file looks like:
---
title: "My Post Title"
date: 2024-01-15
author: Your Name
description: "A description of the post"
tags: ["jekyll", "markdown"]
url: https://example.com/2024/01/15/my-post.html
---
The raw markdown body of your post, with all includes
stripped or replaced per your configuration.
License
MIT