Jekyll AI Related
A Jekyll command plugin to generate list of related posts using AI.
The plugin uses uses OpenAI API to generate embeddings from posts content, and stores them on Supabase vector database. A query on Supabase provides a similarity score between the post and all other posts on the site. The plugin then selects the most similar posts to generate the list of related posts.
To avoid unnecessary API calls or slowing down the website generation, the plugin stores the list of related posts in the site's _data folder. This way you can jekyll build without calling the plugin or making API calls, while still having the list of related posts available as site data. You only need to re-run it when you add or update posts.
Getting started
- Setup Supabase
- Install the plugin
- Configure the plugin
- Run the plugin
Requirements
- OpenAI account and API key
- Supabase account and API key
- Supabase URL of the project where the DB is located
Supabase setup
Use the create.sql script in the sql\supabase folder of this repo to create the required table and functions on your Supabase project. You can run the script in the SQL editor of the Supabase dashboard.
You can always revert the changes by running the drop.sql script.
Installation
- Add the plugin to you Jekyll site's
Gemfilein the:jekyll_pluginsgroup:
group :jekyll_plugins do
gem 'jekyll-ai-related'
end
- Run
bundle install
Install from git
Alternatively, you can get code straight from this repository. Code from main branch should be stable enough but may contain unreleased software with bugs or breaking changes. Unreleased software should be considered of beta quality.
group :jekyll_plugins do
gem 'jekyll-ai-related', git: 'https://github.com/pirafrank/jekyll-ai-related', branch: 'main'
end
Update
bundle update jekyll-ai-relatedConfiguration
Sample configuration to add to your _config.yml:
jekyll-ai-related:
post_unique_field: slug
post_updated_field: date
output_path: related_posts
include_drafts: false
include_future: false
related_posts_limit: 3
related_posts_score_threshold: 0.5
precision: 3
db_table: page_embeddings
db_function: cosine_similarityConfiguration is optional. The plugin will use the default values if not provided.
Configuration options
| Option | Type | Description | Default Value |
|---|---|---|---|
post_unique_field |
string | A field that uniquely identifies a post | slug |
post_updated_field |
string | A field that indicates when the post was last updated | date |
output_path |
string | The subdir inside _data where related posts will be stored |
related_posts |
include_drafts |
boolean | Whether to include drafts in the list of related posts | false |
include_future |
boolean | Whether to include future posts in the list of related posts | false |
related_posts_limit |
integer | The maximum number of related posts to extract per post | 3 |
related_posts_score_threshold |
float | The minimum similarity score to consider a post related | 0.5 |
precision |
integer | The number of decimal digits to round similarity scores to | 3 |
db_table |
string | The name of the table where the embeddings are stored | page_embeddings |
db_function |
string | The name of the function to calculate similarity scores | cosine_similarity |
Note
related_posts_limit and related_posts_score_threshold are used to filter the list of related posts. The plugin will return the top related_posts_limit posts with a similarity score greater than related_posts_score_threshold. A post may have 0 or more than related_posts_limit related posts, but only the top related_posts_limit will be returned, if any.
Advanced configuration
You can customize the unique and updated fields, for example if you have a plugin that generates a unique ID for each post, or if you have a custom field that indicates when the post was last updated.
Check the post-metadata.rb plugin of mine for an example. You can download and place it in your _plugins folder. It will add metadata to each post object and makes it accessible anywhere the post object is.
If you do so, you could use the following configuration:
jekyll-ai-related:
post_unique_field: uid
post_updated_field: most_recent_editUsage
To run, the plugin needs the OpenAI and Supabase API keys, and Supabase URL. The plugin expects them as environment variables.
export OPENAI_API_KEY="your_openai_api_key"
export SUPABASE_URL="your_supabase_url"
export SUPABASE_KEY="your_supabase_api_key"Never write API keys in your code or configuration files you commit.
Then you can run the plugin with:
bundle exec jekyll relatedImportant
Re-run the plugin whenever you add or update posts to update the list of related posts. Only posts with a post_updated_field date greater than the last run will be processed.
Options
You can pass options to the plugin to change its behavior. Options are passed after the command name. Not specifying an option is the same as specifying the default value.
| Option | Description | Default |
|---|---|---|
--debug |
Most verbose. Sets log level to Debug. | Jekyll default |
--quiet |
Do not print Info logs. Sets log level to Error. | Jekyll default |
--future |
Generate embeddings and find related posts also for those with a future date. | false |
--drafts |
Generate embeddings and find related posts also for drafts. | false |
--dry-run |
Do not update the database, do not write related posts to disk. | false |
Example:
bundle exec jekyll related --dry-run --futureEnvironments
Jekyll AI Related can optionally use the JEKYLL_ENV environment variable to determine where is running. This allows you to have separate tables and functions for different environments, for example for development and production.
When JEKYLL_ENV is set, the plugin will append its value to db_table and db_function configuration options, separated by a _ character. If it is not set, the plugin won't append anything. You must set JEKYLL_ENV every time you run the plugin to use environments.
Important
You need to make sure you have the corresponding tables and functions on your Supabase project. Edit the committed SQL script accordingly. They work with no modification when JEKYLL_ENV is not set.
For example, if you export JEKYLL_ENV=production and have the configuration below
jekyll-ai-related:
db_table: cool_table_name
db_function: cool_function_namethe plugin will use:
-
cool_table_name_productionas the database table name, -
cool_function_name_productionas the database function name.
Tip
Explicitly declare JEKYLL_ENV=production to avoid overwriting production data by mistake.
Jekyll integration
The plugin generates a list of related posts for each post in the site. Yet it won't edit your posts or layouts to display the list. It's up to you to integrate it in your site.
You can access the list in the site.data.YOUR_output_path_NAME object. For example, with the default settings you can access it with site.data.related_posts[post.slug].
Here's a snippet to add it to the bottom of your _layouts/post.html layout file. It works with the default configuration.
{% if site.data.related_posts[page.slug] %}
<div class="related-posts">
<h3>Related Posts</h3>
<ul>
{% for post in site.data.related_posts[page.slug] %}
<li>
<a href="{{ post.url }}">{{ post.title }}</a>
<br/>{{ post.date | date: "%B %d, %Y" }}
</li>
{% endfor %}
</ul>
</div>
{% endif %}Note
The plugin writes posts in descending order of similarity score, from most similar to least similar. You can change the order by reversing the loop in the snippet above.
Development
Clone and run bundle install to get started.
Code lives in lib/jekyll directory. lib/jekyll/commands/generator.rb is the entry point of the plugin, as per Jekyll documentation. More details here.
Contributing
Bug reports and pull requests are welcome on GitHub.
Code of Conduct
Everyone interacting in the project's codebase, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.
Guarantee
This plugin is provided as is, without any guarantee. Use at your own risk.
Disclaimer
This plugin uses OpenAI API and Supabase, and you are responsible for complying with their respective terms of service. This plugin is not affiliated with OpenAI or Supabase.
License
The gem is available as open source under the terms of the MIT License.