Automatically generate Word documents and PDFs from Jekyll pages with configurable options, Unicode cleanup, and auto-injected download links.

Jekyll Pandoc Exports Plugin

A Jekyll plugin that automatically generates DOCX and PDF exports of your pages using Pandoc.

Features

  • Generate Word documents (.docx) and PDFs from Jekyll pages, posts, and collections
  • Configurable output directories for organized file management
  • Incremental builds (only regenerate changed files)
  • Automatic dependency validation (Pandoc/LaTeX)
  • Configurable PDF options (margins, paper size, etc.)
  • Automatic Unicode cleanup for LaTeX compatibility
  • Configurable HTML cleanup patterns
  • Auto-injection of download links
  • Flexible image path fixing
  • Print-friendly CSS class support

Installation

1. Install Dependencies

First, install Pandoc and LaTeX (for PDF generation):

# Ubuntu/Debian
sudo apt-get install pandoc texlive-latex-base texlive-fonts-recommended texlive-latex-extra

# macOS
brew install pandoc
brew install --cask mactex

2. Add to Gemfile

Add to your Jekyll site's Gemfile:

gem "jekyll-pandoc-exports"

3. Enable Plugin

Add to your _config.yml:

plugins:
  - jekyll-pandoc-exports

Usage

Basic Usage

Add front matter to any page you want to export:

---
title: My Document
docx: true    # Generate Word document
pdf: true     # Generate PDF
---

Configuration

Add configuration to your _config.yml:

pandoc_exports:
  enabled: true
  output_dir: 'downloads'           # Custom output directory (optional)
  collections: ['pages', 'posts']   # Collections to process
  incremental: true                 # Only regenerate changed files
  debug: true                       # Enable debug logging
  max_file_size: 10000000           # Max file size in bytes (10MB)
  performance_monitoring: true      # Log processing times
  pdf_options:
    variable: 'geometry:margin=0.75in'
  pandoc_options:                   # Additional Pandoc options
    toc: true
  unicode_cleanup: true
  inject_downloads: true
  download_class: 'pandoc-downloads no-print'
  template:
    header: '<div class="export-header">Document Export</div>'
    footer: '<div class="export-footer">Generated by Jekyll</div>'
    css: '.export-header { font-weight: bold; margin-bottom: 20px; }'
  title_cleanup:
    - '<title>.*?</title>'
    - '<h1[^>]*>.*?Site Title.*?</h1>'
  image_path_fixes:
    - pattern: 'src="/assets/images/'
      replacement: 'src="{{site.dest}}/assets/images/'

Configuration Options

  • enabled: Enable/disable the plugin (default: true)
  • output_dir: Custom output directory for exports (default: site root)
  • collections: Array of collections to process (default: ['pages', 'posts'])
  • incremental: Only regenerate files when source changes (default: false)
  • debug: Enable debug logging with detailed output (default: false)
  • max_file_size: Maximum file size in bytes before warning (default: 10MB)
  • performance_monitoring: Log processing times for each file (default: false)
  • pdf_options: Pandoc options for PDF generation (default: 1in margins)
  • pandoc_options: Additional Pandoc command-line options (default: {})
  • unicode_cleanup: Remove problematic Unicode characters for LaTeX (default: true)
  • inject_downloads: Auto-inject download links into pages (default: true)
  • download_class: CSS class for download links (default: 'pandoc-downloads no-print')
  • template: Custom header, footer, and CSS for exports
  • title_cleanup: Array of regex patterns to strip from the HTML before PDF conversion
  • image_path_fixes: Array of path replacements for images
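
One way to picture how the documented defaults interact with _config.yml is a hash of defaults with the site's pandoc_exports section merged over it. The following is a minimal sketch under that assumption; the DEFAULTS constant and the effective_config helper are hypothetical names, not confirmed parts of the plugin, and only defaults stated in the list above are included.

```ruby
# Hypothetical illustration; not the plugin's actual code.
DEFAULTS = {
  'enabled' => true,
  'collections' => ['pages', 'posts'],
  'incremental' => false,
  'debug' => false,
  'performance_monitoring' => false,
  'pandoc_options' => {},
  'unicode_cleanup' => true,
  'inject_downloads' => true,
  'download_class' => 'pandoc-downloads no-print'
}.freeze

# Merge the user's pandoc_exports section (if any) over the defaults.
def effective_config(site_config)
  DEFAULTS.merge(site_config['pandoc_exports'] || {})
end

config = effective_config('pandoc_exports' => { 'incremental' => true })
puts config['incremental']      # value from _config.yml wins
puts config['unicode_cleanup']  # falls back to the documented default
```

Any key left out of _config.yml keeps its documented default in this model.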

Per-Page PDF Options

Override PDF options for specific pages:

---
title: My Document
pdf: true
pdf_options:
  variable: 'geometry:margin=0.5in'
---
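
To illustrate the override, the sketch below merges page-level pdf_options over site-level ones and turns the result into Pandoc command-line arguments. The pdf_args helper is hypothetical, and the mapping from keys to flags (key: true becomes --key, key: value becomes --key value) is an assumption about how such options could be passed, not a description of the plugin's internals.

```ruby
# Hypothetical helper: page-level pdf_options override site-level ones.
def pdf_args(site_options, page_options)
  merged = (site_options || {}).merge(page_options || {})
  merged.flat_map do |key, value|
    # Boolean true becomes a bare flag; anything else becomes "--key value".
    value == true ? ["--#{key}"] : ["--#{key}", value.to_s]
  end
end

site = { 'variable' => 'geometry:margin=0.75in' }
page = { 'variable' => 'geometry:margin=0.5in' }
p pdf_args(site, page)
```

Because both levels use the same variable key here, the page value replaces the site value rather than being added alongside it.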

CSS for Print Hiding

Add to your main CSS to hide download links when printing:

@media print {
  .no-print {
    display: none !important;
  }
}

Plugin Extensibility

Extend functionality with custom hooks:

# Register pre-conversion hook
Jekyll::PandocExports::Hooks.register_pre_conversion do |html_content, config, context|
  # Modify HTML before conversion
  html_content.gsub('old-class', 'new-class')
end

# Register post-conversion hook
Jekyll::PandocExports::Hooks.register_post_conversion do |content, format, config, context|
  # Process converted content
  if format == :pdf
    # Custom PDF post-processing
  end
  content
end

Hook Context

Hooks receive context information:

  • format: Output format (:docx or :pdf)
  • filename: Base filename being processed
  • config: Full plugin configuration

Performance Monitoring

Enable detailed statistics and timing:

pandoc_exports:
  performance_monitoring: true
  debug: true

Outputs conversion statistics:

  • Success/failure rates
  • Processing times per file
  • Format-specific metrics
  • Error summaries
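
As a rough illustration of what collecting such statistics involves, here is a minimal sketch that times each conversion and tracks success and failure. The ConversionStats class is a hypothetical stand-in; the plugin's own bookkeeping and log format may differ.

```ruby
# Hypothetical stats collector; not the plugin's implementation.
class ConversionStats
  attr_reader :results

  def initialize
    @results = []
  end

  # Run the given block, recording elapsed time and whether it raised.
  def record(filename, fmt)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    ok = true
    begin
      yield
    rescue StandardError
      ok = false
    end
    seconds = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    @results << { file: filename, format: fmt, ok: ok, seconds: seconds }
    ok
  end

  # Fraction of recorded conversions that succeeded.
  def success_rate
    return 0.0 if @results.empty?
    @results.count { |r| r[:ok] }.fdiv(@results.size)
  end
end

stats = ConversionStats.new
stats.record('my-page', :docx) { :converted }
stats.record('my-page', :pdf)  { raise 'LaTeX error' }
puts format('success rate: %.0f%%', stats.success_rate * 100)
```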

Generated Files

The plugin names each export after its Markdown source file, changing only the extension:

  • my-page.md → my-page.docx and my-page.pdf
  • Accessible at /my-page.docx and /my-page.pdf when output_dir is left at its default (site root)
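
The naming rule can be sketched as a small helper. The export_paths function is hypothetical, and the assumption that a configured output_dir simply becomes a path prefix is mine rather than something this README states.

```ruby
# Hypothetical helper: derive export URLs from a source filename.
def export_paths(source, output_dir = nil)
  base = File.basename(source, File.extname(source))
  %w[docx pdf].map do |ext|
    # A nil output_dir is dropped, leaving a path at the site root.
    ['', output_dir, "#{base}.#{ext}"].compact.join('/')
  end
end

p export_paths('my-page.md')
p export_paths('my-page.md', 'downloads')
```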

Download Links

When inject_downloads is enabled, the plugin automatically adds download links to pages that generate exports. Links are inserted after the first heading or at the beginning of the body.
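
The placement rule (after the first heading, otherwise at the start of the body) can be sketched with plain string substitution. The inject_downloads method and the markup it emits are hypothetical; the plugin's actual HTML may look different.

```ruby
# Hypothetical sketch of link injection; emitted markup is an assumption.
def inject_downloads(html, links, css_class = 'pandoc-downloads no-print')
  anchors = links.map { |label, href| %(<a href="#{href}">#{label}</a>) }
  block = %(<div class="#{css_class}">#{anchors.join(' | ')}</div>)

  if html =~ %r{</h[1-6]>}i
    # Insert right after the first closing heading tag.
    html.sub(%r{</h[1-6]>}i) { |m| m + block }
  elsif html =~ /<body[^>]*>/i
    # No heading found: insert at the beginning of the body.
    html.sub(/<body[^>]*>/i) { |m| m + block }
  else
    block + html
  end
end

page = '<body><h1>Title</h1><p>Text</p></body>'
puts inject_downloads(page, 'PDF' => '/my-page.pdf')
```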

CLI Usage

The plugin includes a command-line tool for standalone conversions:

# Convert single HTML file to both formats
jekyll-pandoc-exports --file page.html

# Convert to PDF only
jekyll-pandoc-exports --file page.html --format pdf

# Convert with custom output directory
jekyll-pandoc-exports --file page.html --output /tmp/exports

# Process entire Jekyll site
jekyll-pandoc-exports --source . --destination _site

# Enable debug output
jekyll-pandoc-exports --file page.html --debug

CLI Options

  • -f, --file FILE: Convert single HTML file
  • --format FORMAT: Output format (docx, pdf, both)
  • -o, --output DIR: Custom output directory
  • -s, --source DIR: Jekyll source directory
  • -d, --destination DIR: Jekyll destination directory
  • --debug: Enable verbose debug output
  • -h, --help: Show help message
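
For readers who want to script around the tool, the documented flags can be mirrored with Ruby's standard OptionParser. This is an illustrative sketch only: parse_cli is a hypothetical function, not the gem's executable, and the default of both for --format is inferred from the first example above.

```ruby
require 'optparse'

# Hypothetical parser mirroring the documented flags.
def parse_cli(argv)
  options = { format: 'both', debug: false }
  OptionParser.new do |opts|
    opts.on('-f', '--file FILE') { |v| options[:file] = v }
    opts.on('--format FORMAT', %w[docx pdf both]) { |v| options[:format] = v }
    opts.on('-o', '--output DIR') { |v| options[:output] = v }
    opts.on('-s', '--source DIR') { |v| options[:source] = v }
    opts.on('-d', '--destination DIR') { |v| options[:destination] = v }
    opts.on('--debug') { options[:debug] = true }
  end.parse(argv) # -h/--help is provided by OptionParser itself
  options
end

p parse_cli(%w[--file page.html --format pdf])
```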

Publishing to RubyGems

If you want to publish this gem to RubyGems:

1. Build the gem:

gem build jekyll-pandoc-exports.gemspec

2. Test locally (optional):

gem install ./jekyll-pandoc-exports-1.0.0.gem

3. Publish to RubyGems:

# First time setup (if needed)
gem signin

# Publish the gem
gem push jekyll-pandoc-exports-1.0.0.gem

Gem Structure

  • jekyll-pandoc-exports.gemspec - Gem specification with dependencies
  • lib/jekyll-pandoc-exports.rb - Main entry point
  • lib/jekyll-pandoc-exports/version.rb - Version management
  • lib/jekyll-pandoc-exports/generator.rb - Plugin code
  • README.md - Complete documentation
  • LICENSE - MIT license
  • CHANGELOG.md - Version history
  • Gemfile - Development dependencies
  • Rakefile - Build tasks

Troubleshooting

LaTeX Errors

  • Ensure LaTeX packages are installed
  • Enable unicode_cleanup to remove problematic characters
  • Add custom cleanup patterns to title_cleanup
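
To give a sense of what Unicode cleanup means in practice, the sketch below strips zero-width characters and emoji-range symbols, which basic LaTeX fonts typically cannot render. The PROBLEM_CHARS pattern and the unicode_cleanup method are hypothetical; the exact characters the plugin removes are not listed in this README.

```ruby
# Hypothetical cleanup pattern: zero-width characters and emoji-range symbols.
PROBLEM_CHARS = /[\u200B-\u200D\uFEFF]|[\u{1F300}-\u{1FAFF}]/

# Remove every character matched by PROBLEM_CHARS.
def unicode_cleanup(text)
  text.gsub(PROBLEM_CHARS, '')
end

puts unicode_cleanup("Release\u200B notes \u{1F680}done")
```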

Image Issues

  • Configure image_path_fixes for your site's image paths
  • Use absolute paths in the replacement patterns
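
The effect of an image_path_fixes entry can be sketched as a literal string replacement applied to the HTML before conversion. The fix_image_paths helper is hypothetical, and treating pattern as a literal string (rather than a regex) is an assumption.

```ruby
# Hypothetical application of image_path_fixes entries.
def fix_image_paths(html, fixes, site_dest)
  fixes.reduce(html) do |out, fix|
    # Expand the {{site.dest}} placeholder to the absolute destination path.
    replacement = fix['replacement'].gsub('{{site.dest}}', site_dest)
    # Block form avoids any backslash interpretation in the replacement.
    out.gsub(fix['pattern']) { replacement }
  end
end

fixes = [{ 'pattern' => 'src="/assets/images/',
           'replacement' => 'src="{{site.dest}}/assets/images/' }]
puts fix_image_paths('<img src="/assets/images/a.png">', fixes, '/srv/site/_site')
```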

Missing Files

  • Check that Pandoc is installed and accessible
  • Verify file permissions in the _site directory
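
A quick way to check the first point is to look for the executables on PATH. The sketch below uses only the Ruby standard library; executable_on_path? is a hypothetical helper, and checking pdflatex assumes Pandoc's default PDF engine is in use.

```ruby
# Returns true if an executable with the given name is found on PATH.
def executable_on_path?(name)
  ENV.fetch('PATH', '').split(File::PATH_SEPARATOR).any? do |dir|
    path = File.join(dir, name)
    File.file?(path) && File.executable?(path)
  end
end

%w[pandoc pdflatex].each do |tool|
  puts "#{tool}: #{executable_on_path?(tool) ? 'found' : 'missing'}"
end
```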

Documentation

Complete documentation is available at: jekyll-pandoc-exports.readthedocs.io

Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request

See the Development Guide for detailed contribution instructions.

License

This project is licensed under the MIT License - see the LICENSE file for details.