Jekyll Pandoc Exports Plugin
A Jekyll plugin that automatically generates DOCX and PDF exports of your pages using Pandoc.
Features
- Generate Word documents (.docx) and PDFs from Jekyll pages, posts, and collections
- Configurable output directories for organized file management
- Incremental builds (only regenerate changed files)
- Automatic dependency validation (Pandoc/LaTeX)
- Configurable PDF options (margins, paper size, etc.)
- Automatic Unicode cleanup for LaTeX compatibility
- Configurable HTML cleanup patterns
- Auto-injection of download links
- Flexible image path fixing
- Print-friendly CSS class support
Installation
1. Install Dependencies
First, install Pandoc and LaTeX (for PDF generation):
# Ubuntu/Debian
sudo apt-get install pandoc texlive-latex-base texlive-fonts-recommended texlive-latex-extra
# macOS
brew install pandoc
brew install --cask mactex
2. Add to Gemfile
Add to your Jekyll site's Gemfile:
gem "jekyll-pandoc-exports"
3. Enable Plugin
Add to your _config.yml:
plugins:
  - jekyll-pandoc-exports
Usage
Basic Usage
Add front matter to any page you want to export:
---
title: My Document
docx: true # Generate Word document
pdf: true # Generate PDF
---
Configuration
Add configuration to your _config.yml:
pandoc_exports:
  enabled: true
  output_dir: 'downloads' # Custom output directory (optional)
  collections: ['pages', 'posts'] # Collections to process
  incremental: true # Only regenerate changed files
  debug: true # Enable debug logging
  max_file_size: 10000000 # Max file size in bytes (10MB)
  performance_monitoring: true # Log processing times
  pdf_options:
    variable: 'geometry:margin=0.75in'
  pandoc_options: # Additional Pandoc options
    toc: true
  unicode_cleanup: true
  inject_downloads: true
  download_class: 'pandoc-downloads no-print'
  template:
    header: '<div class="export-header">Document Export</div>'
    footer: '<div class="export-footer">Generated by Jekyll</div>'
    css: '.export-header { font-weight: bold; margin-bottom: 20px; }'
  title_cleanup:
    - '<title>.*?</title>'
    - '<h1[^>]*>.*?Site Title.*?</h1>'
  image_path_fixes:
    - pattern: 'src="/assets/images/'
      replacement: 'src="{{site.dest}}/assets/images/'
Configuration Options
- enabled: Enable/disable the plugin (default: true)
- output_dir: Custom output directory for exports (default: site root)
- collections: Array of collections to process (default: ['pages', 'posts'])
- incremental: Only regenerate files when source changes (default: false)
- debug: Enable debug logging with detailed output (default: false)
- max_file_size: Maximum file size in bytes before warning (default: 10MB)
- performance_monitoring: Log processing times for each file (default: false)
- pdf_options: Pandoc options for PDF generation (default: 1in margins)
- pandoc_options: Additional Pandoc command-line options (default: {})
- unicode_cleanup: Remove problematic Unicode characters for LaTeX (default: true)
- inject_downloads: Auto-inject download links into pages (default: true)
- download_class: CSS class for download links (default: 'pandoc-downloads no-print')
- template: Custom header, footer, and CSS for exports
- title_cleanup: Array of regex patterns to remove from PDF HTML
- image_path_fixes: Array of path replacements for images
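One way to picture how these defaults interact with your own settings is a hash of defaults with the pandoc_exports block merged over it. The sketch below is illustrative only: DEFAULTS and effective_config are hypothetical names, the empty output_dir standing for the site root is an assumption, and the plugin's real merge logic may differ.

```ruby
# Hypothetical defaults, transcribed from the option list above.
DEFAULTS = {
  'enabled' => true,
  'output_dir' => '', # assumption: empty string stands for the site root
  'collections' => %w[pages posts],
  'incremental' => false,
  'debug' => false,
  'max_file_size' => 10_000_000,
  'performance_monitoring' => false,
  'pdf_options' => { 'variable' => 'geometry:margin=1in' },
  'pandoc_options' => {},
  'unicode_cleanup' => true,
  'inject_downloads' => true,
  'download_class' => 'pandoc-downloads no-print'
}.freeze

# Merge the user's pandoc_exports block over the defaults.
# Nested hashes (such as pdf_options) are merged one level deep.
def effective_config(site_config)
  user = site_config['pandoc_exports'] || {}
  DEFAULTS.merge(user) do |_key, default, override|
    default.is_a?(Hash) && override.is_a?(Hash) ? default.merge(override) : override
  end
end

config = effective_config('pandoc_exports' => { 'incremental' => true })
puts config['incremental']
puts config['download_class']
```

Settings you leave out keep their default, so a site that only sets incremental still gets download links injected.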
Per-Page PDF Options
Override PDF options for specific pages:
---
title: My Document
pdf: true
pdf_options:
  variable: 'geometry:margin=0.5in'
---
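As a rough sketch of how a per-page override could take effect, the example below merges a page's pdf_options over the site-wide ones and turns the result into Pandoc command-line arguments. The helper names pdf_options_for and pandoc_flags are hypothetical, and the mapping from option keys to flags is an assumption about the plugin; --variable and --toc themselves are standard Pandoc options.

```ruby
# Page-level pdf_options win over the site-level ones (assumed behavior).
def pdf_options_for(page_data, config)
  site_level = config['pdf_options'] || {}
  page_level = page_data['pdf_options'] || {}
  site_level.merge(page_level)
end

# Turn an options hash into Pandoc arguments:
# true becomes a bare flag, false/nil is skipped, anything else is passed as a value.
def pandoc_flags(options)
  options.flat_map do |name, value|
    case value
    when true then ["--#{name}"]
    when false, nil then []
    else ["--#{name}", value.to_s]
    end
  end
end

site_config = { 'pdf_options' => { 'variable' => 'geometry:margin=0.75in' } }
page_data = { 'pdf' => true, 'pdf_options' => { 'variable' => 'geometry:margin=0.5in' } }

p pandoc_flags(pdf_options_for(page_data, site_config))
```

Passing the flag and its value as separate array elements avoids shell quoting problems if the arguments are later handed to a process-spawning call.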
CSS for Print Hiding
Add to your main CSS to hide download links when printing:
@media print {
  .no-print {
    display: none !important;
  }
}
Plugin Extensibility
Extend functionality with custom hooks:
# Register pre-conversion hook
Jekyll::PandocExports::Hooks.register_pre_conversion do |html_content, config, context|
  # Modify HTML before conversion
  html_content.gsub('old-class', 'new-class')
end
# Register post-conversion hook
Jekyll::PandocExports::Hooks.register_post_conversion do |content, format, config, context|
  # Process converted content
  if format == :pdf
    # Custom PDF post-processing
  end
  content
end
Hook Context
Hooks receive context information:
- format: Output format (:docx or :pdf)
- filename: Base filename being processed
- config: Full plugin configuration
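A slightly fuller pre-conversion hook might strip site navigation from the HTML before it reaches Pandoc. This is a sketch under two assumptions: that the hook's return value replaces the HTML (as the example above suggests), and that your layout wraps navigation in nav elements. STRIP_NAV is a hypothetical name. The registration is guarded so the snippet also runs outside a Jekyll build.

```ruby
# Hypothetical hook body: remove every <nav>...</nav> block from the page HTML.
# The config and context arguments are accepted but unused here.
STRIP_NAV = lambda do |html_content, _config, _context|
  html_content.gsub(%r{<nav\b.*?</nav>}mi, '')
end

# Register only when the plugin is loaded (for example from a file in _plugins/).
if defined?(Jekyll::PandocExports::Hooks)
  Jekyll::PandocExports::Hooks.register_pre_conversion(&STRIP_NAV)
end

# The lambda can be exercised on its own, which makes hooks easy to unit test.
puts STRIP_NAV.call('<nav><a href="/">Home</a></nav><h1>Title</h1>', {}, {})
```

Keeping the hook body in a named lambda lets you test the transformation without running a full site build.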
Performance Monitoring
Enable detailed statistics and timing:
pandoc_exports:
  performance_monitoring: true
  debug: true
With these settings, the plugin logs conversion statistics:
- Success/failure rates
- Processing times per file
- Format-specific metrics
- Error summaries
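To make these statistics concrete, here is a minimal sketch of how per-file timings, success counts, and error summaries could be collected. ConversionStats is a hypothetical class written for illustration; it is not the plugin's implementation, and the plugin's log format is not shown here.

```ruby
# Hypothetical collector for conversion timings and outcomes.
class ConversionStats
  def initialize
    @records = []
  end

  # Time the given block and record whether it raised an error.
  def measure(filename, format)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    ok = true
    error = nil
    begin
      yield
    rescue StandardError => e
      ok = false
      error = e.message
    end
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    @records << { file: filename, format: format, ok: ok, seconds: elapsed, error: error }
    ok
  end

  # Summarize success/failure counts, total time per format, and error messages.
  def summary
    total = @records.size
    succeeded = @records.count { |r| r[:ok] }
    {
      total: total,
      succeeded: succeeded,
      failed: total - succeeded,
      seconds_by_format: @records.group_by { |r| r[:format] }
                                 .transform_values { |rs| rs.sum { |r| r[:seconds] } },
      errors: @records.reject { |r| r[:ok] }.map { |r| "#{r[:file]}: #{r[:error]}" }
    }
  end
end

stats = ConversionStats.new
stats.measure('my-page', :docx) { sleep 0.01 }
stats.measure('my-page', :pdf) { raise 'LaTeX not found' }
p stats.summary
```

A monotonic clock is used so that timings are not distorted by system clock adjustments during a long build.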
Generated Files
The plugin generates files with the same base name as your Markdown file:
- my-page.md → my-page.docx and my-page.pdf
- Accessible at /my-page.docx and /my-page.pdf
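The naming rule can be sketched as a small helper. export_urls is a hypothetical name, and the behavior when output_dir is set (the directory becomes a URL prefix) is an assumption, not something this README confirms.

```ruby
# Derive export URLs from a source file name.
# Assumption: a non-empty output_dir is prepended as a path segment.
def export_urls(source_path, formats, output_dir = '')
  base = File.basename(source_path, '.*') # drop the directory and the extension
  formats.map do |format|
    parts = [output_dir.to_s, "#{base}.#{format}"].reject(&:empty?)
    "/#{parts.join('/')}"
  end
end

p export_urls('my-page.md', %w[docx pdf])
p export_urls('my-page.md', %w[pdf], 'downloads')
```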
Download Links
When inject_downloads is enabled, the plugin automatically adds download links to pages that generate exports. Links are inserted after the first heading or at the beginning of the body.
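The insertion rule described above can be sketched with two regular expressions. This is an illustration rather than the plugin's code: download_links and inject_download_links are hypothetical helpers, and the markup of the generated links is an assumption.

```ruby
# Build a simple block of links wrapped in the configured CSS class (markup is assumed).
def download_links(urls, css_class)
  items = urls.map { |label, url| %(<a href="#{url}">#{label}</a>) }
  %(<div class="#{css_class}">#{items.join(' | ')}</div>)
end

# Insert the links after the first heading; fall back to the start of <body>,
# and finally to the start of the document.
def inject_download_links(html, links_html)
  heading = %r{<h[1-6]\b[^>]*>.*?</h[1-6]>}mi
  body_open = /<body\b[^>]*>/i
  if html =~ heading
    html.sub(heading) { |match| "#{match}\n#{links_html}" }
  elsif html =~ body_open
    html.sub(body_open) { |match| "#{match}\n#{links_html}" }
  else
    "#{links_html}\n#{html}"
  end
end

links = download_links({ 'Word' => '/my-page.docx', 'PDF' => '/my-page.pdf' },
                       'pandoc-downloads no-print')
puts inject_download_links('<body><h1>My Document</h1><p>Text</p></body>', links)
```

The block form of sub is used so that backslashes in the inserted HTML are never interpreted as backreferences.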
CLI Usage
The plugin includes a command-line tool for standalone conversions:
# Convert single HTML file to both formats
jekyll-pandoc-exports --file page.html
# Convert to PDF only
jekyll-pandoc-exports --file page.html --format pdf
# Convert with custom output directory
jekyll-pandoc-exports --file page.html --output /tmp/exports
# Process entire Jekyll site
jekyll-pandoc-exports --source . --destination _site
# Enable debug output
jekyll-pandoc-exports --file page.html --debug
CLI Options
- -f, --file FILE: Convert single HTML file
- --format FORMAT: Output format (docx, pdf, both)
- -o, --output DIR: Custom output directory
- -s, --source DIR: Jekyll source directory
- -d, --destination DIR: Jekyll destination directory
- --debug: Enable verbose debug output
- -h, --help: Show help message
Publishing to RubyGems
If you want to publish this gem to RubyGems:
1. Build the gem:
gem build jekyll-pandoc-exports.gemspec
2. Test locally (optional):
gem install ./jekyll-pandoc-exports-1.0.0.gem
3. Publish to RubyGems:
# First time setup (if needed)
gem signin
# Publish the gem
gem push jekyll-pandoc-exports-1.0.0.gem
Gem Structure
- jekyll-pandoc-exports.gemspec - Gem specification with dependencies
- lib/jekyll-pandoc-exports.rb - Main entry point
- lib/jekyll-pandoc-exports/version.rb - Version management
- lib/jekyll-pandoc-exports/generator.rb - Plugin code
- README.md - Complete documentation
- LICENSE - MIT license
- CHANGELOG.md - Version history
- Gemfile - Development dependencies
- Rakefile - Build tasks
Troubleshooting
LaTeX Errors
- Ensure LaTeX packages are installed
- Enable unicode_cleanup to remove problematic characters
- Add custom cleanup patterns to title_cleanup
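If a LaTeX error persists, it can help to check which characters are tripping it up. The sketch below shows the general kind of cleanup that unicode_cleanup refers to. The character ranges and the name strip_latex_unfriendly are assumptions chosen for illustration; the plugin's actual list may differ.

```ruby
# Hypothetical character classes; the plugin's real cleanup list is not documented here.
INVISIBLE = /[\u200B-\u200D\uFEFF]/ # zero-width characters and the byte order mark
PICTOGRAPHS = /[\u{2600}-\u{27BF}\u{1F300}-\u{1FAFF}]/ # misc symbols, dingbats, emoji

# Remove characters that default LaTeX fonts commonly cannot render.
def strip_latex_unfriendly(text)
  text.gsub("\u00A0", ' ') # non-breaking space becomes a plain space
      .gsub(INVISIBLE, '')
      .gsub(PICTOGRAPHS, '')
end

puts strip_latex_unfriendly("Release\u200B notes \u{1F680}")
```

Accented letters and other ordinary non-ASCII text are left alone, since only the listed ranges are touched.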
Image Issues
- Configure image_path_fixes for your site's image paths
- Use absolute paths in the replacement patterns
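To see what an image_path_fixes entry does, the following sketch applies the pattern/replacement pair from the configuration example to a piece of HTML. It assumes that each pattern is matched as a literal string and that {{site.dest}} expands to the site's destination directory; the plugin may instead treat patterns as regular expressions. fix_image_paths is a hypothetical name.

```ruby
# Apply each pattern/replacement pair in order.
# Assumptions: patterns are literal strings; {{site.dest}} expands to the build directory.
def fix_image_paths(html, fixes, site_dest)
  fixes.reduce(html) do |result, fix|
    replacement = fix['replacement'].gsub('{{site.dest}}') { site_dest }
    result.gsub(fix['pattern']) { replacement }
  end
end

fixes = [
  { 'pattern' => 'src="/assets/images/',
    'replacement' => 'src="{{site.dest}}/assets/images/' }
]

puts fix_image_paths('<img src="/assets/images/chart.png">', fixes, '/srv/blog/_site')
```

Rewriting site-relative URLs to absolute file paths matters because Pandoc resolves images from the file system rather than through your web server.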
Missing Files
- Check that Pandoc is installed and accessible
- Verify file permissions in the _site directory
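A quick way to confirm that Pandoc is reachable from the environment Jekyll runs in is to search PATH for the executable. The helper below is a generic sketch, not the plugin's own dependency check, and executable_on_path? is a hypothetical name.

```ruby
# Return true when an executable with the given name exists in one of the PATH directories.
def executable_on_path?(name, path = ENV.fetch('PATH', ''))
  # On Windows, PATHEXT lists extensions such as .EXE; elsewhere only the bare name is tried.
  extensions = ENV.fetch('PATHEXT', '').split(';') << ''
  path.split(File::PATH_SEPARATOR).reject(&:empty?).any? do |dir|
    extensions.any? do |ext|
      candidate = File.join(dir, "#{name}#{ext}")
      File.file?(candidate) && File.executable?(candidate)
    end
  end
end

%w[pandoc pdflatex].each do |tool|
  status = executable_on_path?(tool) ? 'found' : 'missing'
  puts "#{tool}: #{status}"
end
```

Run it with the same user and shell that run the Jekyll build, since PATH often differs between an interactive terminal and a CI job.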
Documentation
Complete documentation is available at: jekyll-pandoc-exports.readthedocs.io
- Installation Guide
- Quick Start Tutorial
- Configuration Reference
- Hooks System
- CLI Usage
- Testing Documentation
- Release Process
Contributing
- Fork the repository
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create a new Pull Request
See the Development Guide for detailed contribution instructions.
License
This project is licensed under the MIT License - see the LICENSE file for details.