QpdfRuby

Patch & polish PDFs so that PAC 2024 finally turns green.

QpdfRuby is a very small Ruby wrapper around the battle‑tested QPDF >= 12 C++ library. Right now the library focuses on only three specialised tasks that are needed when PDFs are printed from Chromium‑based browsers and subsequently audited with the PAC 2024 accessibility checker:

  1. Export the structure tree as XML – handy for debugging.
  2. Mark vector path objects as /Artifact so that decorative lines, boxes, etc. are ignored by assistive technologies.
  3. Add missing /BBox entries to every /Figure element (derived from the page’s graphic operators) so that screen readers know the physical extent of each image.

Together these tweaks eliminate the most common complaints PAC 2024 has about browser‑generated PDFs.
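
To make task 2 more concrete, here is a minimal sketch of what "marking a path as /Artifact" means at the content-stream level: the path operators are wrapped in a /Artifact BMC … EMC marked-content sequence. This is only an illustration on an array of operator lines. The helper name wrap_paths_as_artifacts is hypothetical, the gem's real implementation lives in C++ on top of QPDF and may work differently, and the sketch ignores paths that already sit inside marked content.

```ruby
# Hypothetical illustration, not the gem's implementation.
# Operators that begin a path and operators that paint it
# (the same set the feature table lists: re … S/s/f/F/B/b).
PATH_START = %w[m re].freeze
PATH_PAINT = %w[S s f F B b].freeze

# Takes content-stream lines (one operator per line) and returns a new
# array in which every path is wrapped in "/Artifact BMC" ... "EMC".
def wrap_paths_as_artifacts(lines)
  out = []
  in_path = false

  lines.each do |line|
    op = line.split.last # the operator is the last token on the line

    if !in_path && PATH_START.include?(op)
      out << "/Artifact BMC"
      in_path = true
    end

    out << line

    if in_path && PATH_PAINT.include?(op)
      out << "EMC"
      in_path = false
    end
  end

  out
end

stream = ["q", "10 10 100 50 re", "S", "Q"]
puts wrap_paths_as_artifacts(stream)
```

For the sample stream, the rectangle and its stroke end up between /Artifact BMC and EMC, while q and Q stay outside the marked sequence.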


Features in Detail

Feature                                        Ruby API
Dump structure tree as XML                     doc.show_structure
Mark path objects (re … S/s/f/F/B/b)           doc.mark_paths_as_artifacts
Ensure /Figure elements have a layout BBox¹    doc.ensure_bbox

¹Internally the gem parses each page’s content stream, maps image /MCIDs to their transformation matrix, computes the bounding box (courtesy of a little linear algebra) and finally writes the result into the structure tree.
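
The "little linear algebra" in the footnote can be sketched as follows. In PDF, an image is painted into the unit square, which the current transformation matrix [a b c d e f] maps onto the page, so the bounding box is the min/max over the four transformed corners. The helper name bbox_from_ctm is hypothetical and not part of the gem's Ruby API. The gem does this work internally in C++ and may handle more cases, such as nested transforms.

```ruby
# Hypothetical helper, not part of the qpdf_ruby API.
# Maps the unit square through the matrix [a b c d e f] and returns
# [llx, lly, urx, ury], the layout used by a PDF /BBox entry.
def bbox_from_ctm(a, b, c, d, e, f)
  corners = [[0, 0], [1, 0], [0, 1], [1, 1]].map do |x, y|
    # PDF transform: x' = a*x + c*y + e, y' = b*x + d*y + f
    [a * x + c * y + e, b * x + d * y + f]
  end

  xs, ys = corners.transpose
  [xs.min, ys.min, xs.max, ys.max]
end

# A 200 x 100 image placed at (50, 700), without rotation:
p bbox_from_ctm(200, 0, 0, 100, 50, 700) # => [50, 700, 250, 800]
```

Taking the min/max over all four corners, rather than only the origin and the opposite corner, keeps the box correct for rotated or mirrored images.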


Installation

Requirements

  • Ruby >= 3.1
  • QPDF >= 12.0.0 (headers & libs)

macOS

brew install qpdf
bundle config set --local build.qpdf_ruby "--with-qpdf-dir=$(brew --prefix qpdf)"

Debian/Ubuntu

# on Debian 11/Ubuntu 20.04 you may need newer packages from testing
sudo apt-get update && sudo apt-get install -y libqpdf-dev qpdf

If apt cannot provide QPDF ≥ 12 you can compile it yourself or pull the package from testing/unstable – see the Dockerfile for a working apt preferences snippet.
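
Before building the gem it can help to check whether the installed QPDF meets the >= 12.0.0 requirement. The following is a minimal sketch that compares a version string using Gem::Version from Ruby's standard library. The helper name qpdf_new_enough? is hypothetical, and the sketch assumes that the output of qpdf --version contains a dotted version number such as "qpdf version 12.0.0".

```ruby
require "rubygems"

# Minimum QPDF version named in the requirements list.
MIN_QPDF = Gem::Version.new("12.0.0")

# Hypothetical helper: extracts the first dotted version number from the
# given text and compares it against MIN_QPDF. Returns false when no
# version number can be found.
def qpdf_new_enough?(version_output)
  match = version_output[/\d+(?:\.\d+)+/]
  return false unless match

  Gem::Version.new(match) >= MIN_QPDF
end

# Usage (assumes the qpdf binary is on the PATH):
#   qpdf_new_enough?(`qpdf --version`)
```

Gem::Version is used instead of a plain string comparison so that, for example, 12.0.0 sorts after 9.1.0.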

Add the gem

bundle add qpdf_ruby
# …or without bundler:
# gem install qpdf_ruby -- --with-qpdf-include=/usr/local/include/qpdf --with-qpdf-lib=/usr/local/lib

Quick Start

require "qpdf_ruby"

pdf = QpdfRuby::Document.new("input.pdf")

# 1. tag decorative paths
pdf.mark_paths_as_artifacts

# 2. add BBox to every <Figure>
pdf.ensure_bbox

# 3. introspect structure tree (optional)
File.write("structure.xml", pdf.show_structure)

# 4. save 🎉
pdf.write("fixed.pdf")

Run PAC 2024 on fixed.pdf – it should report far fewer (or zero!) errors compared to the original browser output.
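
If several files need the same treatment, the Quick Start steps can be wrapped in a small helper. This is only a sketch built from the four calls shown above (Document.new, mark_paths_as_artifacts, ensure_bbox, write). The name fix_pdf and its document_class keyword are hypothetical. The keyword exists only so that a stand-in class can be passed in tests without the native extension being installed.

```ruby
# Hypothetical convenience wrapper around the Quick Start steps.
# document_class defaults to QpdfRuby::Document and is resolved lazily,
# so the gem is only loaded when no stand-in class is supplied.
def fix_pdf(input_path, output_path, document_class: nil)
  document_class ||= begin
    require "qpdf_ruby"
    QpdfRuby::Document
  end

  pdf = document_class.new(input_path)
  pdf.mark_paths_as_artifacts # tag decorative paths
  pdf.ensure_bbox             # add BBox to every <Figure>
  pdf.write(output_path)
  output_path
end

# Usage (requires the gem to be installed):
#   Dir.glob("exports/*.pdf").each do |path|
#     fix_pdf(path, path.sub(/\.pdf\z/, "-fixed.pdf"))
#   end
```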


Development

git clone https://github.com/dieter-medium/qpdf_ruby.git
cd qpdf_ruby
bin/setup        # install gem + test deps
autotest         # guard & RSpec
  • To push a new gem, bump version.rb, then run bundle exec rake release.

Testing with local QPDF builds

If you tinker with QPDF itself, point Bundler to your custom prefix:

bundle config set --local build.qpdf_ruby "--with-qpdf-include=$HOME/opt/qpdf/include --with-qpdf-lib=$HOME/opt/qpdf/lib"

Roadmap

TBD


Contributing

Bug reports & pull requests are welcome at https://github.com/dieter-medium/qpdf_ruby.

Code Style

  • C++ 17, clang‑format enforced
  • Ruby 3.2, rubocop default rules

License

MIT – see LICENSE.txt for full text.