Takes a folder structure of PDF files and generates thumbnail images for all pages of all pdfs.

pdf_thumbnailer

pdf_thumbnailer converts a folder structure containing .pdf files into a folder structure of .png files. Each .png represents a single page of the pdf and can be scaled to one or more pixel widths/heights.

Requirements and Dependencies

Developed and tested with Ruby 2.3, but it should work with any version >= 1.9. Page images are created by the pdftoppm utility, which must be installed separately and is available for *NIX (Linux/OS X) operating systems.
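
Because the gem relies on an external pdftoppm binary, it can help to confirm that the binary is on the PATH before generating thumbnails. The following is a minimal sketch of such a check; find_executable is a hypothetical helper written for this illustration and is not part of pdf_thumbnailer. On many systems pdftoppm is provided by the Poppler utilities package.

```ruby
# Hypothetical helper (not part of pdf_thumbnailer): returns the full path of
# an executable found in one of the PATH directories, or nil if none is found.
def find_executable(name, path_env = ENV.fetch('PATH', ''))
  path_env.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

if (pdftoppm = find_executable('pdftoppm'))
  puts "pdftoppm found at #{pdftoppm}"
else
  warn 'pdftoppm not found; install it before generating thumbnails'
end
```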

Installation

Add to your Gemfile:

gem 'pdf_thumbnailer', '~> 1.0'

Usage

First, configure PdfThumbs:

PdfThumbs.configure(
  pdf_dir: '/some/path/with/pdf/files',
  img_dir: '/where/to/save/page/images', # required
  thumb_sizes: [1000, 500, 100] # max pixel lengths for the long side (height or width) of the page images
)
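
The thumb_sizes comment above describes each value as a maximum length for the long side of a page image. As a rough illustration of that arithmetic, the sketch below scales a page so its longer side equals the given size while preserving the aspect ratio. fit_long_side is a hypothetical helper and the 1700x2200 page size is an assumed example; the actual scaling is delegated to pdftoppm, so its rounding and its handling of pages smaller than the target may differ.

```ruby
# Hypothetical helper (not part of pdf_thumbnailer): scale width and height so
# that the longer side equals max_len, keeping the aspect ratio.
def fit_long_side(width, height, max_len)
  scale = max_len.to_f / [width, height].max
  [(width * scale).round, (height * scale).round]
end

# Assumed example page of 1700x2200 pixels, scaled to each configured size.
[1000, 500, 100].each do |size|
  w, h = fit_long_side(1700, 2200, size)
  puts "#{size}: #{w}x#{h}"
end
```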

To generate the page images, execute .thumbnail!:

PdfThumbs.thumbnail! do |thumb_dir|
  puts "Page thumbnails are located in: #{thumb_dir}"
end

thumbnail! optionally accepts a code block with one argument. This argument (thumb_dir in the above example) is a string containing the path of the newly created directory that holds the page images for a particular pdf. The folder structure of img_dir will be identical to that of pdf_dir, except that each .pdf file is replaced by a directory of the same name containing .png images.
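
The mirrored folder structure described above can be sketched as a simple path mapping. thumb_dir_for is a hypothetical helper, not the gem's implementation, and the paths are made up for illustration. It also assumes that "a directory of the same name" keeps the full file name, including the .pdf extension; the gem may name the directory differently.

```ruby
require 'pathname'

# Hypothetical helper (not part of pdf_thumbnailer): for a PDF located under
# pdf_dir, return the directory under img_dir that would hold its page images.
# Assumption: the directory keeps the PDF's full file name.
def thumb_dir_for(pdf_dir, img_dir, pdf_path)
  relative = Pathname.new(pdf_path).relative_path_from(Pathname.new(pdf_dir))
  Pathname.new(img_dir).join(relative).to_s
end

puts thumb_dir_for('/docs/pdfs', '/docs/images', '/docs/pdfs/reports/2016/Q1.pdf')
```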

Alternatively, a single pdf file can have its pages converted to images using the thumbnail_single! method, which takes the root pdf directory and the file's path relative to that directory. This method returns the number of pages in the pdf that were converted to images:

PdfThumbs.configure img_dir: '/where/to/save/page/images', thumb_sizes: 500

num_pages = PdfThumbs.thumbnail_single! '/pdf/root/dir', 'relative/path/to/My File.pdf'

puts "Number of pages extracted = #{num_pages}"

License

MIT-LICENSE.