Project

pdftohtmlr

0.02
No commit activity in last 3 years
No release in over 3 years
Uses command-line pdftohtml tools to convert PDF files to HTML.
 Dependencies

Runtime

>= 1.3.3
 Project Readme

pdftohtmlr

A wrapper around the command-line tool pdftohtml, which converts PDF to HTML, go figure.
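To make the "wrapper" idea concrete, here is a minimal sketch of how a Ruby wrapper around a command-line tool can work in general. It is not the gem's actual implementation: the helper names pdftohtml_argv and run_command are hypothetical, and the flags -stdout, -noframes and -upw are assumptions based on the poppler build of pdftohtml, so check them against your installed version.

```ruby
require 'open3'

# Hypothetical helper (not part of pdftohtmlr): build the argument list for a
# pdftohtml call. The flags are assumptions based on poppler's pdftohtml.
def pdftohtml_argv(pdf_path, user_pwd: nil)
  argv = ['pdftohtml', '-stdout', '-noframes']
  argv += ['-upw', user_pwd] if user_pwd
  argv << pdf_path
end

# Hypothetical helper: run a command without going through a shell and
# return its standard output, raising if the command exits with an error.
def run_command(argv)
  stdout, stderr, status = Open3.capture3(*argv)
  raise "#{argv.first} failed: #{stderr.strip}" unless status.success?
  stdout
end

# Convert only when this file is run directly with a PDF path as argument.
if __FILE__ == $PROGRAM_NAME && ARGV[0]
  puts run_command(pdftohtml_argv(ARGV[0]))
end
```

Passing the arguments as an array rather than one interpolated string means file names with spaces or quotes never reach a shell.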

This gem was inspired by the MiniMagick gem, which does the same thing for ImageMagick (thanks Corey).

Requirements

Just pdftohtml and Ruby (1.8.6+ as far as I know).

On Mac:

brew install pdftohtml

On Ubuntu:

pdftohtml ships in the ‘poppler-utils’ package, which should be installed by default. If it is missing:

sudo apt-get install poppler-utils
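Since the gem depends on an external binary, it can help to confirm that pdftohtml is reachable before converting anything. The following is a small sketch of such a check; find_executable_on_path is a hypothetical helper written for this example, not something pdftohtmlr provides.

```ruby
# Hypothetical helper (not part of pdftohtmlr): return the full path of an
# executable found in the given PATH-style string, or nil if it is absent.
def find_executable_on_path(name, path_env = ENV.fetch('PATH', ''))
  path_env.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

if (path = find_executable_on_path('pdftohtml'))
  puts "pdftohtml found at #{path}"
else
  puts 'pdftohtml not found; install it before using pdftohtmlr'
end
```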

Install

http://gemcutter.org/gems/pdftohtmlr

gem install pdftohtmlr

Using

gist examples

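require 'pdftohtmlr'
require 'nokogiri'
include PDFToHTMLR

# Replace with the path to your source PDF
file = PdfFilePath.new('/path/to/source.pdf')
# Convert the PDF to HTML, returned as a string
string = file.convert
# Convert the PDF to a parsed document (Nokogiri is required above)
doc = file.convert_to_document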

See included test cases for more usage examples, including passwords and URL fetching.

License

MIT (See included MIT-LICENSE)