No commit activity in last 3 years
No release in over 3 years
Parse EPUB 3 book loosely

EPUB Parser

INSTALLATION

gem install epub-parser  

USAGE

As a library

require 'epub/parser'

book = EPUB::Parser.parse('book.epub')
book.metadata.titles # => Array of EPUB::Publication::Package::Metadata::Title. Main title, subtitle, etc...
book.metadata.title # => Title string including all titles
book.metadata.creators # => Creators(authors)
book.each_page_on_spine do |page|
  page.media_type # => "application/xhtml+xml"
  page.entry_name # => "OPS/nav.xhtml" entry name in EPUB package(zip archive)
  page.read # => raw content document
  page.content_document.nokogiri # => Nokogiri::XML::Document. Same as Nokogiri.XML(page.read)
  # do something more
  #    :
end

See {file:docs/Home.markdown} or the API documentation for more info.
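
As a small illustration of the iteration shown above, the following sketch groups spine entry names by media type. It relies only on each_page_on_spine, media_type and entry_name from the usage example. The helper name spine_entries_by_media_type and the stub structs are hypothetical, introduced here so the sketch runs without the gem or a real EPUB file.

```ruby
# Sketch: group spine entry names by media type.
# `book` is anything that responds to #each_page_on_spine, for example the
# object returned by EPUB::Parser.parse in the usage above.
def spine_entries_by_media_type(book)
  entries = Hash.new { |hash, key| hash[key] = [] }
  book.each_page_on_spine do |page|
    entries[page.media_type] << page.entry_name
  end
  entries
end

# Hypothetical stand-ins (not part of the gem) so the sketch is runnable.
stub_page = Struct.new(:media_type, :entry_name)
stub_book = Struct.new(:pages) do
  def each_page_on_spine(&block)
    pages.each(&block)
  end
end

book = stub_book.new([
  stub_page.new("application/xhtml+xml", "OPS/nav.xhtml"),
  stub_page.new("application/xhtml+xml", "OPS/chapter1.xhtml"),
])

summary = spine_entries_by_media_type(book)
summary.each { |type, names| puts "#{type}: #{names.join(', ')}" }
```

With the real library, passing the result of EPUB::Parser.parse('book.epub') in place of the stub should work the same way, assuming the page objects behave as in the usage example.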

epubinfo command-line tool

The epubinfo tool extracts and displays the metadata of the specified EPUB book.

$ epubinfo ~/Documents/Books/build_awesome_command_line_applications_in_ruby.epub
Title:              Build Awesome Command-Line Applications in Ruby (for KITAITI MAKOTO)
Identifiers:        978-1-934356-91-3
Titles:             Build Awesome Command-Line Applications in Ruby (for KITAITI MAKOTO)
Languages:          en
Contributors:       
Coverages:          
Creators:           David Bryant Copeland
Dates:              
Descriptions:       
Formats:            
Publishers:         The Pragmatic Bookshelf, LLC (338304)
Relations:          
Rights:             Copyright © 2012 Pragmatic Programmers, LLC
Sources:            
Subjects:           Pragmatic Bookshelf
Types:              
Unique identifier:  978-1-934356-91-3
Epub version:       2.0

See {file:docs/Epubinfo} for more info.
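
If a script needs these fields, one option is to split each line of the output at its first colon. The sketch below assumes the "Label: value" layout shown above. The helper name parse_epubinfo_output is hypothetical, and how epubinfo prints fields with several values is not covered here.

```ruby
# Sketch: turn "Label:   value" lines (the layout shown above) into a Hash.
# parse_epubinfo_output is a hypothetical helper, not part of the gem.
def parse_epubinfo_output(text)
  text.each_line.with_object({}) do |line, fields|
    label, value = line.chomp.split(":", 2)
    next if value.nil? # skip lines without a colon
    fields[label.strip] = value.strip
  end
end

sample = <<~OUTPUT
  Title:              Build Awesome Command-Line Applications in Ruby (for KITAITI MAKOTO)
  Languages:          en
  Contributors:
  Epub version:       2.0
OUTPUT

fields = parse_epubinfo_output(sample)
puts fields["Languages"]
```

For anything beyond quick scripting, reading book.metadata through the library is likely the more robust route.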

epub-open command-line tool

The epub-open tool provides an interactive shell (IRB) that helps you explore an EPUB book.

epub-open path/to/book.epub

IRB starts. self becomes the EPUB book, so you can call its methods directly.

title
=> "Title of the book"
metadata.creators
=> [Author 1, Author2, ...]
resources.first.properties
=> #<Set: {"nav"}> # This tells you that the first resource of this book is the nav document
nav = resources.first
=> ...
nav.href
=> #<Addressable::URI:0x15ce350 URI:nav.xhtml>
nav.media_type
=> "application/xhtml+xml"
puts nav.read
<?xml version="1.0"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
    :
    :
    :
</html>
=> nil
exit # Enter "exit" to end the session

See {file:docs/EpubOpen} for more info.
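
The session above finds the nav document by inspecting the first resource. When the nav document is not first, a search over resources by property is one possibility. The sketch below assumes each resource exposes properties as a Set, as in the session output. The helper name find_nav_resource and the stub struct are hypothetical.

```ruby
require "set"

# Sketch: pick the resource whose properties include "nav".
# find_nav_resource is a hypothetical helper, not part of the gem.
def find_nav_resource(resources)
  resources.find { |resource| resource.properties.include?("nav") }
end

# Hypothetical stand-ins for the book's resources so the sketch is runnable.
stub_resource = Struct.new(:href, :properties)
resources = [
  stub_resource.new("cover.xhtml", Set.new),
  stub_resource.new("nav.xhtml", Set["nav"]),
]

nav = find_nav_resource(resources)
puts nav.href
```

Inside an epub-open session, the same block could presumably be run against the session's own resources.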

REQUIREMENTS

  • Ruby 1.9.3 or later
  • A C compiler, to build Zip/Ruby and Nokogiri

Related Gems

  • gepub - a generic EPUB library for Ruby
  • epubinfo - Extracts metadata information from EPUB files. Supports EPUB2 and EPUB3 formats.
  • ReVIEW - ReVIEW is an easy-to-use digital publishing system for books and ebooks.
  • epzip - epzip is EPUB packing tool. It's just only doing 'zip.' :)
  • eeepub - EeePub is a Ruby ePub generator
  • epub-maker - This library supports making and editing EPUB books based on this EPUB Parser library

If you know of other gems, please tell me or send a pull request.

RECENT CHANGES

0.1.6

  • Remove EPUB.parse method
  • Remove EPUB::Publication::Package::Metadata#to_hash
  • Add EPUB::Publication::Package::Metadata::Identifier
  • Remove MethodDecorators::Deprecated
  • Make EPUB::Parser::OCF::CONTAINER_FILE and other constants deprecated
  • Make EPUB::Publication::Package::Metadata::Link#rel a Set
  • Add exception class EPUB::Constants::MediaType::UnsupportedMediaType
  • Make EPUB::Constants::MediaType::UnsupportedError deprecated
  • Add EPUB::Publication::Package::Item#find_item_by_relative_iri
  • Add EPUB::Publication::Package::Item#cover_image?
  • Add EPUB::Book::Features module and move methods of EPUB module to it. (Thanks, takahashim!)
  • Make including EPUB deprecated
  • Parse hidden attribute of nav elements
  • [Experimental] Add EPUB::ContentDocument::Navigation::Item#traverse

0.1.5

  • Add ContentDocument::XHTML#title
  • Add Manifest::Item#xhtml?
  • Add --words and --char options to epubinfo command
  • API change: OCF::Container::Rootfile#full_path became Addressable::URI object rather than String
  • Add ContentDocument::XHTML#rexml and #nokogiri
  • Inspect more readably

0.1.4

  • Fixed-Layout Documents support
  • Define ContentDocument::XHTML#top_level?
  • Define Spine::Itemref#page_spread and #page_spread=
  • Define some utility methods around Manifest::Item and Spine::Itemref

See {file:CHANGELOG.markdown} for older changelogs and details.

TODOS

  • EPUB 3.0.1
  • Multiple rootfiles
  • Help features for epub-open tool
  • Vocabulary Association Mechanisms
  • Implementing navigation document and so on
  • Media Overlays
  • Content Document
  • Digital Signature
  • Using SAX on parsing
  • Extracting and organizing common behavior from some classes to modules
  • Abstraction of XML parser (making it possible to use REXML, Ruby's standard bundled XML library)
  • Handle encodings other than UTF-8

DONE

  • Simple inspect for epub-open tool
  • Using a zip library instead of the unzip command, which has security issues
  • Modify methods around fallback to see bindings element in the package
  • Content Document (only for Navigation Documents)
  • Fixed Layout
  • Vocabulary Association Mechanisms (only for itemref)

LICENSE

This library is distributed under the terms of the MIT License. See the MIT-LICENSE file for more info.