Mercury Web Parser

A simple Ruby wrapper for the Mercury Web Parser API

Installation

Add this line to your application's Gemfile:

gem 'mercury_web_parser'

And then execute:

$ bundle

Or install it yourself as:

$ gem install mercury_web_parser

Configuration

You must first obtain an API token from the fine folks at Mercury in order to make requests to their Web Parser API.

Single token usage:

MercuryWebParser.api_token = API_TOKEN

or set multiple options with a block:

MercuryWebParser.configure do |parser|
  parser.api_token = API_TOKEN
end
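
Hard-coding the token is usually best avoided. As a minimal sketch, assuming the token is stored in an environment variable named `MERCURY_API_TOKEN` (that name and the `mercury_token` helper are illustrative and not part of the gem), configuration could look like this:

```ruby
# Hypothetical helper, not part of the gem: fetch the API token from the
# environment and fail early if it is missing or blank.
def mercury_token(env = ENV)
  token = env.fetch("MERCURY_API_TOKEN", "").strip
  raise ArgumentError, "MERCURY_API_TOKEN is not set" if token.empty?
  token
end

# Apply the token only when the gem is loaded and the variable is present.
if defined?(MercuryWebParser) && ENV.key?("MERCURY_API_TOKEN")
  MercuryWebParser.configure do |parser|
    parser.api_token = mercury_token
  end
end
```

Failing early with an explicit error tends to be easier to debug than an authentication failure on the first request.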

Multiple tokens or multithreaded usage:

client = MercuryWebParser::Client.new(api_token: API_TOKEN)
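
For threaded use, one option is to give each thread its own client instead of sharing the global configuration. The sketch below assumes that `MercuryWebParser::Client` responds to `parse(url)` in the same way as the module-level `MercuryWebParser.parse`; this README does not confirm that signature. `parse_concurrently` is a hypothetical helper, not part of the gem.

```ruby
# Hypothetical helper: parse several URLs in parallel, building a separate
# client inside each thread via the supplied factory block.
# ASSUMPTION: the client object responds to #parse(url).
def parse_concurrently(urls, &client_factory)
  threads = urls.map do |url|
    Thread.new(url) { |u| client_factory.call.parse(u) }
  end
  # Thread#value joins the thread and returns the block's result,
  # re-raising any exception that was raised inside it.
  threads.map(&:value)
end

# With the gem loaded, usage might look like:
#   articles = parse_concurrently(urls) do
#     MercuryWebParser::Client.new(api_token: API_TOKEN)
#   end
```

Results come back in the same order as the input URLs, because the threads are collected in that order.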

Usage

Parse

Parse a webpage and return its main content:

article = MercuryWebParser.parse("http://sethgodin.typepad.com/seths_blog/2016/11/all-we-have-is-each-other.html")
=> #<MercuryWebParser::Article title="Seth's Blog", author=nil, date_published=nil, dek=nil, lead_image_url="http://www.sethgodin.com/sg/images/og.jpg", content="<div id=\"alpha-inner\" class=\"pkg\"> <div class=\"module-typelist module\">...", next_page_url="http://sethgodin.typepad.com/seths_blog/2016/11/choose-better.html", url="http://sethgodin.typepad.com/seths_blog/2016/11/all-we-have-is-each-other.html", domain="sethgodin.typepad.com", excerpt="", word_count=462, direction="ltr", total_pages=4, rendered_pages=4>

article.title
article.content
article.author
article.date_published
article.lead_image_url
article.dek
article.next_page_url
article.url
article.domain
article.excerpt
article.word_count
article.direction
article.total_pages
article.rendered_pages
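
As an illustration of how these accessors might be combined, the sketch below builds a one-line summary from an article. To stay runnable without network access or the gem, it uses a `Struct` named `StubArticle` as a stand-in for `MercuryWebParser::Article`; the stub and the `summarize` helper are hypothetical, and the sample values are copied from the response shown above.

```ruby
# Stand-in for MercuryWebParser::Article so the example runs offline.
# Only a few of the accessors listed above are modelled.
StubArticle = Struct.new(:title, :author, :domain, :word_count,
                         :next_page_url, keyword_init: true)

# Hypothetical helper: build a short summary line, tolerating nil fields
# (author was nil in the sample response above).
def summarize(article)
  byline = article.author || "unknown author"
  more   = article.next_page_url ? " (continues)" : ""
  "#{article.title} by #{byline}, #{article.word_count} words, #{article.domain}#{more}"
end

article = StubArticle.new(
  title: "Seth's Blog",
  author: nil,
  domain: "sethgodin.typepad.com",
  word_count: 462,
  next_page_url: "http://sethgodin.typepad.com/seths_blog/2016/11/choose-better.html"
)
puts summarize(article)
```

Because fields such as `author` can be `nil`, callers will probably want a fallback like the one used here and should not assume every accessor returns a value.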

Inspiration

A clone of the readability_parser gem.