No commit activity in last 3 years
No release in over 3 years
Pass in a URL or URLs and mechanize-content will select the best block of text, image and title by analysing the page content.
 Dependencies

Runtime

>= 0.1.1
>= 1.0.0
 Project Readme

mechanize-content

Returns the most important pieces of content on a web page. Finds the best block of text, image and title by analysing the page content.

Usage

Pass in a URL on initialisation and then call the helpers to pull the best content out.

mc = MechanizeContent::Parser.new("http://www.joystiq.com/2010/03/19/mag-gets-free-trooper-gear-pack-dlc-next-week/")

mc.best_title

"MAG gets free 'Trooper Gear Pack' DLC next week -- Joystiq"

mc.best_text

"Ten-hut, soldiers! HQ has just sent word that some new gear will be shipping to the front lines of MAG next week, free of charge: the Trooper Gear Pack. In this parcel, we'll finally get access to the Flashbang grenade..."

mc.best_image

"http://www.blogcdn.com/www.joystiq.com/media/2010/03/580mage302.jpg"

The gem also supports multiple URLs and will find the best content between them. The order in which they are inserted determines priority.
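The ordering rule can be illustrated with a minimal sketch. This is not the gem's implementation: `PageContent` and `best_across` are hypothetical names, and the rule shown (take the first non-empty value in insertion order) is only an assumption about what "priority" means here. The gem may weigh candidates differently.

```ruby
# Hypothetical stand-in for the content extracted from one page.
# Any field may be nil when nothing suitable was found on that page.
PageContent = Struct.new(:url, :title, :text, :image)

# Assumed priority rule: scan pages in the order they were supplied and
# return the first non-empty value for the requested field.
def best_across(pages, field)
  pages.each do |page|
    value = page[field]
    return value unless value.nil? || value.empty?
  end
  nil
end

# Illustrative data only; example.com URLs are placeholders.
pages = [
  PageContent.new("http://example.com/a", "First title", nil, nil),
  PageContent.new("http://example.com/b", "Second title",
                  "Body text from b", "http://example.com/b.jpg")
]

best_title = best_across(pages, :title)
best_text  = best_across(pages, :text)
best_image = best_across(pages, :image)
```

In this sketch the title comes from the first page because it was inserted first, while the text and image fall through to the second page because the first page has none.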

Dependencies

  • Mechanize

  • imagesize

Note on Patches/Pull Requests

  • Fork the project.

  • Make your feature addition or bug fix.

  • Add tests for it. This is important so I don’t break it in a future version unintentionally.

  • Commit, but do not touch the Rakefile, version, or history. (If you want to have your own version, that is fine, but bump the version in a separate commit that I can ignore when I pull.)

  • Send me a pull request. Bonus points for topic branches.

Copyright © 2010 John Griffin. See LICENSE for details.