Project

caule

0.0
No commit activity in last 3 years
No release in over 3 years
DSL to build crawlers easily
 Dependencies

Development

~> 1.0.0
~> 2.9.0
~> 1.8.6

Runtime

~> 2.3
 Project Readme

Caule

A simple DSL, based on Anemone, to build web crawlers easily.

Installation

Add this line to your application's Gemfile:

gem 'caule'

And then execute:

$ bundle

Or install it yourself as:

$ gem install caule

Usage

Caule.start("http://rubygems.org/") do |crawler|
  crawler.on_every_page do |page| # page is a Mechanize::Page instance
    puts page.uri
  end

  crawler.on_pages_like(/\//) do |page| # runs only on the home page
    puts page.uri
  end

  crawler.focus_crawl do |page| # filter the links that the bot must crawl
    page.links_with(:href => /\/gems/)
  end
end
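
The README does not say whether the pattern given to on_pages_like is tested against the full URL or only its path. If it is tested against the full URL (an assumption), an unanchored /\// would match every page, because every absolute URL contains a slash. The plain-Ruby sketch below, which does not use caule at all, shows the difference between that loose pattern and a check on the path alone, and applies the same idea to the /gems filter used in focus_crawl. The helpers path_of and home_page? are hypothetical names introduced only for this illustration.

```ruby
require "uri"

# Hypothetical helper (not part of caule): the path component of a URL,
# treating an empty path as the root "/".
def path_of(url)
  path = URI.parse(url).path
  path.empty? ? "/" : path
end

# Hypothetical helper: true only for the site's root page.
def home_page?(url)
  path_of(url) == "/"
end

urls = [
  "http://rubygems.org/",
  "http://rubygems.org/gems",
  "http://rubygems.org/gems/caule"
]

# An unanchored /\// matches every absolute URL, since each contains a slash.
loose = urls.select { |u| u =~ /\// }

# Comparing the path to "/" selects only the home page.
home = urls.select { |u| home_page?(u) }

# Same idea as the focus_crawl filter: keep URLs whose path starts with /gems.
gems = urls.select { |u| path_of(u) =~ %r{\A/gems} }

puts loose.length
puts home.inspect
puts gems.inspect
```

If the home-page block fires on more pages than expected, an anchored pattern or an explicit path check along these lines may be worth trying.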

Contributing

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Added some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request

License

See the LICENSE file.