Project

makuri

0.0
The project is in a healthy, maintained state
Web-crawling framework for Ruby
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
 Project Readme

Makuri

Makuri is a Web-crawling framework for Ruby.

Usage

In this example, we are going to crawl the quotes website and scrape data as:

# quotes_spider.rb
require 'makuri'

class QuotesSpider
  include Makuri::Spider
  start_urls ['https://quotes.toscrape.com/tag/humor/']

  def parse
    response.css('div.quote').each { |quote| extract(quote) }

    next_page = response.at_css('li.next>a')
    request_to :parse, url: next_page[:href] unless next_page.nil?
  end

  def extract(quote)
    item = {
      author: quote.at_css('span>small').text,
      text: quote.at_css('span.text').text
    }

    puts item.to_json
  end
end

QuotesSpider.run

Now save the file to quotes_spider.rb file and run it as:

$ ruby quotes_spider.rb > quotes.json

When it's done, you will find all the quotes saved to quotes.json file. It's that easy.

History

View the changelog

Contributing

Everyone is encouraged to help improve this project. Here are a few ways you can help:

To get started with development:

$ git clone https://github.com/lalusaud/makuri.git
$ cd makuri
$ bundle install
$ bundle exec rake test