Project

makuri

0.0
No release in over 3 years
Low commit activity in last 3 years
Web-crawling framework for Ruby
 Dependencies

Development

~> 2.0
~> 3.11

Runtime

 Project Readme

Makuri

Makuri is a Web-crawling framework for Ruby.

Install

Add this line to your application's Gemfile:

gem 'makuri'

Then run:

$ bundle

Or install it yourself with:

$ gem install makuri

Usage

In this example, we crawl the quotes website (quotes.toscrape.com), follow its pagination links, and scrape the author and text of each quote:

# quotes_spider.rb
require 'json' # provides Hash#to_json, used in extract below
require 'makuri'

class QuotesSpider
  include Makuri::Spider
  start_urls ['https://quotes.toscrape.com/tag/humor/']

  def parse
    response.css('div.quote').each { |quote| extract(quote) }

    next_page = response.at_css('li.next>a')
    request_to :parse, url: next_page[:href] unless next_page.nil?
  end

  def extract(quote)
    item = {
      author: quote.at_css('span>small').text,
      text: quote.at_css('span.text').text
    }

    puts item.to_json
  end
end

QuotesSpider.run

Save the code as quotes_spider.rb and run it with:

$ ruby quotes_spider.rb > quotes.json

When the run finishes, quotes.json contains every scraped quote, one JSON object per line. It's that easy.
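
Because the spider prints each item with puts item.to_json, the output file holds one JSON object per line rather than a single JSON array. The following is a minimal sketch of reading that file back using only Ruby's standard library. The helper names load_quotes and quotes_by_author are hypothetical and are not part of Makuri, and the sketch assumes nothing else was written to standard output during the run.

```ruby
require 'json'

# Hypothetical helper (not part of Makuri): parse line-delimited JSON
# from any IO-like object into an array of hashes with symbol keys.
# Blank lines are skipped.
def load_quotes(io)
  io.each_line.map(&:strip).reject(&:empty?).map do |line|
    JSON.parse(line, symbolize_names: true)
  end
end

# Hypothetical helper: map each author to the list of their quote texts.
def quotes_by_author(quotes)
  quotes
    .group_by { |quote| quote[:author] }
    .transform_values { |items| items.map { |quote| quote[:text] } }
end

# Print a per-author summary when run directly next to a quotes.json file.
if __FILE__ == $PROGRAM_NAME && File.exist?('quotes.json')
  File.open('quotes.json') do |file|
    quotes_by_author(load_quotes(file)).each do |author, texts|
      puts "#{author}: #{texts.size} quote(s)"
    end
  end
end
```

Reading line by line keeps each record independent, so a partially written file from an interrupted crawl can still be parsed up to its last complete line.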

History

View the changelog

Contributing

Everyone is encouraged to help improve this project. Here are a few ways you can help:

To get started with development:

$ git clone https://github.com/lalusaud/makuri.git
$ cd makuri
$ bundle install
$ bundle exec rake test