Project

yasf

No commit activity in last 3 years
No release in over 3 years
An HTML scraping library for writing maintainable rules that extract data from HTML content.
 Dependencies

Development

>= 0
>= 0

Runtime

= 0.10.0
= 1.5.5
 Project Readme

Yasf


Web scraper

Installation:

gem install yasf

Scraping a page:

The simplest way to use yasf is by calling Yasf.crawl and passing it a block:

  require 'yasf'

  result = Yasf.crawl do
    base_url "http://www.wowebook.com"

    property :page_title, xpath: '/html/head/title'

    collection :books, xpath: '//*[@id="content"]/div/article' do

      property :title, xpath: 'header/h2/a/@title' do |data|
        data.to_s.upcase
      end

      property :description, xpath: 'div/p'

      property :download, xpath: 'div/p/a' do
        field :href
        field :title
      end
    end
  end

  puts result.page_title

  result.books.each do |book|
    puts "Book: #{book.title} -> #{book.description}"
  end
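
To make the XPath expressions in the example above easier to follow, here is a minimal sketch of what they select, written with Ruby's bundled rexml library instead of yasf. It is not yasf's implementation. The HTML fragment is hypothetical and only shaped to match the expressions, and reading `field :href` and `field :title` as attribute lookups on the matched element is an assumption. The point it illustrates is that the paths inside `collection` are relative to each matched article, while `page_title` uses an absolute path.

```ruby
require 'rexml/document'

# Hypothetical, well-formed page fragment (not the real site markup).
html = <<~PAGE
  <html>
    <head><title>Free Books</title></head>
    <body>
      <div id="content">
        <div>
          <article>
            <header><h2><a title="Learning Ruby" href="/learning-ruby">Learning Ruby</a></h2></header>
            <div><p>An introduction. <a href="/dl/ruby.pdf" title="PDF">Download</a></p></div>
          </article>
          <article>
            <header><h2><a title="Scraping 101" href="/scraping-101">Scraping 101</a></h2></header>
            <div><p>Rules for HTML. <a href="/dl/scraping.pdf" title="PDF">Download</a></p></div>
          </article>
        </div>
      </div>
    </body>
  </html>
PAGE

doc = REXML::Document.new(html)

# Absolute path, evaluated against the whole document.
page_title = REXML::XPath.first(doc, '/html/head/title').text

# One hash per matched article; the inner paths are relative to that article.
books = REXML::XPath.match(doc, '//*[@id="content"]/div/article').map do |article|
  title_attr = REXML::XPath.first(article, 'header/h2/a/@title')
  link       = REXML::XPath.first(article, 'div/p/a')
  {
    # Mirrors the block in the example: the matched value is upcased.
    title: title_attr.value.upcase,
    # Element#text returns the first text node of the paragraph.
    description: REXML::XPath.first(article, 'div/p').text.strip,
    # Assumption: a "field" corresponds to an attribute of the matched element.
    download: { href: link.attributes['href'], title: link.attributes['title'] }
  }
end

puts page_title
books.each do |book|
  puts "Book: #{book[:title]} -> #{book[:description]}"
end
```

If rexml is not available in your Ruby installation, it can be added with gem install rexml.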

Copyright

Copyright (c) 2014 Algonauti