Project

ferrexi

0.0
Low commit activity in last 3 years
No release in over a year
A wrapper over the Ferrum gem to return a Rexle document.
 Dependencies

Runtime

~> 0.7, >= 0.7
~> 0.7, >= 0.7.0
 Project Readme

Introducing the Ferrexi gem

The Ferrexi gem is intended for scraping websites that rely on JavaScript to render their content.

require 'ferrexi'

url = 'https://www.arcgis.com/apps/opsdashboard/index.html#/ae5dda8f86814ae99dde905d2a9070ae'

# Fetch the rendered page and convert it to a Rexle document
doc = Ferrexi.new(url).to_doc
puts doc.xml pretty: true

# Save a copy of the rendered markup for inspection
File.write '/tmp/ferrum2.html', doc.xml(pretty: true)

# Collect every svg element; the figures are read by position
svg  = doc.root.xpath('//svg')

total_cases = svg[1].text('text')
#=> 1.391

new_cases = svg[2].text('g[2]/svg/text')
#=> 330

total_deaths = svg[4].text('g[2]/svg/text')
#=> 35

In the above example, the COVID-19 Mobile Dashboard website is scraped for the UK figures: total cases, new cases, and total deaths.
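
The positional lookups above depend on the order of the svg elements in the rendered page, so it can help to try the XPath expressions offline first. The following is a minimal sketch, not part of Ferrexi: it uses Ruby's REXML library instead of Rexle, the markup fragment is invented for illustration (the real dashboard's structure will differ), and text_at is a hypothetical helper that returns nil rather than raising when a path does not match.

```ruby
require 'rexml/document'

# Hypothetical fragment standing in for the rendered dashboard markup.
# The nesting is invented for illustration only.
SAMPLE = <<~XML
  <body>
    <svg><text>Total cases</text></svg>
    <svg><text>1391</text></svg>
    <svg><g/><g><svg><text>330</text></svg></g></svg>
  </body>
XML

# Hypothetical helper: text of the first element matching a relative
# XPath, or nil when the element or the path is missing.
def text_at(element, path)
  return nil if element.nil?
  node = REXML::XPath.first(element, path)
  node&.text
end

doc  = REXML::Document.new(SAMPLE)

# All svg elements in document order, nested ones included
svgs = REXML::XPath.match(doc, '//svg')

total_cases = text_at(svgs[1], 'text')
new_cases   = text_at(svgs[2], 'g[2]/svg/text')
missing     = text_at(svgs[9], 'text') # index out of range gives nil

puts total_cases
puts new_cases
puts missing.inspect
```

Returning nil for a missing element makes it easier to notice when the page layout has changed and an index no longer points at the expected figure.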

Resources

ferrexi webscraper scraper