Project

html2json

0.0
No commit activity in last 3 years
No release in over 3 years
Gem that scrapes a webpage and renders it as JSON.

 Project Readme

html2json

A Ruby gem that scrapes a webpage and renders the selected DOM elements as JSON. Intended for use with Rails.

Usage:

Simple

In a controller:

news = Html2Json::Web.new("http://www.theguardian.com/uk")
render :json => news.pick("//div[@id='global-nav']//a")

output

{
    "response": [
        "Subway to create 13,000 jobs as it doubles outlets in UK and Ireland",
        "South Sudan failed by misjudgment of international community, says UN chief",
        "Pensions opt-out should be scrapped, says thinktank"
    ]
}

You can use an arbitrary key instead of the default "response" key by passing it as the second argument.

news.pick("//ul[@id='ticker']//li//a", 'breaking_news')

output

{
    "breaking_news": [
        "Subway to create 13,000 jobs as it doubles outlets in UK and Ireland",
        "South Sudan failed by misjudgment of international community, says UN chief",
        "Pensions opt-out should be scrapped, says thinktank"
    ]
}
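
The README does not show how pick works internally, so the following is only a rough sketch of the idea: evaluate an XPath expression, collect the text of each match, and wrap the list under a key that defaults to "response". The helper name pick_texts, the inline sample markup, and the use of REXML as the parser are all assumptions made for illustration; they are not taken from html2json.

```ruby
require "json"
require "rexml/document"

# Hypothetical helper, not part of html2json: evaluates an XPath expression
# against a well-formed markup string and wraps the matched text under a key.
def pick_texts(markup, xpath, key = "response")
  doc = REXML::Document.new(markup)
  texts = REXML::XPath.match(doc, xpath).map { |node| node.text.to_s.strip }
  { key => texts }
end

# Made-up sample markup standing in for a fetched page.
sample = <<~HTML
  <html>
    <body>
      <ul id="ticker">
        <li><a href="/a">First headline</a></li>
        <li><a href="/b">Second headline</a></li>
      </ul>
    </body>
  </html>
HTML

# Default key, then a custom key passed as the last argument.
puts JSON.generate(pick_texts(sample, "//ul[@id='ticker']//li//a"))
puts JSON.generate(pick_texts(sample, "//ul[@id='ticker']//li//a", "breaking_news"))
```

REXML only accepts well-formed markup, so a real page would likely need a more forgiving HTML parser; the sketch is meant to show the key handling, not the fetching or parsing.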

To pick multiple parts of a webpage, call pick once per part and then call render:

news = Html2Json::Web.new("http://www.theguardian.com/uk")
news.pick("//ul[@id='ticker']//li//a", 'breaking_news')
news.pick("//div[@id='global-nav']//a", 'news_categories')
render :json => news.render

output

{
    "breaking_news": [
        "Subway to create 13,000 jobs as it doubles outlets in UK and Ireland",
        "South Sudan failed by misjudgment of international community, says UN chief",
        "Pensions opt-out should be scrapped, says thinktank"
    ],
    "news_categories": [
        "News",
        "Sport",
        "Comment",
        "Culture"
    ]
}
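
One way to picture the multi-pick behaviour above is an object that stores each pick under its key and hands back the combined hash on render. The class below is a hypothetical stand-in written for this illustration, not the gem's Html2Json::Web class; its name, the inline markup, the REXML parser, and the choice to have pick return the accumulated hash are all assumptions.

```ruby
require "json"
require "rexml/document"

# Hypothetical stand-in for illustration; not the gem's Html2Json::Web class.
class PageSketch
  def initialize(markup)
    @doc = REXML::Document.new(markup)
    @picked = {}
  end

  # Stores the matched text under `key` and returns everything picked so far
  # (assumed behaviour, chosen so a single pick can be rendered directly).
  def pick(xpath, key = "response")
    @picked[key] = REXML::XPath.match(@doc, xpath).map { |node| node.text.to_s.strip }
    @picked
  end

  # Returns every pick made so far as a single hash.
  def render
    @picked
  end
end

# Made-up sample markup standing in for a fetched page.
page_markup = <<~HTML
  <html>
    <body>
      <div id="global-nav"><a href="/news">News</a><a href="/sport">Sport</a></div>
      <ul id="ticker">
        <li><a href="/a">First headline</a></li>
      </ul>
    </body>
  </html>
HTML

page = PageSketch.new(page_markup)
page.pick("//ul[@id='ticker']//li//a", "breaking_news")
page.pick("//div[@id='global-nav']//a", "news_categories")
puts JSON.pretty_generate(page.render)
```

Keying the stored results by name means that picking twice with the same key overwrites the earlier result in this sketch; whether the gem behaves the same way is not stated in the README.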

XPath expressions can be tested at http://www.freeformatter.com/xpath-tester.html.

This is my first gem. Feedback is appreciated.