Category

Web Content Scrapers

anemone

0.93
Anemone web-spider framework
 Popularity
Downloads
563,254
Stars
1,612
Forks
312
Watchers
66
 Releases
Current version
0.7.2
Total releases
23
First release
Latest release
 Activity
Pull Request Acceptance Rate
4%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
29

metainspector

0.45
MetaInspector lets you scrape a web page and get its links, images, texts, meta tags...
 Popularity
Downloads
358,650
Stars
800
Forks
132
Watchers
22
 Releases
Current version
5.5.0
Total releases
101
First release
Latest release
 Activity
Issue Closure Rate
86%
Pull Request Acceptance Rate
71%
Average date of last 50 commits
within last 2 years
Reverse Dependencies
12

pismo

0.33
Pismo extracts and retrieves content-related metadata from HTML pages - you can use the resulting data in an organized way, such as a summary/first paragraph, body text, keywords, RSS feed URL, favicon, etc.
 Popularity
Downloads
105,008
Stars
734
Forks
91
Watchers
28
 Releases
Current version
0.7.4
Total releases
13
First release
Latest release
 Activity
Issue Closure Rate
33%
Pull Request Acceptance Rate
66%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
3

link_thumbnailer

0.25
Ruby gem generating thumbnail images from a given URL.
 Popularity
Downloads
133,493
Stars
411
Forks
94
Watchers
13
 Releases
Current version
3.3.2
Total releases
49
First release
Latest release
 Activity
Issue Closure Rate
83%
Pull Request Acceptance Rate
80%
Average date of last 50 commits
within last 2 years
Reverse Dependencies
0

data_miner

0.16
Download, pull out of a ZIP/TAR/GZ/BZ2 archive, parse, correct, and import XLS, ODS, XML, CSV, HTML, etc. into your ActiveRecord models. Uses Upsert internally for speed.
 Popularity
Downloads
294,613
Stars
289
Forks
20
Watchers
13
 Releases
Current version
3.0.0
Total releases
115
First release
Latest release
 Activity
Issue Closure Rate
63%
Pull Request Acceptance Rate
100%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
2

cobweb

0.15
Cobweb is a web crawler that can use Resque to cluster crawls, letting it crawl extremely large sites quickly and with much better performance than multi-threaded crawlers. It can also run as a standalone crawler, with a sophisticated statistics interface for monitoring crawl progress.
 Popularity
Downloads
181,817
Stars
216
Forks
49
Watchers
7
 Releases
Current version
1.1.0
Total releases
92
First release
Latest release
 Activity
Issue Closure Rate
52%
Pull Request Acceptance Rate
83%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
4

sinew

0.07
Crawl web sites easily using Ruby recipes, with caching and Nokogiri.
 Popularity
Downloads
23,330
Stars
182
Forks
13
Watchers
4
 Releases
Current version
2.0.4
Total releases
10
First release
Latest release
 Activity
Issue Closure Rate
100%
Pull Request Acceptance Rate
66%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
0

fletcher

0.04
Easily fetch product information from third party websites such as Amazon, Steam, eBay, etc.
 Popularity
Downloads
45,758
Stars
54
Forks
16
Watchers
5
 Releases
Current version
0.6.9
Total releases
18
First release
Latest release
 Activity
Issue Closure Rate
75%
Pull Request Acceptance Rate
71%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
0

docparser

0.01
DocParser is a Ruby gem for web scraping.
 Popularity
Downloads
15,442
Stars
33
Forks
1
Watchers
2
 Releases
Current version
0.2.3
Total releases
10
First release
Latest release
 Activity
Issue Closure Rate
100%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
1

horsefield

0.01
It's a scraper
 Popularity
Downloads
42,077
Stars
2
Forks
0
Watchers
1
 Releases
Current version
0.5.68
Total releases
34
First release
Latest release
 Activity
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
0

url_scraper

0.01
A simple plugin for extracting information from a URL entered by a user (similar to what Facebook does). This gem is built on top of the opengraph gem created by Michael Bleigh.
 Popularity
Downloads
4,696
Stars
9
Forks
2
Watchers
4
 Releases
Current version
0.0.5
Total releases
3
First release
Latest release
 Activity
Issue Closure Rate
0%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
0

wiki-api

0.0
MediaWiki API and Page content parser for Headlines (nested), TextBlocks, ListItems, and Links.
 Popularity
Downloads
4,690
Stars
7
Forks
0
Watchers
1
 Releases
Current version
0.1.0
Total releases
3
First release
Latest release
 Activity
Issue Closure Rate
100%
Pull Request Acceptance Rate
100%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
0