Category

Web Content Scrapers

anemone

0.84
Anemone web-spider framework
 Popularity
Downloads
588,617
Stars
1,612
Forks
311
Watchers
66
 Releases
Current version
0.7.2
Total releases
23
First release
Latest release
 Activity
Pull Request Acceptance Rate
4%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
29

metainspector

0.42
MetaInspector lets you scrape a web page and get its links, images, texts, meta tags...
 Popularity
Downloads
419,892
Stars
818
Forks
134
Watchers
21
 Releases
Current version
5.6.0
Total releases
102
First release
Latest release
 Activity
Issue Closure Rate
94%
Pull Request Acceptance Rate
71%
Average date of last 50 commits
within last 2 years
Reverse Dependencies
12

wombat

0.41
Generic Web crawler with a DSL that parses structured data from web pages
 Popularity
Downloads
97,869
Stars
1,111
Forks
110
Watchers
52
 Releases
Current version
2.7.0
Total releases
30
First release
Latest release
 Activity
Issue Closure Rate
59%
Pull Request Acceptance Rate
79%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
3

pismo

0.3
Pismo extracts content-related metadata from HTML pages and returns it in an organized form: a summary or first paragraph, body text, keywords, RSS feed URL, favicon, etc.
 Popularity
Downloads
108,601
Stars
736
Forks
90
Watchers
28
 Releases
Current version
0.7.4
Total releases
13
First release
Latest release
 Activity
Issue Closure Rate
33%
Pull Request Acceptance Rate
66%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
3

link_thumbnailer

0.24
Ruby gem generating thumbnail images from a given URL.
 Popularity
Downloads
146,672
Stars
419
Forks
96
Watchers
13
 Releases
Current version
3.3.2
Total releases
49
First release
Latest release
 Activity
Issue Closure Rate
83%
Pull Request Acceptance Rate
78%
Average date of last 50 commits
within last 2 years
Reverse Dependencies
0

data_miner

0.14
Download, unpack from a ZIP/TAR/GZ/BZ2 archive, parse, correct, and import XLS, ODS, XML, CSV, HTML, etc. into your ActiveRecord models. Uses Upsert internally for speed.
 Popularity
Downloads
298,447
Stars
288
Forks
20
Watchers
13
 Releases
Current version
3.0.0
Total releases
115
First release
Latest release
 Activity
Issue Closure Rate
63%
Pull Request Acceptance Rate
100%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
2

cobweb

0.13
Cobweb is a web crawler that can use Resque to cluster crawls, which lets it crawl extremely large sites quickly and with much better performance than multi-threaded crawlers. It also works as a standalone crawler with a sophisticated statistics interface for monitoring crawl progress.
 Popularity
Downloads
184,402
Stars
213
Forks
46
Watchers
7
 Releases
Current version
1.1.0
Total releases
92
First release
Latest release
 Activity
Issue Closure Rate
52%
Pull Request Acceptance Rate
83%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
4

sinew

0.06
Crawl web sites easily using Ruby recipes, with caching and Nokogiri.
 Popularity
Downloads
24,197
Stars
181
Forks
12
Watchers
4
 Releases
Current version
2.0.4
Total releases
10
First release
Latest release
 Activity
Issue Closure Rate
100%
Pull Request Acceptance Rate
66%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
0

fletcher

0.04
Easily fetch product information from third party websites such as Amazon, Steam, eBay, etc.
 Popularity
Downloads
46,788
Stars
54
Forks
16
Watchers
5
 Releases
Current version
0.6.9
Total releases
18
First release
Latest release
 Activity
Issue Closure Rate
75%
Pull Request Acceptance Rate
71%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
0

docparser

0.01
DocParser is a Ruby gem for web scraping
 Popularity
Downloads
15,780
Stars
33
Forks
1
Watchers
1
 Releases
Current version
0.2.3
Total releases
10
First release
Latest release
 Activity
Issue Closure Rate
100%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
1

horsefield

0.01
It's a scraper
 Popularity
Downloads
43,562
Stars
2
Forks
0
Watchers
1
 Releases
Current version
0.6.0
Total releases
36
First release
Latest release
 Activity
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
0

url_scraper

0.01
A simple plugin for extracting information from a URL entered by a user (similar to what Facebook does). This gem is built on top of the opengraph gem created by Michael Bleigh.
 Popularity
Downloads
4,795
Stars
9
Forks
2
Watchers
4
 Releases
Current version
0.0.5
Total releases
3
First release
Latest release
 Activity
Issue Closure Rate
0%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
0

wiki-api

0.0
MediaWiki API and Page content parser for Headlines (nested), TextBlocks, ListItems, and Links.
 Popularity
Downloads
4,778
Stars
7
Forks
0
Watchers
1
 Releases
Current version
0.1.0
Total releases
3
First release
Latest release
 Activity
Issue Closure Rate
100%
Pull Request Acceptance Rate
100%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
0

arachnid2

0.0
A simple, fast web crawler
 Popularity
Downloads
1,103
Stars
1
Forks
0
Watchers
1
 Releases
Current version
0.1.4
Total releases
5
First release
Latest release
 Activity
Issue Closure Rate
100%
Pull Request Acceptance Rate
100%
Average date of last 50 commits
within last year
Reverse Dependencies
0