Category

Web Content Scrapers


anemone

0.99
Anemone web-spider framework
 Popularity
Downloads
548,301
Stars
1,609
Forks
312
Watchers
66
 Releases
Current version
0.7.2
Total releases
23
First release
Latest release
 Activity
Pull Request Acceptance Rate
4%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
29

metainspector

0.47
MetaInspector lets you scrape a web page and get its links, images, texts, meta tags...
 Popularity
Downloads
337,277
Stars
794
Forks
131
Watchers
24
 Releases
Current version
5.4.2
Total releases
99
First release
Latest release
 Activity
Issue Closure Rate
87%
Pull Request Acceptance Rate
71%
Average date of last 50 commits
within last 2 years
Reverse Dependencies
12

pismo

0.35
Pismo extracts and retrieves content-related metadata from HTML pages - you can use the resulting data in an organized way, such as a summary/first paragraph, body text, keywords, RSS feed URL, favicon, etc.
 Popularity
Downloads
102,273
Stars
729
Forks
91
Watchers
29
 Releases
Current version
0.7.4
Total releases
13
First release
Latest release
 Activity
Issue Closure Rate
33%
Pull Request Acceptance Rate
66%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
3

link_thumbnailer

0.27
Ruby gem generating thumbnail images from a given URL.
 Popularity
Downloads
126,406
Stars
402
Forks
92
Watchers
13
 Releases
Current version
3.3.2
Total releases
49
First release
Latest release
 Activity
Issue Closure Rate
83%
Pull Request Acceptance Rate
80%
Average date of last 50 commits
within last 2 years
Reverse Dependencies
0

data_miner

0.17
Download, pull out of a ZIP/TAR/GZ/BZ2 archive, parse, correct, and import XLS, ODS, XML, CSV, HTML, etc. into your ActiveRecord models. Uses Upsert internally for speed.
 Popularity
Downloads
290,332
Stars
289
Forks
20
Watchers
13
 Releases
Current version
3.0.0
Total releases
115
First release
Latest release
 Activity
Issue Closure Rate
63%
Pull Request Acceptance Rate
100%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
2

cobweb

0.17
Cobweb is a web crawler that can use Resque to cluster crawls, so extremely large sites can be crawled quickly; its description claims this is much more performant than multi-threaded crawlers. It can also run as a standalone crawler, and it has a sophisticated statistics interface for monitoring crawl progress.
 Popularity
Downloads
179,337
Stars
217
Forks
49
Watchers
7
 Releases
Current version
1.1.0
Total releases
92
First release
Latest release
 Activity
Issue Closure Rate
52%
Pull Request Acceptance Rate
83%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
4

sinew

0.07
Crawl web sites easily using Ruby recipes, with caching and Nokogiri.
 Popularity
Downloads
21,026
Stars
181
Forks
13
Watchers
4
 Releases
Current version
1.0.4
Total releases
5
First release
Latest release
 Activity
Issue Closure Rate
62%
Pull Request Acceptance Rate
50%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
0

fletcher

0.05
Easily fetch product information from third party websites such as Amazon, Steam, eBay, etc.
 Popularity
Downloads
44,017
Stars
54
Forks
16
Watchers
5
 Releases
Current version
0.6.9
Total releases
18
First release
Latest release
 Activity
Issue Closure Rate
75%
Pull Request Acceptance Rate
71%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
0

docparser

0.01
DocParser is a Ruby gem for web scraping.
 Popularity
Downloads
15,181
Stars
32
Forks
1
Watchers
2
 Releases
Current version
0.2.3
Total releases
10
First release
Latest release
 Activity
Issue Closure Rate
100%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
1

horsefield

0.01
It's a scraper
 Popularity
Downloads
41,238
Stars
2
Forks
0
Watchers
1
 Releases
Current version
0.4.67
Total releases
32
First release
Latest release
 Activity
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
0

url_scraper

0.01
A simple plugin for extracting information from a URL entered by a user (similar to what Facebook does). This gem is built on top of the opengraph gem created by Michael Bleigh.
 Popularity
Downloads
4,600
Stars
9
Forks
2
Watchers
4
 Releases
Current version
0.0.5
Total releases
3
First release
Latest release
 Activity
Issue Closure Rate
0%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
0

wiki-api

0.0
MediaWiki API and Page content parser for Headlines (nested), TextBlocks, ListItems, and Links.
 Popularity
Downloads
4,605
Stars
7
Forks
0
Watchers
1
 Releases
Current version
0.1.0
Total releases
3
First release
Latest release
 Activity
Issue Closure Rate
100%
Pull Request Acceptance Rate
100%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
0