Category

Web Content Scrapers

anemone

0.88
Anemone web-spider framework
 Popularity
Downloads
576,736
Stars
1,613
Forks
312
Watchers
66
 Releases
Current version
0.7.2
Total releases
23
First release
Latest release
 Activity
Pull Request Acceptance Rate
4%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
29

metainspector

0.44
MetaInspector lets you scrape a web page and get its links, images, texts, meta tags...
 Popularity
Downloads
385,965
Stars
812
Forks
135
Watchers
21
 Releases
Current version
5.5.0
Total releases
101
First release
Latest release
 Activity
Issue Closure Rate
85%
Pull Request Acceptance Rate
70%
Average date of last 50 commits
within last 2 years
Reverse Dependencies
12

wombat

0.43
Generic web crawler with a DSL that parses structured data from web pages
 Popularity
Downloads
95,607
Stars
1,101
Forks
113
Watchers
52
 Releases
Current version
2.7.0
Total releases
30
First release
Latest release
 Activity
Issue Closure Rate
60%
Pull Request Acceptance Rate
79%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
3

pismo

0.31
Pismo extracts content-related metadata from HTML pages and returns it in an organized form: a summary or first paragraph, body text, keywords, RSS feed URL, favicon, etc.
 Popularity
Downloads
106,266
Stars
735
Forks
90
Watchers
28
 Releases
Current version
0.7.4
Total releases
13
First release
Latest release
 Activity
Issue Closure Rate
33%
Pull Request Acceptance Rate
66%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
3

link_thumbnailer

0.25
Ruby gem generating thumbnail images from a given URL.
 Popularity
Downloads
140,432
Stars
417
Forks
96
Watchers
13
 Releases
Current version
3.3.2
Total releases
49
First release
Latest release
 Activity
Issue Closure Rate
83%
Pull Request Acceptance Rate
78%
Average date of last 50 commits
within last 2 years
Reverse Dependencies
0

data_miner

0.15
Download, pull out of a ZIP/TAR/GZ/BZ2 archive, parse, correct, and import XLS, ODS, XML, CSV, HTML, etc. into your ActiveRecord models. Uses Upsert internally for speed.
 Popularity
Downloads
295,623
Stars
288
Forks
20
Watchers
13
 Releases
Current version
3.0.0
Total releases
115
First release
Latest release
 Activity
Issue Closure Rate
63%
Pull Request Acceptance Rate
100%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
2

cobweb

0.15
Cobweb is a web crawler that can use Resque to cluster crawls, which lets it crawl extremely large sites quickly and with much better performance than multi-threaded crawlers. It also works as a standalone crawler, with a sophisticated statistics interface for monitoring crawl progress.
 Popularity
Downloads
182,678
Stars
217
Forks
49
Watchers
7
 Releases
Current version
1.1.0
Total releases
92
First release
Latest release
 Activity
Issue Closure Rate
52%
Pull Request Acceptance Rate
83%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
4

sinew

0.06
Crawl websites easily using Ruby recipes, with caching and Nokogiri.
 Popularity
Downloads
23,517
Stars
181
Forks
12
Watchers
4
 Releases
Current version
2.0.4
Total releases
10
First release
Latest release
 Activity
Issue Closure Rate
100%
Pull Request Acceptance Rate
66%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
0

fletcher

0.04
Easily fetch product information from third party websites such as Amazon, Steam, eBay, etc.
 Popularity
Downloads
45,962
Stars
54
Forks
16
Watchers
5
 Releases
Current version
0.6.9
Total releases
18
First release
Latest release
 Activity
Issue Closure Rate
75%
Pull Request Acceptance Rate
71%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
0

docparser

0.01
DocParser is a Ruby gem for web scraping
 Popularity
Downloads
15,553
Stars
33
Forks
1
Watchers
1
 Releases
Current version
0.2.3
Total releases
10
First release
Latest release
 Activity
Issue Closure Rate
100%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
1

url_scraper

0.01
A simple plugin for extracting information from a URL entered by a user (similar to what Facebook does). This gem is built on top of the opengraph gem created by Michael Bleigh.
 Popularity
Downloads
4,726
Stars
9
Forks
2
Watchers
4
 Releases
Current version
0.0.5
Total releases
3
First release
Latest release
 Activity
Issue Closure Rate
0%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
0

horsefield

0.01
It's a scraper
 Popularity
Downloads
42,843
Stars
2
Forks
0
Watchers
1
 Releases
Current version
0.6.0
Total releases
36
First release
Latest release
 Activity
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
0

arachnid2

0.0
A simple, fast web crawler
 Popularity
Downloads
768
Stars
1
Forks
0
Watchers
1
 Releases
Current version
0.1.3
Total releases
4
First release
Latest release
 Activity
Issue Closure Rate
71%
Pull Request Acceptance Rate
100%
Average date of last 50 commits
within last 3 months
Reverse Dependencies
0

wiki-api

0.0
MediaWiki API and Page content parser for Headlines (nested), TextBlocks, ListItems, and Links.
 Popularity
Downloads
4,715
Stars
7
Forks
0
Watchers
1
 Releases
Current version
0.1.0
Total releases
3
First release
Latest release
 Activity
Issue Closure Rate
100%
Pull Request Acceptance Rate
100%
Average date of last 50 commits
more than 2 years ago
Reverse Dependencies
0