Categories

Category results are hidden when using a custom project result order
0.0
No release in over 3 years
Rails Analyzer Tools contains Bench, a simple web page benchmarker, Crawler, a tool for beating up on web sites, RailsStat, a tool for monitoring Rails web sites, and IOTail, a tail(1) method for Ruby IOs.
2019
2020
2021
2022
2023
2024
0.01
No commit activity in last 3 years
No release in over 3 years
Website crawler and fulltext indexer.
2019
2020
2021
2022
2023
2024
0.0
No release in over 3 years
livedoor-feeddiscover performs feed autodiscovery using the livedoor Feed Discover API. livedoor Feed Discover API find a Atom/RSS feed(s) from the livedoor Reader crawler database. So, livedoor-feeddiscover do not access the target URL.
2019
2020
2021
2022
2023
2024
0.0
No release in over 3 years
The SimpleCrawler module is a library for crawling web sites. The crawler provides comprehensive data from the page crawled which can be used for page analysis, indexing, accessibility checks etc. Restrictions can be specified to limit crawling of binary files.
2019
2020
2021
2022
2023
2024
0.0
No release in over 3 years
RegexpCrawler is a Ruby library for crawl data from website using regular expression.
2019
2020
2021
2022
2023
2024
0.03
Low commit activity in last 3 years
No release in over a year
validate-website is a web crawler for checking the markup validity with XML Schema / DTD and not found urls.
2019
2020
2021
2022
2023
2024
0.01
No commit activity in last 3 years
No release in over 3 years
BFS webcrawler that implements Observable
2019
2020
2021
2022
2023
2024
0.0
No commit activity in last 3 years
No release in over 3 years
Discovery Mission is an easy-to-use website crawler. Use it for generating sitemaps.
2019
2020
2021
2022
2023
2024
0.01
No commit activity in last 3 years
No release in over 3 years
A simple directory crawler DSL.
2019
2020
2021
2022
2023
2024
0.13
No release in over 3 years
Low commit activity in last 3 years
There's a lot of open issues
Cobweb is a web crawler that can use resque to cluster crawls to quickly crawl extremely large sites which is much more performant than multi-threaded crawlers. It is also a standalone crawler that has a sophisticated statistics monitoring interface to monitor the progress of the crawls.
2019
2020
2021
2022
2023
2024
0.0
Repository is gone
No release in over 3 years
A set of classes for dealing with options. It includes a crawler for Yahoo!Finance.
2019
2020
2021
2022
2023
2024
0.0
No release in over 3 years
Iudex is a general purpose web crawler and feed processor in ruby/java. The iudex-filter gem contains a fundamental filtering/chain of responsbility sub-system.
2019
2020
2021
2022
2023
2024
0.0
No release in over 3 years
Iudex is a general purpose web crawler and feed processor in ruby/java. This gem is an rjack-httpclient-3 based implementation of the iudex-http interfaces.
2019
2020
2021
2022
2023
2024
0.0
No release in over 3 years
Iudex is a general purpose web crawler and feed processor in ruby/java. The iudex-barc gem contains support for the BARC Basic ARChive format.
2019
2020
2021
2022
2023
2024