Categories
Voight-Kampff detects bots, spiders, crawlers and replicants. (score: 0.26)
CrawlerDetect is a library to detect bots/crawlers via the user agent. (score: 0.07)
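Both Voight-Kampff and CrawlerDetect work by matching the request's User-Agent string against known crawler signatures. A minimal sketch of that idea in plain Ruby; the pattern list below is a made-up sample for illustration, not either gem's actual signature database:

```ruby
# Minimal user-agent bot check: match the UA string against crawler
# signatures. These patterns are an illustrative sample only.
BOT_PATTERNS = [
  /googlebot/i,
  /bingbot/i,
  /crawler/i,
  /spider/i,
  /curl/i
].freeze

def crawler?(user_agent)
  return false if user_agent.nil? || user_agent.empty?
  BOT_PATTERNS.any? { |re| re.match?(user_agent) }
end

puts crawler?("Mozilla/5.0 (compatible; Googlebot/2.1)")    # true
puts crawler?("Mozilla/5.0 (Windows NT 10.0) Firefox/120.0") # false
```

The real gems ship much larger, regularly updated signature lists and also inspect other request headers, but the core check is this same pattern match.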
0.13
Cobweb is a web crawler that can use resque to cluster crawls to quickly crawl extremely large sites which is much more performant than multi-threaded crawlers. It is also a standalone crawler that has a sophisticated statistics monitoring interface to monitor the progress of the crawls.
2019
2020
2021
2022
2023
2024
Gem for crawling data from external sources. (score: 0.01)
Generic web crawler with a DSL that parses structured data from web pages. (score: 0.55)
0.02
is_crawler does exactly what you might think it does: determine if the supplied string matches a known crawler or bot.
2019
2020
2021
2022
2023
2024
0.03
validate-website is a web crawler for checking the markup validity with XML Schema / DTD and not found urls.
2019
2020
2021
2022
2023
2024
0.0
初级开发工程师,基于 http 写的爬虫扩展包。请不要随意下载里面有很多坑。
2019
2020
2021
2022
2023
2024
0.0
This gem helps Crawler Writers to interact with the PromoQui REST API
2019
2020
2021
2022
2023
2024
Ruby web crawler using PhantomJS. (score: 0.02)
Asynchronous web crawler, scraper and file harvester. (score: 0.08)
Cangrejo lets you consume crabfarm crawlers using a simple DSL. (score: 0.0)
Dead simple yet powerful Ruby crawler for easy parallel crawling, with support for anonymity. (score: 0.0)
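The parallel crawling that several of these gems advertise is usually a pool of worker threads draining a shared URL queue. A sketch of that structure in plain Ruby, with the fetcher injected as a block so the pattern is visible without real HTTP; `crawl_parallel` is a hypothetical helper for illustration, not any gem's API (a real crawler would pass something like `->(url) { Net::HTTP.get(URI(url)) }`):

```ruby
# Worker-pool crawl: N threads pop URLs from a shared queue until empty.
def crawl_parallel(urls, workers: 4, &fetch)
  queue = Queue.new
  urls.each { |u| queue << u }
  results = Queue.new # thread-safe collector

  threads = Array.new(workers) do
    Thread.new do
      loop do
        url = queue.pop(true) rescue break # non-blocking pop; stop when empty
        results << [url, fetch.call(url)]
      end
    end
  end
  threads.each(&:join)

  Array.new(results.size) { results.pop }.to_h
end

pages = crawl_parallel(%w[http://a.example http://b.example]) do |url|
  "body of #{url}" # stand-in for a real HTTP fetch
end
p pages.keys.sort
```

`Queue#pop(true)` raises `ThreadError` when the queue is empty, which each worker uses as its exit signal; anonymity support in the gems typically means routing that fetch step through rotating proxies.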
Simple little website crawler. (score: 0.0)
0.0
Show DMM and DMM.R18's crawled data. e.g. ranking
2019
2020
2021
2022
2023
2024
0.07
An easy to use distributed web-crawler framework based on Redis
2019
2020
2021
2022
2023
2024
0.0
Ruby web crawler to access omelete informations
2019
2020
2021
2022
2023
2024
0.03
Arachnid is a web crawler that relies on Bloom Filters to efficiently store visited urls and Typhoeus to avoid the overhead of Mechanize when crawling every page on a domain.
2019
2020
2021
2022
2023
2024
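The Bloom-filter trick Arachnid's description mentions keeps the visited-URL set in a fixed-size bit array: membership tests can give false positives (a URL skipped that was never visited) but never false negatives, which is an acceptable trade for a crawler. A hand-rolled sketch on the Ruby stdlib, shown to illustrate the technique rather than Arachnid's own implementation:

```ruby
require "digest"

# Toy Bloom filter for visited URLs. k index values are derived from
# 8-hex-char slices of a SHA-256 digest of the URL.
class BloomFilter
  def initialize(bits: 10_000, hashes: 4)
    @bits = Array.new(bits, false)
    @size = bits
    @hashes = hashes
  end

  def indexes(url)
    digest = Digest::SHA256.hexdigest(url)
    (0...@hashes).map { |i| digest[i * 8, 8].to_i(16) % @size }
  end

  def add(url)
    indexes(url).each { |i| @bits[i] = true }
  end

  # May return a false positive, never a false negative.
  def include?(url)
    indexes(url).all? { |i| @bits[i] }
  end
end

visited = BloomFilter.new
visited.add("https://example.com/page1")
puts visited.include?("https://example.com/page1") # true
puts visited.include?("https://example.com/page2") # false with overwhelming probability
```

Memory use is constant regardless of how many URLs are added, which is why the approach suits crawling every page on a large domain.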
Website crawler and full-text indexer. (score: 0.01)
Iudex is a general-purpose web crawler and feed processor in Ruby/Java. The iudex-da gem provides a PostgreSQL-based content metadata store and work priority queue. (score: 0.0)