0.55
Generic web crawler with a DSL that parses structured data from web pages
0.26
Voight-Kampff detects bots, spiders, crawlers and replicants
0.13
Cobweb is a web crawler that can use Resque to cluster crawls, which lets it crawl extremely large sites quickly and is much more performant than multi-threaded crawlers. It is also a standalone crawler with a sophisticated statistics interface for monitoring the progress of crawls.
0.08
Asynchronous web crawler, scraper and file harvester
0.08
Crawl Instagram photos, posts and videos for download.
0.07
An easy-to-use distributed web-crawler framework based on Redis
0.07
An easy-to-use distributed web-crawler framework based on Redis
0.07
CrawlerDetect is a library to detect bots/crawlers via the user agent
0.03
Rack middleware adhering to the Google Ajax Crawling Scheme, using a headless browser to render JS-heavy pages and serve a DOM snapshot of the rendered state to a requesting search engine.
0.03
Post URLs to Wayback Machine (Internet Archive), using a crawler, from Sitemap(s) or a list of URLs.
0.03
validate-website is a web crawler for checking markup validity against XML Schema / DTD and for finding not-found URLs.
0.03
Arachnid is a web crawler that relies on Bloom filters to efficiently store visited URLs and on Typhoeus to avoid the overhead of Mechanize when crawling every page on a domain.
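As a rough illustration of the Bloom-filter idea mentioned above, here is a minimal sketch in Ruby. It is not Arachnid's implementation; the class name `VisitedUrls`, the bit-array size, and the salted SHA-256 hashing are all assumptions chosen for brevity.

```ruby
require "digest"

# Minimal Bloom filter for remembering visited URLs (illustrative only;
# the class name and default parameters are hypothetical).
class VisitedUrls
  def initialize(bits: 1 << 16, hashes: 3)
    @bits = bits                      # size of the bit field
    @hashes = hashes                  # number of hash positions per URL
    @field = Array.new(bits, false)
  end

  # Mark a URL as visited by setting each of its bit positions.
  def add(url)
    positions(url).each { |i| @field[i] = true }
  end

  # May report a false positive, but never a false negative.
  def include?(url)
    positions(url).all? { |i| @field[i] }
  end

  private

  # Derive the bit positions by salting one digest with the hash index.
  def positions(url)
    (0...@hashes).map do |k|
      Digest::SHA256.hexdigest("#{k}:#{url}").to_i(16) % @bits
    end
  end
end
```

The trade-off is memory for certainty: the filter never stores the URLs themselves, so a crawler may occasionally skip a page it has not actually seen.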
0.02
is_crawler does exactly what you might think it does: determine if the supplied string matches a known crawler or bot.
0.02
Ruby web crawler using PhantomJS
0.02
Cosmicrawler is a crawler library for Ruby. It provides scalable asynchronous crawling by (http|file|etc) using EventMachine.
0.02
RegexpCrawler is a Ruby library for crawling data from websites using regular expressions.
0.01
This is a crawler framework.
0.01
JavaScript enabled web crawler kit
0.01
Website crawler and fulltext indexer.
0.01
render_static allows you to make your single-page apps (Backbone, Angular, etc.) built on Rails SEO-friendly. It works by injecting a small Rack middleware that renders pages as plain HTML when the requester is one of the most common crawlers/bots out there (Google, Yahoo, Baidu and Bing).
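Several of the gems listed above detect crawlers from the user agent and, in the middleware case, serve them a pre-rendered page. A minimal sketch of that pattern might look like the following. It is not the code of any listed gem: the pattern list, the `crawler?` helper, the `StaticForCrawlers` class, and the `renderer` callable are all hypothetical, and real libraries ship much larger, maintained lists.

```ruby
# Hypothetical short pattern list covering Google, Bing, Baidu and Yahoo.
CRAWLER_PATTERN = /googlebot|bingbot|baiduspider|yahoo! slurp/i

# Returns true when the user agent string matches a known crawler pattern.
def crawler?(user_agent)
  return false if user_agent.nil?
  CRAWLER_PATTERN.match?(user_agent)
end

# Rack-style middleware: crawlers get static HTML, everyone else gets the app.
class StaticForCrawlers
  def initialize(app, renderer)
    @app = app
    @renderer = renderer # assumed callable: env -> HTML string
  end

  def call(env)
    return @app.call(env) unless crawler?(env["HTTP_USER_AGENT"])

    [200, { "content-type" => "text/html" }, [@renderer.call(env)]]
  end
end
```

User-agent matching is easy to spoof in either direction, so it suits SEO rendering better than access control.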