Categories

0.02
No commit activity in last 3 years
No release in over 3 years
Ruby web crawler using PhantomJS
0.0
Repository is archived
No release in over a year
Retrieves a list of URLs to seed the crawler by publishing them to a RabbitMQ exchange.
0.55
Low commit activity in last 3 years
There are a lot of open issues
No release in over a year
Generic web crawler with a DSL that parses structured data from web pages
0.01
Repository is archived
No commit activity in last 3 years
No release in over 3 years
This is a crawler framework.
0.0
No commit activity in last 3 years
No release in over 3 years
Bulbasaur is a helper for crawler operations used in Pread.ly
0.0
No commit activity in last 3 years
No release in over 3 years
This gem helps crawler writers interact with the PromoQui REST API
0.26
Low commit activity in last 3 years
No release in over a year
Voight-Kampff detects bots, spiders, crawlers and replicants
0.07
No commit activity in last 3 years
No release in over 3 years
There are a lot of open issues
An easy-to-use distributed web-crawler framework based on Redis
0.13
No release in over 3 years
Low commit activity in last 3 years
There are a lot of open issues
Cobweb is a web crawler that can use Resque to cluster crawls, letting it crawl extremely large sites quickly and, per its description, much more performantly than multi-threaded crawlers. It also works as a standalone crawler and has a sophisticated statistics interface for monitoring the progress of crawls.
0.03
No commit activity in last 3 years
No release in over 3 years
Rack middleware adhering to the Google Ajax Crawling Scheme. It uses a headless browser to render JS-heavy pages and serves a DOM snapshot of the rendered state to the requesting search engine.
0.03
No release in over 3 years
Low commit activity in last 3 years
Post URLs to Wayback Machine (Internet Archive), using a crawler, from Sitemap(s) or a list of URLs.
0.08
No release in over 3 years
Low commit activity in last 3 years
There are a lot of open issues
Asynchronous web crawler, scraper and file harvester
0.08
No release in over 3 years
Low commit activity in last 3 years
There are a lot of open issues
Crawl Instagram photos, posts and videos for download.
0.01
No commit activity in last 3 years
No release in over 3 years
JavaScript-enabled web crawler kit
0.0
No commit activity in last 3 years
No release in over 3 years
A client for the PageMunch web crawler API
0.01
No commit activity in last 3 years
No release in over 3 years
Website crawler and fulltext indexer.
0.01
No release in over 3 years
Low commit activity in last 3 years
Your friendly neighborhood web crawler
0.01
No commit activity in last 3 years
No release in over 3 years
render_static lets you make single-page apps (Backbone, Angular, etc.) built on Rails SEO-friendly. It works by injecting a small Rack middleware that renders pages as plain HTML when the requester is one of the most common crawlers or bots (Google, Yahoo, Baidu and Bing).
0.02
Repository is archived
No commit activity in last 3 years
No release in over 3 years
Cosmicrawler is a crawler library for Ruby. It provides scalable asynchronous crawling (over http, file, etc.) using EventMachine.
0.0
No commit activity in last 3 years
No release in over 3 years
There are a lot of open issues
Simple site crawler using Capybara