0.0
Repository is gone
No release in over 3 years
Generic Web crawler with a DSL that parses event-related data from web pages
0.0
No commit activity in last 3 years
No release in over 3 years
Web crawler with JSON-based DSL and EventMachine-powered page fetching
0.0
No commit activity in last 3 years
No release in over 3 years
There are a lot of open issues
Simple site crawler using Capybara
0.0
No commit activity in last 3 years
No release in over 3 years
A demo of a web crawler using arb-crawler
0.0
Repository is archived
No commit activity in last 3 years
No release in over 3 years
A periodic crawler that fetches the latest CVE additions, parses them, and filters them
0.0
No release in over 3 years
Low commit activity in last 3 years
An easy way to let the AdSense crawler log in and see private or custom pages in your Rails application. Basically one custom login filter. The gem lets you easily and slightly increase revenue from Google AdSense/AdWords. It makes it easy to enable crawling on private pages and so get better target...
0.0
No commit activity in last 3 years
No release in over 3 years
The Baidu Crawler crawls data according to your needs
0.0
No commit activity in last 3 years
No release in over 3 years
A generic web crawler that doesn't crawl outside URLs.
0.05
No release in over 3 years
Low commit activity in last 3 years
There are a lot of open issues
Asynchronous web crawler, scraper and file harvester
0.05
No release in over 3 years
Low commit activity in last 3 years
There are a lot of open issues
Crawl Instagram photos, posts, and videos for download.
0.05
No commit activity in last 3 years
No release in over 3 years
There are a lot of open issues
An easy-to-use distributed web crawler framework based on Redis
0.02
No commit activity in last 3 years
No release in over 3 years
Arachnid is a web crawler that relies on Bloom filters to efficiently store visited URLs and on Typhoeus to avoid the overhead of Mechanize when crawling every page on a domain.
0.02
No release in over 3 years
Low commit activity in last 3 years
validate-website is a web crawler that checks markup validity against XML Schema / DTD and reports not-found URLs.
0.02
No commit activity in last 3 years
No release in over 3 years
Rack middleware adhering to the Google Ajax Crawling Scheme, using a headless browser to render JS-heavy pages and serve a DOM snapshot of the rendered state to a requesting search engine.
0.02
Low commit activity in last 3 years
No release in over a year
Post URLs to the Wayback Machine (Internet Archive) using a crawler, from sitemap(s) or a list of URLs.
0.02
No commit activity in last 3 years
No release in over 3 years
Ruby web crawler using PhantomJS