0.0
Repository is archived
No commit activity in last 3 years
No release in over 3 years
There are a lot of open issues
A web crawler using Ruby and Redis.
0.0
No commit activity in last 3 years
No release in over 3 years
Website crawler harvesting e-mails. Uses Sidekiq and Typhoeus.
0.0
No release in over 3 years
Another web crawler running with Amazon SQS and ElastiCache (Redis).
0.0
No commit activity in last 3 years
No release in over 3 years
Rack middleware that executes JavaScript before serving pages to crawlers.
0.0
No release in over 3 years
Iudex is a general-purpose web crawler and feed processor in Ruby/Java. The iudex-barc gem contains support for the BARC Basic ARChive format.
0.0
No release in over 3 years
Iudex is a general-purpose web crawler and feed processor in Ruby/Java. The iudex-simhash gem contains support for generating and searching over simhash fingerprints.
0.0
No release in over 3 years
Iudex is a general-purpose web crawler and feed processor in Ruby/Java. The iudex-html gem contains filters for HTML parsing, filtering, and extracting text and links.
0.0
Repository is archived
No commit activity in last 3 years
No release in over 3 years
Simple Twitter crawler
0.0
No release in over 3 years
Iudex is a general-purpose web crawler and feed processor in Ruby/Java. The iudex-core gem contains core facilities and, notably, does not contain such facilities as database-backed state management.
0.0
Repository is archived
No commit activity in last 3 years
No release in over 3 years
Dead simple yet powerful Ruby crawler for easy parallel crawling with support for anonymity.
0.0
No release in over 3 years
Checks how many links are available within a website.
0.0
No release in over 3 years
livedoor-feeddiscover performs feed autodiscovery using the livedoor Feed Discover API. The livedoor Feed Discover API finds Atom/RSS feeds in the livedoor Reader crawler database, so livedoor-feeddiscover does not access the target URL.
0.0
Repository is gone
No release in over 3 years
A set of classes for dealing with options. It includes a crawler for Yahoo! Finance.
0.0
No release in over 3 years
Multithreaded web crawler with a transparent DSL and request caching.
0.0
No commit activity in last 3 years
No release in over 3 years
A client for the PageMunch web crawler API
0.0
No release in over 3 years
Pantopoda is a fast and effective web crawler that visits all links on a given domain.
0.0
No commit activity in last 3 years
No release in over 3 years
Bulbasaur is a helper for crawler operations used in Pread.ly