0.0
No release in over 3 years
Low commit activity in last 3 years
webget gem - a web (go get) crawler incl. web cache
2019
2020
2021
2022
2023
2024
0.0
No release in over 3 years
Iudex is a general purpose web crawler and feed processor in ruby/java. The iudex-core gem contains core facilities and notably, does not contain such facilities as database-backed state management.
2019
2020
2021
2022
2023
2024
0.0
No release in over 3 years
Pantopoda is a web crawler that visits all links on a given domain that's fast and effective.
2019
2020
2021
2022
2023
2024
0.02
No commit activity in last 3 years
No release in over 3 years
is_crawler does exactly what you might think it does: determine if the supplied string matches a known crawler or bot.
2019
2020
2021
2022
2023
2024
0.0
No commit activity in last 3 years
No release in over 3 years
This file crawler helps to decect if there are new files in a directory.
2019
2020
2021
2022
2023
2024
0.0
No release in over 3 years
Simple async HTTP crawler based on em-synchrony
2019
2020
2021
2022
2023
2024
0.0
No release in over 3 years
Iudex is a general purpose web crawler and feed processor in ruby/java. The iudex-simhash gem contains support for generation and searching over simhash fingerprints
2019
2020
2021
2022
2023
2024
0.0
No release in over 3 years
Iudex is a general purpose web crawler and feed processor in ruby/java. The iudex-filter gem contains a fundamental filtering/chain of responsbility sub-system.
2019
2020
2021
2022
2023
2024
0.0
Repository is archived
No commit activity in last 3 years
No release in over 3 years
Using paperclip to generate images from sensible attributes like e-mails and telephone numbers, in order to reduce crawler's success
2019
2020
2021
2022
2023
2024
0.0
No release in over 3 years
Iudex is a general purpose web crawler and feed processor in ruby/java. The iudex-http gem contains and http client agnostic abstraction layer.
2019
2020
2021
2022
2023
2024
0.0
Repository is archived
No release in over a year
Retrieves a list of URLs to seed the crawler by publishing them to a RabbitMQ exchange.
2019
2020
2021
2022
2023
2024
0.0
No release in over 3 years
Iudex is a general purpose web crawler and feed processor in ruby/java. This gem is an rjack-httpclient-3 based implementation of the iudex-http interfaces.
2019
2020
2021
2022
2023
2024
0.0
No release in over 3 years
Iudex is a general purpose web crawler and feed processor in ruby/java. The iudex-rome gems is an adaption of rjack-rome for feed parsing in Iudex.
2019
2020
2021
2022
2023
2024
0.0
No release in over 3 years
This little library helps people download images from different subs much easier. It's actually like a crawler for the images posted on a subreddit. Actually, it's a great tool to have your favorite memes locally!
2019
2020
2021
2022
2023
2024
0.0
Repository is archived
No commit activity in last 3 years
No release in over 3 years
This crawler will use my personnal scraper named 'RecipeScraper' to dowload recipes data from Marmiton, 750g or cuisineaz
2019
2020
2021
2022
2023
2024
0.0
No release in over 3 years
The SimpleCrawler module is a library for crawling web sites. The crawler provides comprehensive data from the page crawled which can be used for page analysis, indexing, accessibility checks etc. Restrictions can be specified to limit crawling of binary files.
2019
2020
2021
2022
2023
2024
0.0
No commit activity in last 3 years
No release in over 3 years
A generic web crawler that doesn't crawl outside URLs.
2019
2020
2021
2022
2023
2024
0.0
No release in over 3 years
Iudex is a general purpose web crawler and feed processor in ruby/java. This gem is an rjack-async-httpclient based implementation of the iudex-http interfaces.
2019
2020
2021
2022
2023
2024