0.0
No commit activity in last 3 years
No release in over 3 years
Medusa: a Ruby crawler framework. Medusa is a framework for the Ruby language to crawl and collect usefu...
0.1
No release in over 3 years
Low commit activity in last 3 years
There are a lot of open issues
Cobweb is a web crawler that can use Resque to cluster crawls, so it can quickly crawl extremely large sites; its authors describe this as much more performant than multi-threaded crawlers. It is also a standalone crawler with a statistics monitoring interface for tracking the progress of crawls.
0.07
Low commit activity in last 3 years
A long-lived project that still receives updates
CrawlerDetect is a library to detect bots/crawlers via the user agent
0.0
No release in over 3 years
Crawler Engine crawls all news from a customized website
0.42
Low commit activity in last 3 years
There are a lot of open issues
No release in over a year
Generic Web crawler with a DSL that parses structured data from web pages
0.23
Low commit activity in last 3 years
No release in over a year
Voight-Kampff detects bots, spiders, crawlers and replicants
0.01
No commit activity in last 3 years
No release in over 3 years
is_crawler does exactly what you might think it does: determine if the supplied string matches a known crawler or bot.
0.0
No commit activity in last 3 years
No release in over 3 years
Stupid crawler that looks for URLs on a given site. The result is saved as two CSV files: one with found URLs and another with failed URLs.
0.0
No commit activity in last 3 years
No release in over 3 years
Web crawler that helps you parse and collect data from the web
0.0
Repository is archived
No release in over a year
Retrieves a list of URLs to seed the crawler by publishing them to a RabbitMQ exchange.
0.0
No commit activity in last 3 years
No release in over 3 years
A demo of a web crawler using arb-crawler
0.0
Repository is gone
No release in over 3 years
MurmuringSpider is a concise Twitter crawler. When we write a data-mining or text-mining application based on the Twitter timeline, we have to collect and store tweets first. I was irritated with writing such a crawler repeatedly, so I wrote this. What you have to do is only to add query and to run th...