Search results for 'crawler' - The Ruby Toolbox

Projects

0.0

No commit activity in last 3 years

No release in over 3 years

medusa-crawler brutuscat/medusa-crawler Homepage

Medusa: a ruby crawler framework. Medusa is a framework for the ruby language to crawl and collect usefu...

5,267

1.0.0

2020-08-06

2020-08-17

cobweb

0.08

Web Content Scrapers

No commit activity in last 3 years

No release in over 3 years

There's a lot of open issues

cobweb stewartmckee/cobweb Homepage

Cobweb is a web crawler that can use resque to cluster crawls, letting it crawl extremely large sites quickly and with much better performance than multi-threaded crawlers. It also works as a standalone crawler, with a sophisticated statistics interface for monitoring crawl progress.

351,568

224

1.2.1

2010-11-10

2021-01-09

crawler_detect

0.08

User Agent Detection

Low commit activity in last 3 years

A long-lived project that still receives updates

crawler_detect loadkpi/crawler_detect Homepage

CrawlerDetect is a library to detect bots/crawlers via the user agent

2,305,694

144

1.2.11

1980-01-02

2025-11-06

crawler-engine

0.0

No release in over 3 years

crawler-engine Homepage

Crawler Engine provides a function to crawl all news from a customized website

5,353

0.1.0

2011-11-22

2011-11-22

senthor_rails_legacy

0.0

No release in over 3 years

senthor_rails_legacy Homepage

Protect your content from AI crawlers and monetize every request with Senthor. Real-time detection, crawler control, and detailed analytics.

787

1.1.0

1980-01-02

1980-01-02

embulk-filter-crawler

0.0

No commit activity in last 3 years

No release in over 3 years

embulk-filter-crawler toyama0919/embulk-filter-crawler Homepage

Crawler4J filter plugin for Embulk

11,383

0.1.3

2016-03-25

2016-04-06

senthor_rails

0.0

No release in over 3 years

senthor_rails Homepage

Protect your content from AI crawlers and monetize every request with Senthor. Real-time detection, crawler control, and detailed analytics.

587

1.1.0

1980-01-02

1980-01-02

bank-crawlers-hapoalim

0.0

No commit activity in last 3 years

No release in over 3 years

bank-crawlers-hapoalim joaomilho/bank-crawlers-hapoalim Homepage

A crappy crawler for a crappy bank interface

6,895

0.0.7

2015-04-03

2015-04-03

wombat

0.38

Web Content Scrapers

No release in over 3 years

Low commit activity in last 3 years

There's a lot of open issues

wombat felipecsl/wombat Homepage

Generic Web crawler with a DSL that parses structured data from web pages

235,684

1,355

3.2.0

1980-01-02

2022-08-23

voight_kampff

0.22

No release in over 3 years

Low commit activity in last 3 years

voight_kampff biola/voight-kampff Homepage

Voight-Kampff detects bots, spiders, crawlers and replicants

9,449,723

191

2.0.0

2011-05-11

2023-03-12

is_crawler

0.01

No commit activity in last 3 years

No release in over 3 years

is_crawler ccashwell/is_crawler Homepage

is_crawler does exactly what you might think it does: determine if the supplied string matches a known crawler or bot.

176,222

0.1.5

2013-02-27

2013-05-23

zy_crawler

0.0

No commit activity in last 3 years

No release in over 3 years

zy_crawler uuensky/zycrawler Homepage

A simple demo crawler

1,591

0.0.1

2022-03-08

2022-03-08

creepy-crawler

0.0

No commit activity in last 3 years

No release in over 3 years

creepy-crawler udryan10/creepy-crawler Homepage

Web crawler that generates a sitemap in a neo4j database. It also stores broken_links and the total number of pages on the site

7,936

1.0.2

2014-05-10

2014-05-10

cve_crawler

0.0

Repository is archived

No commit activity in last 3 years

No release in over 3 years

cve_crawler zarthus/ruby-cve-crawler Homepage

A periodic crawler that fetches the latest CVE additions, parses them, and filters them

11,058

0.3.0

2015-09-27

2015-09-27

iron-crawler

0.0

No commit activity in last 3 years

No release in over 3 years

iron-crawler noqcks/iron-crawler Homepage

A generic web crawler that doesn't crawl outside URLs.

17,589

1.2.1

2016-02-07

2016-02-08

gildia_comics_crawler

0.0

No release in over 3 years

gildia_comics_crawler

Crawler for downloading comics from komiks.gildia.pl

9,957

0.0.3

2013-12-21

2013-12-22

ruby-crawler

0.0

No commit activity in last 3 years

No release in over 3 years

ruby-crawler abams/crawler

Simple ruby web crawler

3,745

0.0.1

2014-05-20

2014-05-20

resay_crawler

0.0

No release in over 3 years

resay_crawler Homepage

A simple web crawler gem

3,453

0.0.1

2015-05-23

2015-05-23

murmuring_spider

0.0

Repository is gone

No release in over 3 years

murmuring_spider Homepage

MurmuringSpider is a concise Twitter crawler. When we write a data-mining or text-mining application based on the Twitter timeline, we have to collect and store tweets first. I was tired of writing such a crawler repeatedly, so I wrote this. What you have to do is only to add query and to run th...

4,633

0.0.2