Search results for 'crawler' - The Ruby Toolbox

With just a few lines of code, developers can effortlessly integrate this gem into their projects, enabling seamless retrieval of page titles from HTML documents. Whether you're building web scrapers, crawlers, or any application that requires fetching webpage titles, WebTitle streamlines the pro...

Downloads: 278; Latest version: 1.0.0; First release: 2023-12-13; Latest release: 2023-12-13

rails-hush

0.0

The project is in a healthy, maintained state

rails-hush zarqman/rails-hush Homepage

Hushes worthless Rails exceptions & logs, such as those caused by bots and crawlers.

Downloads: 6,288; Latest version: 1.1.2; First release: 2019-11-03; Latest release: 2023-10-05

kudzu

0.0

Low commit activity in last 3 years

A long-lived project that still receives updates

kudzu kanety/kudzu Homepage

A simple web crawler for Ruby

Downloads: 25,814; Latest version: 1.3.1; First release: 2017-12-20; Latest release: 2023-06-23

govuk_seed_crawler

0.0

Repository is archived

No release in over a year

govuk_seed_crawler alphagov/govuk_seed_crawler Homepage

Retrieves a list of URLs to seed the crawler by publishing them to a RabbitMQ exchange.

Downloads: 10,999; Latest version: 3.2.1; First release: 2015-08-28; Latest release: 2023-03-22

voight_kampff

0.26

Low commit activity in last 3 years

No release in over a year

voight_kampff biola/voight-kampff Homepage

Voight-Kampff detects bots, spiders, crawlers and replicants

Downloads: 6,442,350; Stars: 177; Latest version: 2.0.0; First release: 2011-05-11; Latest release: 2023-03-12

validate-website

0.03

Low commit activity in last 3 years

No release in over a year

validate-website spk/validate-website Homepage

validate-website is a web crawler that checks markup validity against XML Schema / DTD and reports not-found URLs.

Downloads: 125,279; Latest version: 1.12.0; First release: 2009-10-24; Latest release: 2022-11-15

coolCrawler

0.0

No release in over a year

coolCrawler willwright1213/coolcrawler Homepage

Simple Web Crawler

Downloads: 3,707; Latest version: 0.4.4; First release: 2022-09-29; Latest release: 2022-11-01

wombat

0.55

Web Content Scrapers

Low commit activity in last 3 years

There are a lot of open issues

No release in over a year

wombat felipecsl/wombat Homepage

Generic Web crawler with a DSL that parses structured data from web pages

Downloads: 204,807; Stars: 1,303; Latest version: 3.0.0; First release: 2011-12-27; Latest release: 2022-08-23

zy_crawler

0.0

No release in over a year

zy_crawler uuensky/zycrawler Homepage

A simple demo crawler

Downloads: 1,174; Latest version: 0.0.1; First release: 2022-03-08; Latest release: 2022-03-08

vscinemas

0.0

No release in over a year

vscinemas elct9620/vscinemas-rb Homepage

A crawler for Taiwan's VSCinema that fetches the latest film list.

Downloads: 3,389; Latest version: 0.2.1; First release: 2021-12-20; Latest release: 2021-12-21

crawler_guru

0.0

Repository is gone

No release in over a year

crawler_guru Homepage

Crawler Guru provides all the basic functionality needed to extract data from web pages

Downloads: 1,832; Latest version: 0.1.0; First release: 2021-09-03; Latest release: 2021-09-03

wayback_archiver

0.03

No release in over 3 years

Low commit activity in last 3 years

wayback_archiver buren/wayback_archiver Homepage

Post URLs to Wayback Machine (Internet Archive), using a crawler, from Sitemap(s) or a list of URLs.

Downloads: 46,017; Latest version: 1.4.0; First release: 2014-07-17; Latest release: 2021-04-23

webget

0.0

No release in over 3 years

Low commit activity in last 3 years

webget rubycoco/webclient Homepage

webget gem - a web (go get) crawler incl. web cache

Downloads: 12,887; Latest version: 0.2.5; First release: 2020-10-04; Latest release: 2021-02-21

grell

0.02

No commit activity in last 3 years

No release in over 3 years

grell mdsol/grell Homepage

Ruby web crawler using PhantomJS

Downloads: 86,182; Latest version: 2.1.2; First release: 2015-05-07; Latest release: 2021-02-17

cobweb

0.13

Web Content Scrapers

No release in over 3 years

Low commit activity in last 3 years

There are a lot of open issues

cobweb stewartmckee/cobweb Homepage

Cobweb is a web crawler that can use Resque to cluster crawls and quickly crawl extremely large sites, which is much more performant than multi-threaded crawlers. It also works as a standalone crawler with a sophisticated statistics interface for monitoring crawl progress.

Downloads: 323,704; Stars: 228; Latest version: 1.2.1; First release: 2010-11-10; Latest release: 2021-01-09

rack-pesticide

0.0

No commit activity in last 3 years

No release in over 3 years

rack-pesticide mdippery/rack-pesticide Homepage

Block crawlers who spam your site with fake HTTP referers

Downloads: 3,727; Latest version: 1.0.5; First release: 2016-09-08; Latest release: 2020-11-21

reddit_junkie

0.0

No release in over 3 years

reddit_junkie Homepage

This little library makes it much easier to download images from different subreddits. It works like a crawler for the images posted on a subreddit, and it's a great tool for keeping your favorite memes locally!

Downloads: 9,063; Latest version: 0.0.7; First release: 2020-09-25; Latest release: 2020-10-09
