Search results for 'crawler' - The Ruby Toolbox

Cobweb is a web crawler that can use Resque to cluster crawls, letting it crawl extremely large sites quickly; the project describes this as much more performant than multi-threaded crawlers. It can also run as a standalone crawler with a sophisticated statistics interface for monitoring crawl progress.

323,704

228

1.2.1

2010-11-10

2021-01-09

google_ajax_crawler

0.03

No commit activity in last 3 years

No release in over 3 years

google_ajax_crawler benkitzelman/google-ajax-crawler Homepage

Rack middleware adhering to the Google Ajax Crawling Scheme, using a headless browser to render JS-heavy pages and serve a DOM snapshot of the rendered state to a requesting search engine.

15,812

0.2.0

2013-03-16

2013-07-13
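The description above names the Google Ajax Crawling Scheme, in which a crawler requests a `#!` URL as a `?_escaped_fragment_=...` query instead. A minimal sketch of how a Rack middleware could detect that request follows. The class name `EscapedFragmentSnapshot` and the injected `renderer` callable are hypothetical stand-ins for illustration (the gem itself uses a headless browser); this is not google_ajax_crawler's actual API.

```ruby
require "uri"

# Hypothetical middleware: the class name and the injected renderer are
# illustrative assumptions, not google_ajax_crawler's real interface.
class EscapedFragmentSnapshot
  PARAM = "_escaped_fragment_"

  # renderer: any callable that takes a URL string and returns HTML.
  def initialize(app, renderer)
    @app = app
    @renderer = renderer
  end

  def call(env)
    params = URI.decode_www_form(env["QUERY_STRING"].to_s)
    fragment = params.assoc(PARAM)
    # Ordinary requests pass straight through to the wrapped app.
    return @app.call(env) unless fragment

    # Rebuild the "#!" URL that a browser user would have visited.
    rest = params.reject { |key, _| key == PARAM }
    url = env["PATH_INFO"].to_s
    url += "?#{URI.encode_www_form(rest)}" unless rest.empty?
    url += "#!#{fragment[1]}"

    [200, { "content-type" => "text/html" }, [@renderer.call(url)]]
  end
end
```

In a real application the renderer would drive a headless browser and return the rendered DOM; here it is left as a plain callable so the routing logic can be read on its own.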

wayback_archiver

0.03

No release in over 3 years

Low commit activity in last 3 years

wayback_archiver buren/wayback_archiver Homepage

Post URLs to the Wayback Machine (Internet Archive), using a crawler, from Sitemap(s) or a list of URLs.

46,017

1.4.0

2014-07-17

2021-04-23

rubyretriever

0.08

No release in over 3 years

Low commit activity in last 3 years

There are a lot of open issues

rubyretriever joenorton/rubyretriever Homepage

Asynchronous web crawler, scraper and file harvester

67,085

141

1.4.6

2014-05-25

2016-04-11

instagram-crawler

0.08

No release in over 3 years

Low commit activity in last 3 years

There are a lot of open issues

instagram-crawler mgleon08/instagram-crawler Homepage

Crawl Instagram photos, posts and videos for download.

7,333

197

0.3.0

2018-11-23

2019-04-14

masque

0.01

No commit activity in last 3 years

No release in over 3 years

masque uu59/masque Homepage

JavaScript-enabled web crawler kit

45,134

0.4.3

2012-10-19

2014-10-19

pagemunch

0.0

No commit activity in last 3 years

No release in over 3 years

pagemunch pagemunch/pagemunch-ruby Homepage

A client for the PageMunch web crawler API

5,371

1.0.0

2013-04-13

2016-12-30

rdig

0.01

No commit activity in last 3 years

No release in over 3 years

rdig jkraemer/rdig Homepage

Website crawler and fulltext indexer.

48,101

0.3.12

2006-03-25

2009-04-25

spiderman

0.01

No release in over 3 years

Low commit activity in last 3 years

spiderman bkeepers/spiderman Homepage

Your friendly neighborhood web crawler

4,524

2.0.0

2020-03-22

2020-03-22

render_static

0.01

No commit activity in last 3 years

No release in over 3 years

render_static herval/render_static Homepage

render_static lets you make single-page apps (Backbone, Angular, etc.) built on Rails SEO-friendly. It works by injecting a small Rack middleware that renders pages as plain HTML when the requester is one of the most common crawlers/bots (Google, Yahoo, Baidu and Bing).

4,133

0.0.0
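The render_static description mentions a Rack middleware that serves plain HTML only when the requester is a known crawler. A minimal sketch of that idea, assuming detection by the `User-Agent` header, might look like the following. The class name `BotSnapshot`, the `BOT_PATTERN` regex, and the injected `renderer` callable are all hypothetical and are not taken from render_static's code.

```ruby
# Hypothetical middleware illustrating user-agent based switching;
# none of these names come from render_static itself.
class BotSnapshot
  # Illustrative pattern only: real crawlers send many more user agents.
  BOT_PATTERN = /googlebot|yahoo|baiduspider|bingbot/i

  # renderer: any callable that takes a request path and returns HTML.
  def initialize(app, renderer)
    @app = app
    @renderer = renderer
  end

  def call(env)
    agent = env["HTTP_USER_AGENT"].to_s
    # Regular visitors get the normal single-page app response.
    return @app.call(env) unless agent.match?(BOT_PATTERN)

    html = @renderer.call(env["PATH_INFO"].to_s)
    [200, { "content-type" => "text/html" }, [html]]
  end
end
```

User-agent matching is easy to get wrong in both directions, so a production setup would likely need a maintained bot list rather than a fixed regex like this one.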