Search results for 'crawler' - The Ruby Toolbox

SemanticCrawler is a Ruby library that encapsulates data gathering from different sources. It currently supports microdata from websites, country information from Freebase, Factbook and FAO (Food and Agriculture Organization of the United Nations), crisis information from GDACS.org and geo data from LinkedGe...

38,416

0.7.1

2012-03-25

2013-04-07

crawler

0.01

No commit activity in last 3 years

No release in over 3 years

crawler tylercunnion/crawler Homepage

BFS webcrawler that implements Observable

13,876

0.2.1

2010-01-25

2010-01-25

ruby-cheerio

0.0

No commit activity in last 3 years

No release in over 3 years

ruby-cheerio dineshsprabu/ruby-cheerio Homepage

Ruby Cheerio is a jQuery-style HTML parser that takes selectors as input. It is a Ruby version of the NodeJS package 'Cheerio', which is extensively used by crawlers. Please visit the home page for usage details.

35,734

0.0.5

2016-08-09

2016-08-09

govuk_seed_crawler

0.0

Repository is archived

No release in over a year

govuk_seed_crawler alphagov/govuk_seed_crawler Homepage

Retrieves a list of URLs to seed the crawler by publishing them to a RabbitMQ exchange.

11,082

3.2.1

2015-08-28

2023-03-22

polipus-elasticsearch

0.0

No commit activity in last 3 years

No release in over 3 years

polipus-elasticsearch stefanofontanelli/polipus-elasticsearch Homepage

Adds support for ElasticSearch to the Polipus crawler

11,761

0.0.4

2015-07-17

2015-09-14

masque

0.01

No commit activity in last 3 years

No release in over 3 years

masque uu59/masque Homepage

JavaScript-enabled web crawler kit

45,322

0.4.3

2012-10-19

2014-10-19

arachnid2

0.01

Web Content Scrapers

No release in over 3 years

Low commit activity in last 3 years

arachnid2 samnissen/arachnid2 Homepage

A simple, fast web crawler

29,239

0.4.0

2018-05-29

2020-07-15

rails_angular_seo

0.0

No commit activity in last 3 years

No release in over 3 years

rails_angular_seo arunn/rails_angular_seo Homepage

rails_angular_seo makes single-page apps (Backbone, Angular, etc.) built on Rails SEO-friendly. It works by injecting a small Rack middleware that renders pages as plain HTML when the requester is one of the most common crawlers/bots (Google, Yahoo, Baidu and Bing).

8,086

0.0.8

2014-10-01

2014-10-01

fua

0.0

No commit activity in last 3 years

No release in over 3 years

fua behdadahmadi/fua Homepage

Fake User-Agents covering about 80% of real devices, for use in the headers of web crawlers. It keeps your script away from being nested by many UA strings.

2,964

1.0.3

2016-12-04

2016-12-04

newrank

0.0

No commit activity in last 3 years

No release in over 3 years

newrank liqites/newrank Homepage

A Crawler for NewRank

7,625

0.3.3

2016-10-25

2016-10-25

rack_staging

0.0

No commit activity in last 3 years

No release in over 3 years

rack_staging glenngillen/rack_staging Homepage

Automatically protects your staging app from web crawlers and casual visitors.

20,519

0.2.0

2011-08-14

2011-10-08

preadly-bulbasaur

0.0

No commit activity in last 3 years

No release in over 3 years

preadly-bulbasaur preadly/bulbasaur Homepage

Bulbasaur is a helper for crawler operations used in Pread.ly

34,152

0.9.0

2015-07-13

2015-12-23

wriggle

0.01

No commit activity in last 3 years

No release in over 3 years

wriggle tsigo/wriggle Homepage

A simple directory crawler DSL.

14,059

1.3.0

2010-10-09

2011-03-09

kudzu

0.0

Low commit activity in last 3 years

A long-lived project that still receives updates

kudzu kanety/kudzu Homepage

A simple web crawler for Ruby

26,027

1.3.1

2017-12-20

2023-06-23

botch

0.0

No commit activity in last 3 years

No release in over 3 years

botch namusyaka/botch Homepage

Botch is a DSL for quickly creating web crawlers. Inspired by Sinatra.

20,630

0.1.5

2013-07-14

2013-08-08

browser_crawler

0.0

No commit activity in last 3 years

No release in over 3 years

There are a lot of open issues

browser_crawler dimasamodurov/browser_crawler Homepage

Simple site crawler using Capybara

4,174

0.4.1

2019-08-23

2020-05-13

cangrejo

0.0

No commit activity in last 3 years

No release in over 3 years

cangrejo platanus/cangrejo-gem Homepage

Cangrejo lets you consume crabfarm crawlers using a simple DSL

64,265

0.2.5

2015-01-02

2016-03-30

recipe_crawler

0.0

Repository is archived

No commit activity in last 3 years

No release in over 3 years

recipe_crawler madeindjs/recipe_crawler Homepage

This crawler uses my personal scraper, 'RecipeScraper', to download recipe data from Marmiton, 750g or cuisineaz

12,379

4.0.0

2016-11-29

2018-12-08
