Search results for 'crawler' - The Ruby Toolbox

Iudex is a general-purpose web crawler and feed processor in Ruby/Java. The iudex-brutefuzzy-protobuf gem contains the Protocol Buffers-generated Java classes for the iudex-brutefuzzy-service.

11,603

1.3.0

2012-03-05

2013-10-30

iudex-brutefuzzy-service

0.0

No release in over 3 years

Iudex is a general-purpose web crawler and feed processor in Ruby/Java. The iudex-brutefuzzy-service provides a fuzzy simhash lookup index as a distributed service.

11,864

1.3.0

2012-03-05

2013-10-30

semantic-crawler

0.01

No commit activity in last 3 years

No release in over 3 years

There's a lot of open issues

semantic-crawler obale/semantic_crawler Homepage

SemanticCrawler is a Ruby library that encapsulates data gathering from different sources. Currently microdata from websites, country information from Freebase, Factbook and FAO (Food and Agriculture Organization of the United Nations), crisis information from GDACS.org and geo data from LinkedGe...

38,321

0.7.1

2012-03-25

2013-04-07

app-reviews

0.0

No release in over 3 years

Mobile App Review Crawler

9,593

0.0.2

2012-03-29

2012-03-29

murmuring_spider

0.0

Repository is gone

No release in over 3 years

MurmuringSpider is a concise Twitter crawler. When we write a data-mining or text-mining application based on the Twitter timeline, we first have to collect and store tweets. I got irritated with writing such crawlers repeatedly, so I wrote this. All you have to do is add a query and run th...

4,182

0.0.2

2012-04-13

2012-04-13

caule

0.0

No commit activity in last 3 years

No release in over 3 years

caule rafaelss/caule Homepage

DSL to build crawlers easily

4,664

0.0.1

2012-04-14

2012-04-14

omelete

0.0

Repository is gone

No release in over 3 years

Ruby web crawler to access omelete information

50,252

2.0.7

2012-05-06

2013-01-25

krawler

0.0

Repository is gone

No release in over 3 years

Simple little website crawler.

60,507

1.0.14

2012-05-10

2013-03-19

skyscraper

0.0

Repository is gone

No release in over 3 years

Easy-to-use DSL that helps scrape data from websites. It makes writing web crawlers fast and intuitive. You can traverse HTML nodes and fetch all of their HTML attributes. Just like in jQuery, you will find methods like parent, children, first, find, si...

15,174

0.1.0

2012-05-17

2012-05-30

baidu_crawler

0.0

No commit activity in last 3 years

No release in over 3 years

baidu_crawler debbbbie/baidu_crawler Homepage

The Baidu Crawler crawls data according to your demand

5,966

0.0.1

2012-09-01

2012-09-01

attribute_imagifiable

0.0

Repository is archived

No commit activity in last 3 years

No release in over 3 years

attribute_imagifiable zealot128/attribute_imagifiable Homepage

Uses Paperclip to generate images from sensitive attributes like e-mail addresses and telephone numbers, in order to reduce crawlers' success

27,687

0.0.8

2012-10-08

2013-07-31

masque

0.01

No commit activity in last 3 years

No release in over 3 years

masque uu59/masque Homepage

JavaScript-enabled web crawler kit

45,226

0.4.3

2012-10-19

2014-10-19

the_country_identity

0.0

No commit activity in last 3 years

No release in over 3 years

the_country_identity p1nox/the_country_identity Homepage

CIA World Factbook crawler

8,315

0.0.3

2012-11-23

2014-10-19

apollo-crawler

0.01

No commit activity in last 3 years

No release in over 3 years

apollo-crawler korczis/apollo-crawler Homepage

Gem for crawling data from external sources

266,934

0.1.31

2013-02-23

2013-03-27

is_crawler

0.02

No commit activity in last 3 years

No release in over 3 years

is_crawler ccashwell/is_crawler Homepage

is_crawler does exactly what you might think it does: determine if the supplied string matches a known crawler or bot.

160,932

0.1.5

2013-02-27

2013-05-23

cosmicrawler

0.02

Repository is archived

No commit activity in last 3 years

No release in over 3 years

cosmicrawler bash0c7/cosmicrawler Homepage

Cosmicrawler is a crawler library for Ruby. It provides scalable asynchronous crawling by (http|file|etc) using EventMachine.

5,076

0.0.1

2013-03-11

2013-03-11

google_ajax_crawler

0.03

No commit activity in last 3 years

No release in over 3 years

google_ajax_crawler benkitzelman/google-ajax-crawler Homepage

Rack middleware adhering to the Google Ajax Crawling Scheme, using a headless browser to render JS-heavy pages and serve a DOM snapshot of the rendered state to a requesting search engine.

15,855

0.2.0

2013-03-16

2013-07-13
