Search results for 'crawler' - The Ruby Toolbox

2013-05-07

daimon_skycrawlers (bm-sms/daimon_skycrawlers)

0.01

Repository is archived

No commit activity in last 3 years

No release in over 3 years

This is a crawler framework.

Popularity

41,263

Releases

1.0.0

2016-01-27

2017-02-15

Activity

100%

91%

2017-02-12

spiderman (bkeepers/spiderman)

0.01

No release in over 3 years

Low commit activity in last 3 years

Your friendly neighborhood web crawler

Popularity

4,541

Releases

2.0.0

2020-03-22

2020-03-22

Activity

100%

2020-08-22

driller (shashikant86/driller)

0.01

No commit activity in last 3 years

No release in over 3 years

Driller is a command-line, Ruby-based web crawler built on Anemone. It can crawl a website, report error pages and slow pages, and generate HTML reports.

Popularity

33,873

Releases

0.1.4

2015-05-10

2015-05-18

Activity

2015-05-14

arachnid2 (samnissen/arachnid2)

Web Content Scrapers

0.01

No release in over 3 years

Low commit activity in last 3 years

A simple, fast web crawler

Popularity

29,239

Releases

0.4.0

2018-05-29

2020-07-15

Activity

100%

69%

2019-04-15

marmiton_crawler (madeindjs/marmiton_crawler)

0.01

Repository is archived

No commit activity in last 3 years

No release in over 3 years

A web crawler to get a Marmiton recipe

Popularity

4,683

Releases

1.0.3

2016-10-09

2016-11-28

Activity

75%

100%

2017-09-23

masque (uu59/masque)

0.01

No commit activity in last 3 years

No release in over 3 years

JavaScript-enabled web crawler kit

Popularity

45,322

Releases

0.4.3

2012-10-19

2014-10-19

Activity

100%

2013-05-23

semantic-crawler (obale/semantic_crawler)

0.01

No commit activity in last 3 years

No release in over 3 years

There are a lot of open issues

SemanticCrawler is a Ruby library that encapsulates data gathering from different sources. It currently supports microdata from websites; country information from Freebase, Factbook and FAO (Food and Agriculture Organization of the United Nations); crisis information from GDACS.org; and geo data from LinkedGeoData. Additionally, the GeoNames module can retrieve Factbook and FAO country information from GPS coordinates.

Popularity

38,416

Releases

0.7.1

2012-03-25

2013-04-07

Activity

64%

2012-07-30

youtube-crawler

0.0

No release in over 3 years

plugin-container totally sucks..

Popularity

Downloads

3,471

Releases

0.0.1

2014-08-01

2014-08-01

Activity

feeds-crawler (andrey17076/feed-crawler)

0.0

No commit activity in last 3 years

No release in over 3 years

This gem crawls news articles from RSS feeds.

Popularity

3,918

Releases

0.2.1

2017-09-16

2018-04-14

Activity

100%

2018-10-13

vscinemas (elct9620/vscinemas-rb)

0.0

No release in over a year

A crawler for Taiwan's VSCinema that gets the latest film list.

Popularity

3,432

Releases

0.2.1

2021-12-20

2021-12-21

Activity

2021-12-25

jobs_crawler

0.0

No release in over 3 years

Crawl the Senegalese web, looking for jobs using the excellent wombat gem

Popularity

Downloads

13,860

Releases

0.1.8

2019-04-02

2019-04-04

Activity

catflap (nyk/catflap)

0.0

No commit activity in last 3 years

No release in over 3 years

A simple solution to provide on-demand service access (e.g. port 80 on webserver), where a more robust and secure VPN solution is not available. Essentially, it is a more user-friendly form of "port knocking". The original proof-of-concept implementation was run for almost three years by Demotix, to protect development and staging servers from search engine crawlers and other unwanted traffic.

Popularity

10,603

Releases

1.0.1

2013-12-01

2016-03-14

Activity

100%

2016-01-18

arachnidish (csphere/arachnid)

0.0

No commit activity in last 3 years

No release in over 3 years

Arachnidish is a web crawler that relies on Bloom filters to efficiently store visited URLs and on Typhoeus to avoid the overhead of Mechanize when crawling every page on a domain.

Popularity

3,300

Releases

0.0.1

2014-01-16

2014-01-16

Activity

2012-05-11

content_crawler

0.0

No release in over 3 years

Crawls data from websites. The XPaths must be specified explicitly. New functionality will be added in the future.

Popularity

Downloads

3,386

Releases

0.0.1

2014-12-23

2014-12-23

Activity