No commit activity in last 3 years
No release in over 3 years
A generic web crawler that doesn't crawl outside URLs.
 Project Readme

Iron Crawler

A generic web crawler.

Features

Starting from a given URL, it follows the links found on each page and prints a list of the URLs it visited.

  • Follows href attributes in anchor (a) tags that point to the same domain
  • Ignores href attributes in anchor (a) tags that point to other domains (even subdomains)
  • Captures the src attribute of script tags and the href attribute of link tags
  • Outputs a list of visited URLs
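
The same-domain rule above can be sketched in a few lines of Ruby. This is only an illustration using the standard library, not iron-crawler's actual implementation: the names same_host?, crawl, and links_for are hypothetical, and the sketch assumes that "same domain" means an exact host match, which is why subdomains count as external.

```ruby
require "uri"
require "set"

# Hypothetical helper (not part of iron-crawler's API): true when href,
# resolved against base_url, points at exactly the same host as base_url.
def same_host?(base_url, href)
  URI.join(base_url, href).host == URI.parse(base_url).host
rescue URI::Error
  false # hrefs that cannot be parsed are treated as external
end

# Hypothetical crawl loop. links_for is any callable that returns the href
# values found on a page; a real crawler would fetch and parse HTML here.
def crawl(start_url, links_for)
  visited = Set.new
  queue = [start_url]
  until queue.empty?
    url = queue.shift
    next unless visited.add?(url) # add? returns nil if already visited
    links_for.call(url).each do |href|
      next unless same_host?(url, href)
      queue << URI.join(url, href).to_s # resolve relative links
    end
  end
  visited.to_a # URLs in the order they were first visited
end
```

Passing the link lookup in as a callable keeps the sketch free of network access, so it can be tried against an in-memory hash that maps each page URL to the hrefs it contains.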

Getting Started

It's easy to get started!

Install

gem install iron-crawler

Run

iron-crawler <url>

The above command crawls the site at the given URL and prints the URLs it visited.

TODO