A generic web crawler that doesn't crawl outside URLs.

Iron Crawler

A generic web crawler.

Features

Starting from a given URL, it crawls every link found on that page and prints a list of the URLs it visited.

  • Follows href attributes in tags that point to the same domain
  • Ignores href attributes in tags that point to other domains (including subdomains)
  • Captures the src attribute of script tags and the href attribute of link tags
  • Outputs a list of visited URLs
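
The same-domain rule above can be illustrated with a short sketch. This is not iron-crawler's actual code: `same_domain?` is a hypothetical helper written for illustration, and it assumes that "same domain" means an exact host match, which is why subdomains are rejected.

```ruby
require "uri"

# Hypothetical helper, not part of iron-crawler's API.
# Resolves `link` (which may be relative) against `base` and reports
# whether the result has exactly the same host as `base`.
def same_domain?(base, link)
  target = URI.join(base, link)
  target.host == URI.parse(base).host
rescue URI::InvalidURIError
  # Malformed links are treated as not followable.
  false
end

puts same_domain?("http://example.com/", "/about")
puts same_domain?("http://example.com/", "http://blog.example.com/")
```

Comparing full hosts keeps the check simple. A crawler that wanted to include subdomains would need a looser comparison than the one sketched here.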

Getting Started

It's easy to get started!

Install

gem install iron-crawler

Run

iron-crawler <url>

The above command crawls the site at the given URL and prints the URLs it visited.
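
For readers curious about what such a crawl involves, here is a minimal breadth-first sketch. It is an assumption about the general technique, not the gem's implementation: `crawl` and `fetch_links` are hypothetical names, and page fetching is replaced by an in-memory hash so the example runs without network access.

```ruby
require "set"
require "uri"

# Hypothetical sketch, not iron-crawler's actual code.
# `fetch_links` is an assumed callable: given a URL string, it returns
# the raw href values found on that page.
def crawl(start_url, fetch_links)
  host = URI.parse(start_url).host
  visited = Set.new
  queue = [start_url]

  until queue.empty?
    url = queue.shift
    next if visited.include?(url)
    visited << url

    fetch_links.call(url).each do |href|
      begin
        target = URI.join(url, href)
      rescue URI::InvalidURIError
        next # skip malformed links
      end
      next unless target.host == host # ignore other domains and subdomains
      target.fragment = nil           # treat /page#a and /page as one URL
      queue << target.to_s
    end
  end

  visited.to_a
end

# An in-memory "site" standing in for real HTTP requests.
site = {
  "http://example.com/" => ["/about", "http://other.com/", "http://blog.example.com/"],
  "http://example.com/about" => ["/", "/contact"],
  "http://example.com/contact" => []
}
puts crawl("http://example.com/", ->(url) { site.fetch(url, []) })
```

Passing the link fetcher in as a callable keeps the traversal logic separate from HTTP and HTML parsing, so either can be swapped out or tested on its own.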

TODO