Repository is archived
No release in over a year
Retrieves a list of URLs to seed the crawler by publishing them to a RabbitMQ exchange.
 Dependencies

Development

>= 0
~> 3.0
~> 3.18
>= 0

Runtime

>= 1.3, < 3.0
>= 1.6, < 1.15
>= 1.4.6, < 5.1.0
>= 4.0, < 4.11
= 0.4.5
>= 0.3, < 0.6
 Project Readme

GOV.UK: Seed the Crawler

This gem retrieves a list of seed URLs from the GOV.UK sitemap and adds them to RabbitMQ so that the crawler can consume them.
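The flow described above can be sketched roughly as follows. This is a minimal illustration, not the gem's actual implementation: the method names `extract_seed_urls` and `publish_seed_urls`, the `FakeExchange` class, and the `"seed"` routing key are all hypothetical. It assumes the sitemap lists pages in `<loc>` elements, and that the exchange object responds to `publish(payload, routing_key:)`, as a `Bunny::Exchange` does.

```ruby
# Hypothetical sketch: these names are illustrative, not the gem's real API.

# Collect the unique URLs found in the <loc> elements of a sitemap XML string.
def extract_seed_urls(sitemap_xml)
  sitemap_xml.scan(%r{<loc>\s*(.*?)\s*</loc>}m).flatten.uniq
end

# Publish each URL as a message body and return how many were sent.
# `exchange` is any object responding to publish(payload, routing_key:),
# for example a Bunny::Exchange connected to RabbitMQ.
def publish_seed_urls(urls, exchange, routing_key:)
  urls.each { |url| exchange.publish(url, routing_key: routing_key) }
  urls.size
end

# In-memory stand-in for an exchange, so the sketch runs without a broker.
class FakeExchange
  attr_reader :messages

  def initialize
    @messages = []
  end

  def publish(payload, routing_key:)
    @messages << [routing_key, payload]
  end
end

sitemap = <<~XML
  <urlset>
    <url><loc>https://www.gov.uk/</loc></url>
    <url><loc>https://www.gov.uk/browse</loc></url>
    <url><loc>https://www.gov.uk/</loc></url>
  </urlset>
XML

exchange = FakeExchange.new
publish_seed_urls(extract_seed_urls(sitemap), exchange, routing_key: "seed")
```

Passing the exchange in as an argument keeps the URL extraction independent of the broker connection, so a real RabbitMQ exchange can be swapped in without changing the rest of the sketch.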

Installation

Add this line to your application's Gemfile:

gem 'govuk_seed_crawler'

And then execute:

$ bundle

Or install it yourself as:

$ gem install govuk_seed_crawler

Usage

To run with the RabbitMQ connection defaults:

bundle exec seed-crawler https://www.gov.uk/

Run with --help to see a list of options:

bundle exec seed-crawler --help

Deployment

The gem is automatically deployed to RubyGems when the gem version is updated on main. (Don't forget to update the CHANGELOG!)

For the new gem version to be used on GOV.UK, you'll need to update the reference in govuk-puppet.

Contributing

  1. Fork it ( http://github.com/{my-github-username}/govuk_seed_crawler/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request

Licence

MIT License