PURL doc => Solr hash logic

gdor-indexer

Code to harvest DOR druids via the DOR Fetcher service, fetch MODS from PURL, and use both to index items into a Solr index, such as the one for SearchWorks.

Prerequisites

  1. Ruby 2.x or later
  2. bundler gem must be installed

Install steps for running locally

Add this line to your application's Gemfile:

gem 'gdor-indexer'

Then execute:

bundle

Configuration

Create a collections folder in the config directory:

cd /path/to/gdor-indexer/config
mkdir collections

Create a YAML config file for each collection to be harvested and indexed.

See spec/config/walters_integration_spec.yml for an example. Copy that file to config/collections and change the following settings:

  • whitelist
  • dor_fetcher service_url
  • harvestdor log_dir and log_name
  • solr_url
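
As a quick sanity check before running the indexer, the settings listed above can be verified with a few lines of Ruby. This is only a sketch: the nesting of keys in the example YAML is an assumption based on the setting names, every value is a placeholder, and `missing_settings` is a hypothetical helper that is not part of gdor-indexer. Compare against spec/config/walters_integration_spec.yml for the real layout.

```ruby
require 'yaml'

# Hypothetical collection config. The key nesting is an assumption and
# all values are placeholders, not real service URLs or druids.
EXAMPLE_CONFIG = <<~YAML
  whitelist:
    - druid:aa111bb2222
  dor_fetcher:
    service_url: http://dor-fetcher.example.org
  harvestdor:
    log_dir: logs
    log_name: my_collection.log
  solr_url: http://localhost:8983/solr/my_core
YAML

# Key paths for the settings the README says to change.
REQUIRED_SETTINGS = [
  %w[whitelist],
  %w[dor_fetcher service_url],
  %w[harvestdor log_dir],
  %w[harvestdor log_name],
  %w[solr_url]
].freeze

# Returns the dotted names of required settings that are missing or empty.
def missing_settings(config)
  missing = REQUIRED_SETTINGS.select do |path|
    # Walk down the nested hash one key at a time; stop at nil if a level is absent.
    value = path.reduce(config) { |node, key| node.is_a?(Hash) ? node[key] : nil }
    value.nil? || (value.respond_to?(:empty?) && value.empty?)
  end
  missing.map { |path| path.join('.') }
end

config = YAML.safe_load(EXAMPLE_CONFIG)
puts missing_settings(config).inspect
```

An empty array means every listed setting has a value; any names printed are the settings still to be filled in.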

whitelist

The whitelist is how you specify which objects to index. The whitelist can be:

  • an Array of druids inline in the config yml file
  • a filename containing a list of druids (one per line)

Each druid is classified by the object's identityMetadata on its PURL page. If the druid is for a:

  • collection record: then we process all the item druids in that collection (as if they were included individually in the whitelist)
  • non-collection record: then we process the druid as an individual item
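
The whitelist rules above can be sketched in Ruby as follows. This is an illustration under stated assumptions, not the gem's actual code: `load_whitelist` and `druids_to_index` are hypothetical names, and `collection_members` is a plain Hash standing in for the identityMetadata check and the DOR Fetcher lookup of a collection's items.

```ruby
# Loads the whitelist: an inline Array of druids is used as-is; anything
# else is treated as the name of a file listing one druid per line.
def load_whitelist(whitelist)
  return whitelist.map(&:strip) if whitelist.is_a?(Array)

  File.readlines(whitelist.to_s).map(&:strip).reject(&:empty?)
end

# Expands whitelisted druids into the item druids to process.
# collection_members is a hypothetical lookup (collection druid => item druids);
# a druid that is not a key in it is treated as an individual item.
def druids_to_index(whitelist, collection_members)
  whitelist.flat_map do |druid|
    collection_members.key?(druid) ? collection_members[druid] : [druid]
  end
end

# Placeholder druids for illustration only.
members = { 'druid:cc111dd2222' => ['druid:ee111ff2222', 'druid:gg111hh2222'] }
whitelist = load_whitelist(['druid:cc111dd2222', 'druid:aa111bb2222'])
puts druids_to_index(whitelist, members).inspect
```

Here the collection druid is replaced by its two item druids, while the non-collection druid passes through unchanged.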

Run the indexer script

$ cd /path/to/gdor-indexer
$ nohup ./bin/indexer -c my_collection &>path/to/nohup.output

Running the tests

rake

Contributing