GeoBlacklight Sidecar Images
Store local copies of remote imagery in GeoBlacklight.
- Requirements
- Installation
- Rake Tasks
- View Customization
- Development
 
Description
This GeoBlacklight plugin captures remote images from geographic web services and saves them locally. It borrows the SolrDocumentSidecar concept from Spotlight: each SolrDocument, which is not an ActiveRecord (AR) object, gets a matching ActiveRecord-based "sidecar". This allows us to use Active Storage to attach images to our Solr documents.
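As a rough illustration of the sidecar idea, the plain-Ruby sketch below pairs each document id with exactly one companion record that can hold an image. `SidecarSketch` and `SidecarRegistry` are hypothetical names used only here; in the plugin the sidecar is an ActiveRecord model and the image is an Active Storage attachment.

```ruby
# Hypothetical stand-in for the ActiveRecord-based sidecar: one record per
# Solr document id, with a slot for the harvested image.
SidecarSketch = Struct.new(:document_id, :image) do
  def image_attached?
    !image.nil?
  end
end

# Hypothetical in-memory registry; a real app would use a database table.
class SidecarRegistry
  def initialize
    @sidecars = {}
  end

  # Returns the existing sidecar for the id, or creates one on first access.
  def find_or_create(document_id)
    @sidecars[document_id] ||= SidecarSketch.new(document_id, nil)
  end
end

registry = SidecarRegistry.new
sidecar = registry.find_or_create("stanford-cz128vq0535")
sidecar.image = "thumbnail bytes would go here"
```

The point of the pattern is that the search index stays the source of truth for metadata, while anything that needs a database row (such as an attachment) hangs off the companion record.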
Example Screenshot
Requirements
Suggested
- Background Job Processor
 
Sidekiq is an excellent choice if you need an opinion.
Installation
Existing GeoBlacklight Instance
GeoBlacklight v4 with Aardvark metadata / Add the gem to your Gemfile.
gem "geoblacklight_sidecar_images", "~> 1.0"
GeoBlacklight v3 with GBL v1.0 metadata / Add the gem to your Gemfile.
gem "geoblacklight_sidecar_images", "~> 0.9.1", "< 1.0"
Run the generator.
$ bin/rails generate geoblacklight_sidecar_images:install
Run the database migration.
$ bin/rails db:migrate
Complete any necessary Active Storage setup steps, for example:
- Add a config/storage.yml file
 
local:
  service: Disk
  root: <%= Rails.root.join("storage") %>
- Add config/environments declarations, development.rb for example:
 
# Store uploaded files on the local file system (see config/storage.yml for options)
config.active_storage.service = :local
New GeoBlacklight Instance
Create a new GeoBlacklight instance with the GBLSI code
$ rails new app-name -m https://raw.githubusercontent.com/geoblacklight/geoblacklight_sidecar_images/develop/template.rb
Ingest Test Documents
# Run your GBL instance
bundle exec rake geoblacklight:server
# Index the GBL test fixtures
bundle exec rake gblsci:sample_data:seed
Rake tasks
Harvest images
Harvest all images
Spawns background jobs to harvest images for all documents in your Solr index.
bundle exec rake gblsci:images:harvest_all
Harvest an individual image
Allows you to add images one document id at a time. Pass a DOC_ID env var.
DOC_ID='stanford-cz128vq0535' bundle exec rake gblsci:images:harvest_doc_id
Harvest all incomplete states
Reattempt image harvesting for all non-successful state objects.
bundle exec rake gblsci:images:harvest_retry
Check image states
bundle exec rake gblsci:images:harvest_states
We use a state machine library to track success/failure of our harvest tasks. The states we track are:
- initialized - SolrDocumentSidecar created, no harvest attempt run
- queued - Harvest attempt queued as background job
- processing - Harvest attempt at work
- succeeded - Harvest was successful, image attached
- failed - Harvest failed, no image attached, error logged
- placeheld - Harvest was not successful, placeholder imagery will be used
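
To make the lifecycle concrete, here is a minimal plain-Ruby sketch of a state tracker that uses the six state names listed above. The transition table is an assumption made for illustration, and `ImageStateSketch` is a hypothetical class; neither is taken from the gem's actual state machine configuration.

```ruby
# Illustrative only: state names come from this README, but the allowed
# transitions below are assumed, not copied from the gem.
class ImageStateSketch
  STATES = %i[initialized queued processing succeeded failed placeheld].freeze

  # Assumed transition table (hypothetical).
  TRANSITIONS = {
    initialized: %i[queued],
    queued: %i[processing],
    processing: %i[succeeded failed placeheld],
    succeeded: %i[queued],
    failed: %i[queued],
    placeheld: %i[queued]
  }.freeze

  attr_reader :current_state

  def initialize
    @current_state = :initialized
    @history = []
  end

  def can_transition_to?(state)
    TRANSITIONS.fetch(@current_state, []).include?(state)
  end

  # Records the transition with its metadata, then moves to the new state.
  def transition_to!(state, metadata = {})
    unless can_transition_to?(state)
      raise ArgumentError, "cannot go from #{@current_state} to #{state}"
    end

    @history << { to_state: state, metadata: metadata }
    @current_state = state
  end

  def last_transition
    @history.last
  end
end
```

Keeping metadata on each transition, as the console output in this section also shows, is what makes a failed or placeheld harvest debuggable after the fact.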
 
SolrDocumentSidecar.in_state(:succeeded) # => [#<SolrDocumentSidecar:0x0000000170697960 ... ]
# The calls below are made on a single sidecar record, here named sidecar
sidecar.image.attached? # => false
sidecar.image_state.current_state # => "placeheld"
sidecar.image_state.last_transition # => #<SidecarImageTransition id: 207, to_state: "placeheld", metadata: {"solr_doc_id"=>"stanford-cg357zz0321", "solr_version"=>1616509329754554368, "placeheld"=>true, "viewer_protocol"=>"wms", "image_url"=>"http://geowebservices-restricted.stanford.edu/geoserver/wms/reflect?&FORMAT=image%2Fpng&TRANSPARENT=TRUE&LAYERS=druid:cg357zz0321&WIDTH=300&HEIGHT=300", "service_url"=>"http://geowebservices-restricted.stanford.edu/geoserver/wms/reflect?&FORMAT=image%2Fpng&TRANSPARENT=TRUE&LAYERS=druid:cg357zz0321&WIDTH=300&HEIGHT=300", "gblsi_thumbnail_uri"=>false, "error"=>"Faraday::Error::ConnectionFailed"},...>
Destroy images
Remove everything
Remove all sidecar objects and attached images
bundle exec rake gblsci:images:harvest_purge_all
Remove orphaned AR objects
Remove all sidecar objects and attached images for AR objects without a corresponding Solr document
bundle exec rake gblsci:images:harvest_purge_orphans
Remove a batch
Remove sidecar objects and attached images via a CSV file of document ids
bundle exec rake gblsci:images:harvest_destroy_batch
Troubleshooting
Harvest report
Generate a CSV file of sidecar objects and associated image state. Useful for debugging problem items.
bundle exec rake gblsci:images:harvest_report
Failed state inspect
Prints details for failed state harvest objects to stdout
bundle exec rake gblsci:images:harvest_failed_state_inspect
Prioritize Solr Thumbnail Field URIs
If you add a thumbnail URI to your GeoBlacklight Solr documents...
Example Doc
{
  ...
  "dc_format_s":"TIFF",
  "dc_creator_sm":["Minnesota. Department of Highways."],
  "thumbnail_path_ss":"https://umedia.lib.umn.edu/sites/default/files/imagecache/square300/reference/562/image/jpeg/1089695.jpg",
  "dc_type_s":"Still image",
  ...
}
Then you can edit your GeoBlacklight settings.yml file to point at that Solr field (Settings.GBLSI_THUMBNAIL_FIELD). Any documents in your index that have a value for that field will harvest the image at that URI instead of trying to retrieve an image via IIIF or the other web services.
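The precedence described above can be sketched in plain Ruby as follows. `harvest_url_for` is a hypothetical helper, not a method from the gem: `thumbnail_field` stands in for the value of Settings.GBLSI_THUMBNAIL_FIELD, and `service_url` stands in for whatever URL the IIIF or web service lookup would otherwise produce.

```ruby
# Hypothetical helper: pick the URL to harvest for a Solr document hash.
# A non-blank value in the configured thumbnail field wins; otherwise the
# caller-supplied service URL (possibly nil) is used.
def harvest_url_for(doc, thumbnail_field:, service_url: nil)
  thumbnail = doc[thumbnail_field].to_s.strip
  return thumbnail unless thumbnail.empty?

  service_url
end

doc = {
  "dc_format_s" => "TIFF",
  "thumbnail_path_ss" => "https://umedia.lib.umn.edu/sites/default/files/imagecache/square300/reference/562/image/jpeg/1089695.jpg"
}

chosen = harvest_url_for(
  doc,
  thumbnail_field: "thumbnail_path_ss",
  service_url: "https://example.org/wms" # placeholder URL for illustration
)
```

Here `chosen` is the thumbnail URI from the document, because the field is present and non-blank.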
View customization
Use basic Active Storage patterns to display imagery in your application.
Example Methods
# Is there an image?
document.sidecar.image.attached?
# Can the image size be manipulated?
document.sidecar.image.variable?
# Example image_tag with resize
<%= image_tag document.sidecar.image.variant(resize_to_fit: [100, 100]), {class: 'media-object'} %>
Search results
This GBL plugin includes a custom catalog/_index_split_default.html.erb file. Look there for examples of calling the image method.
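When writing your own partials, the `attached?` and `variable?` checks from the example methods can be folded into one decision. The sketch below is a hypothetical plain-Ruby illustration: `FakeImage` is a stand-in that models only those two predicates, and the `:variant`, `:original`, and `:placeholder` symbols are invented labels for what a view might choose to render.

```ruby
# Hypothetical stand-in for an attachment; models only the two predicates
# used in this README so the logic can run outside Rails.
FakeImage = Struct.new(:attached, :variable) do
  def attached?
    attached
  end

  def variable?
    variable
  end
end

# Hypothetical decision helper: which representation should a view render?
def thumbnail_mode(image)
  return :placeholder unless image.attached?

  image.variable? ? :variant : :original
end
```

In a real view, the `:variant` branch would correspond to calling `variant(resize_to_fit: [...])` as in the example methods, and the other branches to whatever fallback markup your application prefers.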
Show pages
Example for adding a thumbnail to the show page sidebar.
catalog/_show_sidebar.html.erb
# Add to end of file
<% if @document.sidecar.image.attached? %>
  <% if @document.sidecar.image.variable? %>
    <div class="card">
      <div class="card-header">Thumbnail</div>
      <div class="card-body">
        <%= image_tag @document.sidecar.image.variant(resize_to_fit: [200, 200]), {class: 'mr-3'} %>
      </div>
    </div>
  <% end %>
<% end %>
Development
# Run test suite
bundle exec rake ci
# Launch test app server
cd .internal_test_app/
bundle exec rake geoblacklight:server
# Load test fixtures
bundle exec rake gblsci:sample_data:seed
# Run harvest
bundle exec rake gblsci:images:harvest_all
# Tail image service log file
tail -f log/image_service_development.log
