Project

glean

0.04
No commit activity in last 3 years
No release in over 3 years
A description of your project
 Project Readme

Glean was a fun experiment, but is no longer maintained.

Glean - A data management tool for humans

Glean is experimental; expect breaking changes until v1.0.0.

About

Glean targets human-curated datasets, with the goal of easy collaboration.

Data is stored in TOML, a human-readable data format. You can think of it as Markdown for data. Each dataset is stored in a git repository, which makes it easy to track revisions, propose changes, and collaborate on datasets.

Each file represents one piece of data (a hash of hashes). Filenames and directory structure are not significant to the data, but are useful for organization and human collaboration via Pull Requests.
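
To make the "one file, one record" idea concrete, here is a minimal sketch of how a single data file might map to a Ruby hash. The file contents are hypothetical, modeled on the fields in the countries export shown under Examples; the actual file layout used by Glean datasets is an assumption. The parse_flat_toml helper is illustrative only and is not part of Glean or a real TOML parser: it handles nothing but flat key = "string" lines.

```ruby
# Hypothetical contents of one dataset file (one record per file).
# The keys mirror the `countries` export; the real layout is an assumption.
ANDORRA_TOML = <<~TOML
  name = "Andorra"
  code = "ad"
TOML

# Toy parser for flat `key = "string"` lines only.
# This is NOT a TOML implementation; a real project would use a TOML library.
def parse_flat_toml(text)
  text.each_line.with_object({}) do |line, record|
    match = line.match(/\A\s*([A-Za-z0-9_-]+)\s*=\s*"(.*)"\s*\z/)
    record[match[1]] = match[2] if match
  end
end

record = parse_flat_toml(ANDORRA_TOML)
puts record.inspect
```

Because filenames and directories carry no meaning, everything a consumer needs has to live inside the file itself, which is why the record above includes its own name and code.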

Goals

  • Easily pull commonly used datasets into projects
  • Curate data using Pull Requests
  • Preserve attribution for contributors

Sources

Glean datasets are available from three distinct sources:

  1. Core
     • Available via search
     • Hosted on the Glean GitHub organization
  2. Contrib
     • Available via search using --contrib
     • Hosted on GitHub and cataloged by Glean Contrib
  3. User
     • TODO
     • Directly available via URL

Installation

$ gem install glean

Requirements:

  • Git

Command line

$ glean help
NAME
    glean - A data management tool for humans

SYNOPSIS
    glean [global options] command [command options] [arguments...]

VERSION
    0.0.13

GLOBAL OPTIONS
    --help    - Show this message
    --version - 

COMMANDS
    export - Export a dataset
    get    - Download a dataset by name
    help   - Shows a list of commands or help for one command
    info   - Show dataset information
    search - Search for datasets

Examples

Core:

$ glean export countries --format=json
{"name":"Andorra","code":"ad"}
{"name":"United Arab Emirates","code":"ae"}
{"name":"Afghanistan","code":"af"}
...
$ glean export us-states --format=yaml
--- !ruby/hash:Hashie::Mash
name: Alaska
abbreviation: ak
--- !ruby/hash:Hashie::Mash
name: Alabama
abbreviation: al
--- !ruby/hash:Hashie::Mash
name: Arkansas
abbreviation: ar
...

Contrib:

$ glean export lagalaxy/trophies --format=json
{"competition":"MLS Supporters' Shield","year":1998}
{"competition":"CONCACAF Champions' League","year":2000}
{"competition":"Lamar Hunt U.S. Open Cup","year":2001}
...
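
The JSON exports above appear to print one object per line, so they can be consumed line by line from any Ruby program, with or without a framework. The following is a sketch under that assumption, using only the standard library. The sample data is copied from the countries output above; parse_export is a hypothetical helper, not part of Glean.

```ruby
require 'json'

# Sample lines copied from the `glean export countries --format=json` output.
SAMPLE_EXPORT = <<~JSONL
  {"name":"Andorra","code":"ad"}
  {"name":"United Arab Emirates","code":"ae"}
  {"name":"Afghanistan","code":"af"}
JSONL

# Hypothetical helper (not part of Glean): parse one JSON object per line,
# skipping blank lines, and return an array of hashes.
# Accepts anything that responds to #each_line, such as a String or an IO.
def parse_export(source)
  source.each_line.map(&:strip).reject(&:empty?).map { |line| JSON.parse(line) }
end

countries = parse_export(SAMPLE_EXPORT)
countries.each { |country| puts "#{country['code']}: #{country['name']}" }
```

With the gem installed, the same helper could be fed the live output of glean export countries --format=json, for example through IO.popen, instead of the sample string.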

Rails

Gemfile:

gem 'glean'

db/seeds.rb:

if Country.count == 0
  countries = Glean::Dataset.new('glean/countries')
  countries.each do |country|
    Country.create :name => country.name, :code => country.code
  end
end

$ rake db:seed

Other Frameworks

I'm not sure how you'd do it, but I want to make it easy. Open an issue, or better yet drop some code in a Pull Request.