Rubadana
Rubadana is an elementary Ruby data-analysis package. It works with plain old Ruby objects, not SQL or databases or anything fancy like that.
The aim is to create a summary overview from a list of objects by running a group-by/map/reduce operation on that list. The input, grouping, mapping, reducing, and display of the result can each be varied independently.
Installation
Add this line to your application's Gemfile:
gem 'rubadana'
And then execute:
$ bundle
Or install it yourself as:
$ gem install rubadana
Usage
Here's a trivial example, returning the number of requests per user:
program = Rubadana::Registry.build groupings: [:user], mappers: [:self], reducers: [:count]
analysis = program.run Request.all
In this example, :user, :self, and :count are plugins you will have provided to rubadana for extracting and manipulating your data.
program.run returns a list of Rubadana::Analysis instances with, for this example, the following attributes:
|key | a Hash instance with the key :user |
|list | the subset of Request.all with the corresponding value for :user |
|mapped | in this case, the same as list (assuming the :self mapper returns the thing itself) |
|reduced| a list of one integer, equal to the size of list |
In ordinary Ruby, you would write Request.all.group_by(&:user).map {|user, requests| [user, requests.count] } to get the same information.
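To make the four attributes concrete, here is a minimal plain-Ruby sketch that builds the same information by hand, without rubadana. The Request struct and its sample data are hypothetical stand-ins, and the hashes only mimic the attributes described in the table; this is not how Rubadana::Analysis is actually implemented.

```ruby
# Hypothetical stand-in for the Request model; only #user matters here.
Request = Struct.new(:user, :path)

requests = [
  Request.new("alice", "/home"),
  Request.new("bob",   "/home"),
  Request.new("alice", "/about")
]

# One hash per group, mirroring the key/list/mapped/reduced attributes.
rows = requests.group_by(&:user).map do |user, list|
  mapped = list # a :self-style mapper leaves each item unchanged
  {
    key:     { user: user },
    list:    list,
    mapped:  mapped,
    reduced: [mapped.count] # a single :count-style reducer gives one value
  }
end

rows.each { |row| puts "#{row[:key][:user]}: #{row[:reduced].first}" }
```

Each row carries one reduced value because only one map/reduce pair was requested.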
Here's a richer example which returns the sum of debits, credits, and account balances from a set of accounting transactions:
program = Rubadana::Analyser.new group: [:month, :account_number], map: [:debits, :credits, :balance], reduce: [:sum, :sum, :sum]
analysis = program.run AccountingTransaction.all
In this example, :month, :account_number, :debits and so on are plugins you will have provided to rubadana for extracting and manipulating your data.
program.run returns a list of Rubadana::Analysis instances, with the following attributes:
|key | a Hash instance with keys :month and :account_number |
|list | the subset of AccountingTransaction.all having the corresponding values for :month and :account_number |
|mapped | the output of the map operations on list. This is a list of n-tuples, where n is the number of operations specified by the map parameter |
|reduced| the output of the reduce operations on mapped. This is a list of n values, one for each operation specified by the reduce parameter. |
In this example, reduced gives us the sum of all debits, the sum of all credits, and the sum of all balances, for each combination of month and account number.
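The same shape can be sketched in plain Ruby to show how mapped (a list of 3-tuples) relates to reduced (three sums). The Txn struct and its figures are hypothetical sample data, and the hashes only imitate the attributes in the table above; rubadana's own internals may differ.

```ruby
# Hypothetical stand-in for AccountingTransaction.
Txn = Struct.new(:month, :account_number, :debits, :credits, :balance)

transactions = [
  Txn.new("2024-01", "A1", 100,  0, 100),
  Txn.new("2024-01", "A1",   0, 40,  60),
  Txn.new("2024-01", "B2",  25,  0,  25),
  Txn.new("2024-02", "A1",  10,  0,  70)
]

grouped = transactions.group_by { |t| [t.month, t.account_number] }

summary = grouped.map do |(month, account_number), list|
  # One 3-tuple per transaction: [debits, credits, balance]
  mapped = list.map { |t| [t.debits, t.credits, t.balance] }
  # Transpose the tuples into three columns and sum each column
  reduced = mapped.transpose.map(&:sum)
  {
    key:     { month: month, account_number: account_number },
    list:    list,
    mapped:  mapped,
    reduced: reduced
  }
end

summary.each { |row| puts "#{row[:key]} => #{row[:reduced].inspect}" }
```

Transposing works here because all three reducers are sums; with different reducers per column, each column would be passed to its own reducer instead.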
See spec for some examples.
Steps
- Create a Registry for your mappers and your reducers
my_registry = Rubadana::Registry.new
- Create and register some mappers
class SaleYear
  def name ; :yearly ; end
  def run thing ; thing.date.year ; end
  def label value ; value ; end
end
my_registry.register_mapper SaleYear.new
- Create and register some reducers:
class Sum
  def name ; :sum ; end
  def reduce things ; things.reduce :+ ; end
end
my_registry.register_reducer Sum.new
- Build an analysis program and run it:
# this is a program to analyse invoices by year and product, giving the
# number of sales, the sum of sales and the average sale in each case
my_program = my_registry.build group: %i{ yearly }, map: %i{ self sale_amount sale_amount }, reduce: %i{ count sum average }
data = my_program.run(invoices)
#run returns an array of Rubadana::Analysis as described above.
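The program above also names :self, :sale_amount, :count and :average, which the steps do not define. A minimal sketch of what those plugins might look like follows, using the same method shapes as SaleYear and Sum. The class names, the Invoice struct, and its amount attribute are assumptions made for illustration, and the classes are exercised directly rather than through rubadana.

```ruby
# Hypothetical invoice record; the real model may use other attribute names.
Invoice = Struct.new(:date, :amount)

# Mapper returning the item itself (assumed behaviour of :self).
class SelfMapper
  def name ; :self ; end
  def run thing ; thing ; end
  def label value ; value ; end
end

# Mapper extracting the sale amount (assumes Invoice#amount).
class SaleAmount
  def name ; :sale_amount ; end
  def run thing ; thing.amount ; end
  def label value ; value ; end
end

# Reducer counting the mapped items.
class Count
  def name ; :count ; end
  def reduce things ; things.count ; end
end

# Reducer averaging numeric values; returns nil for an empty list.
class Average
  def name ; :average ; end
  def reduce things
    return nil if things.empty?
    things.sum.to_f / things.size
  end
end

invoices = [Invoice.new(Time.new(2023, 1, 5), 10), Invoice.new(Time.new(2023, 6, 1), 30)]
amounts  = invoices.map { |invoice| SaleAmount.new.run(invoice) }

puts Count.new.reduce(invoices.map { |invoice| SelfMapper.new.run(invoice) })
puts Average.new.reduce(amounts)
```

Instances of these classes would presumably be registered with register_mapper and register_reducer in the same way as SaleYear and Sum.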
Contributing
- Fork it ( https://github.com/conanite/rubadana/fork )
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create a new Pull Request