A composable statistics library for Ruby. Histogram, Percentile, StandardDeviation, IQR.

statistics.rb

A statistics library for Ruby.

Installation

gem install statistics.rb

Or add to your Gemfile:

gem 'statistics.rb'

Usage

Histogram

Create a histogram from an array of numeric values:

require 'statistics.rb'

values = [1.2, 3.4, 5.6, 7.8, 2.3, 4.5, 6.7, 8.9, 3.2, 5.4]

h = Statistics::Histogram.new(values)
puts h

Output:

    1.20...3.63  |   4 | ****************************************
    3.63...6.07  |   3 | ******************************
    6.07...8.50  |   2 | ********************
    8.50...10.93 |   1 | **********

Automatic bin width

By default, the bin width is calculated using the square root rule: data_range / sqrt(n), where data_range is max - min and n is the number of values.
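As a worked example of the square root rule, here is a minimal plain-Ruby sketch. The helper name square_root_bin_width is hypothetical and not part of the gem; it only reproduces the arithmetic described above.

```ruby
# Square root rule: bin width = (max - min) / sqrt(n).
# `square_root_bin_width` is a hypothetical helper, not the gem's API.
def square_root_bin_width(values)
  data_range = (values.max - values.min).to_f
  data_range / Math.sqrt(values.size)
end

values = [1.2, 3.4, 5.6, 7.8, 2.3, 4.5, 6.7, 8.9, 3.2, 5.4]

# data_range is 7.7 and n is 10, so the width is about 2.435
puts square_root_bin_width(values).round(3)
```

A width of about 2.435 is consistent with the first bin edge in the sample output (1.20 + 2.435 is roughly 3.63).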

Other strategies are available via the method: option:

h = Statistics::Histogram.new(values, method: :square_root)       # default
h = Statistics::Histogram.new(values, method: :cube_root)
h = Statistics::Histogram.new(values, method: :freedman_diaconis)
h = Statistics::Histogram.new(values, method: :scott)
h = Statistics::Histogram.new(values, method: :sturges)
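
For orientation, the commonly cited textbook forms of these rules can be sketched in plain Ruby as follows. This is an illustration under assumptions, not the gem's implementation: the module BinWidthSketch and its methods are hypothetical, the cube root rule is assumed to mirror the square root rule, and Scott's rule is shown with the population standard deviation for simplicity.

```ruby
# Textbook bin-width rules. All names are hypothetical helpers for
# illustration; the gem's own formulas may differ in detail.
module BinWidthSketch
  module_function

  def data_range(values)
    (values.max - values.min).to_f
  end

  # Population standard deviation.
  def std_dev(values)
    mean = values.sum.to_f / values.size
    Math.sqrt(values.sum { |v| (v - mean)**2 } / values.size)
  end

  # Interquartile range, interpolating linearly between order statistics.
  def iqr(values)
    sorted = values.sort
    quantile = lambda do |p|
      pos = (sorted.size - 1) * p
      lower = sorted[pos.floor]
      upper = sorted[pos.ceil]
      lower + (upper - lower) * (pos - pos.floor)
    end
    quantile.call(0.75) - quantile.call(0.25)
  end

  # Cube root rule (assumed form): range / n^(1/3).
  def cube_root(values)
    data_range(values) / Math.cbrt(values.size)
  end

  # Sturges: ceil(log2(n)) + 1 bins, so width = range / bin count.
  def sturges(values)
    data_range(values) / (Math.log2(values.size).ceil + 1)
  end

  # Scott: 3.49 * sigma * n^(-1/3).
  def scott(values)
    3.49 * std_dev(values) / Math.cbrt(values.size)
  end

  # Freedman-Diaconis: 2 * IQR * n^(-1/3).
  def freedman_diaconis(values)
    2.0 * iqr(values) / Math.cbrt(values.size)
  end
end

values = [1.2, 3.4, 5.6, 7.8, 2.3, 4.5, 6.7, 8.9, 3.2, 5.4]
%i[cube_root sturges scott freedman_diaconis].each do |rule|
  puts format('%-18s %.3f', rule, BinWidthSketch.public_send(rule, values))
end
```

Rules based on spread (Scott, Freedman-Diaconis) react to how the data is distributed, while the root rules and Sturges depend only on the range and the number of values.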

The tuneable_root method takes a factor option (default 2.0, which is equivalent to square_root):

h = Statistics::Histogram.new(values, method: :tuneable_root, factor: 4.0)

# The bin width can also be computed on its own, without building a histogram
Statistics::Bin.width(values, method: :tuneable_root, factor: 4.0)
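
One plausible reading of factor, given that 2.0 matches the square root rule, is that it is the degree of the root taken of n. The sketch below assumes exactly that; tuneable_root_width is a hypothetical helper, and the gem may define factor differently.

```ruby
# Assumed reading of tuneable_root: width = range / n^(1 / factor).
# `tuneable_root_width` is a hypothetical helper, not the gem's code.
def tuneable_root_width(values, factor: 2.0)
  data_range = (values.max - values.min).to_f
  data_range / (values.size**(1.0 / factor))
end

values = [1.2, 3.4, 5.6, 7.8, 2.3, 4.5, 6.7, 8.9, 3.2, 5.4]

puts tuneable_root_width(values)               # same width as the square root rule
puts tuneable_root_width(values, factor: 4.0)  # fourth root of n, so wider bins
```

Under this assumption a larger factor shrinks the divisor, which gives fewer and wider bins.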

Manual bin width

h = Statistics::Histogram.new(values, bin_width: 2.0)

Manual bin count

h = Statistics::Histogram.new(values, bin_count: 5)
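
To show how a bin count relates to a bin width, here is a minimal sketch that splits the data range into equal half-open bins and counts the values in each. The helper bin_counts is hypothetical; it assumes the values are not all identical and places the maximum in the last bin, which may not match how the gem handles the upper edge.

```ruby
# Count values into `bin_count` equal-width, half-open bins.
# `bin_counts` is a hypothetical helper, not the gem's code.
# Assumes the values are not all identical (width would be zero).
def bin_counts(values, bin_count:)
  min, max = values.minmax
  width = (max - min).to_f / bin_count
  counts = Array.new(bin_count, 0)
  values.each do |v|
    index = ((v - min) / width).floor
    index = bin_count - 1 if index >= bin_count # the maximum goes in the last bin
    counts[index] += 1
  end
  counts
end

values = [1.2, 3.4, 5.6, 7.8, 2.3, 4.5, 6.7, 8.9, 3.2, 5.4]
p bin_counts(values, bin_count: 5)  # prints [2, 2, 3, 1, 2]
```

With five bins over a range of 7.7, each bin is 1.54 wide, so bin_width and bin_count are two ways of fixing the same quantity.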

Querying the histogram

# Values shown are for the default histogram of the ten values above
h.bins          # => [#<Bin ...>, #<Bin ...>, ...]
h.boundaries    # => [1.2, 3.63, 6.07, ...]
h.bin_count     # => 4
h.mode          # => #<Bin interval=1.2...3.63 count=4>

h.bins.first.interval  # => 1.2...3.63
h.bins.first.count     # => 4
h.bins.first.width     # => 2.43
h.bins.first.empty?    # => false

Percentile

Percentiles are computed by linear interpolation between order statistics (Hyndman and Fan method 7).

require 'statistics.rb'

values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Statistics::Percentile.of(values, 75)   # => 7.75
Statistics::Percentile.q25(values)      # => 3.25
Statistics::Percentile.q75(values)      # => 7.75
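
The interpolation can be sketched in a few lines of plain Ruby. The percentile helper below is hypothetical and only illustrates Hyndman and Fan method 7; it is not the gem's implementation.

```ruby
# Hyndman and Fan method 7: linear interpolation between order statistics.
# `percentile` is a hypothetical helper, not the gem's code.
def percentile(values, pct)
  sorted = values.sort
  position = (sorted.size - 1) * pct / 100.0 # fractional 0-based index
  lower = sorted[position.floor]
  upper = sorted[position.ceil]
  lower + (upper - lower) * (position - position.floor)
end

values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

puts percentile(values, 75)  # prints 7.75
puts percentile(values, 25)  # prints 3.25
```

For the 75th percentile of ten values the position is 9 * 0.75 = 6.75, so the result lies three quarters of the way from the 7th value (7) to the 8th value (8).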

Standard Deviation

Statistics::StandardDeviation.of(values)                # population standard deviation (default)
Statistics::StandardDeviation.of(values, sample: true)  # sample standard deviation (Bessel's correction)
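
The difference between the two forms is only the divisor: n for the population, n - 1 for a sample. A minimal sketch, using a hypothetical standard_deviation helper rather than the gem's code:

```ruby
# Population vs sample standard deviation.
# `standard_deviation` is a hypothetical helper, not the gem's code.
def standard_deviation(values, sample: false)
  mean = values.sum.to_f / values.size
  sum_of_squares = values.sum { |v| (v - mean)**2 }
  divisor = sample ? values.size - 1 : values.size # Bessel's correction uses n - 1
  Math.sqrt(sum_of_squares / divisor)
end

values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

puts standard_deviation(values).round(4)                # prints 2.8723
puts standard_deviation(values, sample: true).round(4)  # prints 3.0277
```

The sample form is always slightly larger, and the gap narrows as the number of values grows.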

Interquartile Range

Statistics::IQR.of(values)  # => 4.5

Roadmap

  • Optional per-bin value storage
  • Aligned/neat bin boundaries

Contributing

  1. Fork it (https://github.com/thoran/statistics/fork)
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new pull request

License

MIT