IsoTree Ruby

🌲 IsoTree - outlier/anomaly detection using Isolation Forest - for Ruby

Learn how Isolation Forest works

🌳 Check out OutlierTree for human-readable explanations of outliers

Installation

Add this line to your application’s Gemfile:

gem "isotree"

Windows is not supported at the moment

Getting Started

Prep your data

data = [
  {department: "Books",  sale: false, price: 2.50},
  {department: "Books",  sale: true,  price: 3.00},
  {department: "Movies", sale: false, price: 5.00},
  # ...
]

Train a model

model = IsoTree::IsolationForest.new
model.fit(data)

Get outlier scores

model.predict(data)

Scores are between 0 and 1, with higher scores indicating outliers
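
To see which rows the scores point at, you can pair each row with its score and sort. This is a minimal sketch, not part of IsoTree: `top_outliers` is a hypothetical helper, and it assumes `data` is an array of hashes and that `model.predict(data)` returns one score per row in the same order.

```ruby
# Hypothetical helper (not part of IsoTree): returns the k rows with the
# highest outlier scores. `model` only needs to respond to predict(data).
def top_outliers(model, data, k: 5)
  scores = model.predict(data).to_a
  # Assumption: one score per input row, in input order
  raise ArgumentError, "expected one score per row" unless scores.length == data.length

  data.zip(scores)
      .sort_by { |_row, score| -score } # highest score first
      .first(k)
      .map { |row, score| {row: row, score: score} }
end
```

Call it as `top_outliers(model, data, k: 10)` after `model.fit(data)`. Which score counts as "high" depends on your data, so ranking is often easier to work with than a fixed cutoff.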

Export the model

model.export_model("model.bin")

Import a model

model = IsoTree::IsolationForest.import_model("model.bin")

Parameters

Pass parameters - default values below

IsoTree::IsolationForest.new(
  sample_size: "auto",
  ntrees: 500,
  ndim: 3,
  ntry: 1,
  max_depth: "auto",
  ncols_per_tree: nil,
  prob_pick_pooled_gain: 0.0,
  prob_pick_avg_gain: 0.0,
  prob_pick_full_gain: 0.0,
  prob_pick_dens: 0.0,
  prob_pick_col_by_range: 0.0,
  prob_pick_col_by_var: 0.0,
  prob_pick_col_by_kurt: 0.0,
  min_gain: 0.0,
  missing_action: "auto",
  new_categ_action: "auto",
  categ_split_type: "auto",
  all_perm: false,
  coef_by_prop: false,
  sample_with_replacement: false,
  penalize_range: false,
  standardize_data: true,
  scoring_metric: "depth",
  fast_bratio: true,
  weigh_by_kurtosis: false,
  coefs: "uniform",
  assume_full_distr: true,
  min_imp_obs: 3,
  depth_imp: "higher",
  weigh_imp_rows: "inverse",
  random_seed: 1,
  use_long_double: false,
  nthreads: -1
)

See a detailed explanation

Data

Data can be an array of hashes

[
  {department: "Books",  sale: false, price: 2.50},
  {department: "Books",  sale: true,  price: 3.00},
  {department: "Movies", sale: false, price: 5.00}
]

Or a Rover data frame

Rover.read_csv("data.csv")

Or a Numo array

Numo::NArray.cast([[1, 2, 3], [4, 5, 6]])
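
If your data is stored by column rather than by row, plain Ruby can reshape it into the array-of-hashes form shown above. This is a sketch only: `rows_from_columns` is a hypothetical helper, not part of IsoTree.

```ruby
# Hypothetical helper (not part of IsoTree): turns {column => [values]}
# into an array of row hashes. Array#transpose raises IndexError when
# the columns differ in length.
def rows_from_columns(columns)
  keys = columns.keys
  columns.values.transpose.map { |values| keys.zip(values).to_h }
end

columns = {
  department: ["Books", "Books", "Movies"],
  sale: [false, true, false],
  price: [2.50, 3.00, 5.00]
}
data = rows_from_columns(columns)
```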

Performance

IsoTree uses OpenMP when possible for best performance. To enable OpenMP on Mac, run:

brew install libomp

Then reinstall the gem.

gem uninstall isotree --force
bundle install

Deployment

Check out Trove for deploying models.

trove push model.bin

Reference

Get the average isolation depth

model.predict(data, output: "avg_depth")
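
For intuition on how depth relates to score, the original Isolation Forest paper maps an average depth to a score with 2^(-depth / c(n)), where c(n) is the expected path length for a sample of n points. The sketch below implements that textbook formula. It is an assumption that IsoTree's default scores match it exactly, and `average_path_length` and `standardized_score` are hypothetical names, not part of the gem.

```ruby
EULER_GAMMA = 0.5772156649015329

# c(n) from the Isolation Forest paper: expected path length of an
# unsuccessful binary search tree lookup among n points. The harmonic
# number H(n - 1) is approximated as ln(n - 1) + Euler's constant.
def average_path_length(n)
  raise ArgumentError, "sample size must be at least 2" if n < 2
  return 1.0 if n == 2

  harmonic = Math.log(n - 1) + EULER_GAMMA
  2.0 * harmonic - 2.0 * (n - 1) / n
end

# Shallower average depth gives a score closer to 1 (more outlier-like);
# a depth equal to c(n) gives exactly 0.5.
def standardized_score(avg_depth, sample_size)
  2.0**(-avg_depth / average_path_length(sample_size))
end
```

Under these assumptions, a point isolated at the expected depth scores 0.5, which is why scores well above 0.5 are usually read as outliers.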

History

View the changelog

Contributing

Everyone is encouraged to help improve this project.

To get started with development:

git clone --recursive https://github.com/ankane/isotree-ruby.git
cd isotree-ruby
bundle install
bundle exec rake compile
bundle exec rake test