An implementation of http://word.bitly.com/post/41284219720/forget-table in Ruby

Forgettable


Forgettable helps you estimate the probabilities of non-stationary categorical distributions. Put simply, it finds the most "popular" items in a stream of events when their popularity changes unpredictably.

Why?

Imagine you have a web application in which users can create posts and comment on them. The simplest way to find the "hottest" posts is to pick the most commented or the most recently commented ones.

While these solutions are simple to implement and work in many cases, they have some drawbacks. For example, a post with a lot of old comments might still be reported as popular even though nobody is commenting on it or reading it anymore. Conversely, ranking by last comment time might produce a very unstable, fast-changing list of "hottest" posts that does not really capture the trends among posts.

The main problem with these approaches is that they treat old data as being as important as new data: they don't forget.

Forgettable gives you a simple way to keep track of the most recent "trends" while smoothly forgetting the past.

Forgettable is heavily inspired by Forget-Table, developed at bitly.

How to use Forgettable?

Creating a new distribution

The main concept used in Forgettable is a distribution which is initialised with a name and a Redis client:

popular_guitars = ForgetTable::Distribution.new(name: "guitars", redis: redis)

Incrementing a bin

A distribution is a container of "bins", i.e., the items we want to track. To insert a new item, call the increment method with the name of the bin to increment and the amount:

popular_guitars.increment(bin: "fender", amount: 100)

If not specified, the amount defaults to 1:

popular_guitars.increment(bin: "gibson")

Getting the probability distribution

Once bins have been inserted into the distribution, we can fetch the list of bins sorted by popularity:

popular_guitars.distribution
=> ["fender", "gibson"]

Weights for the bins can be retrieved by setting the optional argument with_scores to true:

popular_guitars.distribution(with_scores: true)
=> [["fender", 63.0], ["gibson", 1.0]]
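
The values returned with with_scores are weights rather than probabilities that sum to 1. If normalized probabilities are needed, one possible approach is to divide each weight by the total. The sketch below works on a plain array of [bin, score] pairs shaped like the output above; the normalize helper is hypothetical and not part of the Forgettable API.

```ruby
# Hypothetical helper, not part of the Forgettable API.
# Turns [[bin, weight], ...] into [[bin, probability], ...],
# where the probabilities sum to 1.
def normalize(scored_bins)
  total = scored_bins.sum { |_bin, score| score }
  # Avoid dividing by zero when there are no bins or every weight is zero.
  return [] if total.zero?

  scored_bins.map { |bin, score| [bin, score / total.to_f] }
end

# Same shape as the result of distribution(with_scores: true).
scores = [["fender", 63.0], ["gibson", 1.0]]
probabilities = normalize(scores)

probabilities.each do |bin, probability|
  puts format("%s: %.4f", bin, probability)
end
```

Normalizing on the client side keeps the stored weights untouched, so the same scores can still be used for ranking.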

Getting the probability for a given bin

You can also retrieve the score for a single bin:

popular_guitars.score_for_bin("fender")
=> [30]

Configuring the decay rate

The decay rate is a floating-point number representing how fast the score for an item goes down. The lower the decay rate, the more slowly the score decreases.

The decay rate can be configured using the following option:

ForgetTable::Configuration.decay_rate = 0.01

If not specified, the decay rate falls back to its default value.
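
To build some intuition for the effect of the decay rate, here is a minimal sketch that assumes a deterministic exponential-decay model with time measured in seconds. Both the model and the decayed_score helper are assumptions made for illustration; the gem's actual computation may differ (the original Forget-Table description uses a randomized decay).

```ruby
# Hypothetical helper, not part of the Forgettable API.
# Assumed model: score * e^(-decay_rate * elapsed_seconds).
def decayed_score(score, decay_rate, elapsed_seconds)
  score * Math.exp(-decay_rate * elapsed_seconds)
end

initial = 100.0

# Compare a low and a high decay rate over the same one-minute interval.
[0.01, 0.1].each do |rate|
  remaining = decayed_score(initial, rate, 60)
  puts format("decay_rate=%.2f leaves %.2f of %.1f after 60s", rate, remaining, initial)
end
```

Under this assumed model, the lower rate keeps more of the original score after the same interval, which matches the description above.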


This software is released under the MIT license.