# ABAnalyzer¶ ↑

<img src=“https://secure.travis-ci.org/bmuller/abanalyzer.svg?branch=master” alt=“Build Status” />

ABAnalyzer is a Ruby library that will perform testing to determine if there is a statistical difference in categorical data (typically called an A/B Test). By default, it uses a G-Test for independence, but a Chi-Square test for independence can also be used.

## Installation¶ ↑

Simply run:

gem install abanalyzer

## Basic Usage¶ ↑

The simplest test (which uses a gtest):

require 'abanalyzer' values = {} values[:agroup] = { :opened => 100, :unopened => 300 } values[:bgroup] = { :opened => 50, :unopened => 350 } tester = ABAnalyzer::ABTest.new values # Are the two different? Returns true or false (at 0.05 level of significance) puts tester.different?

## Multiple Categories¶ ↑

You can use the ABAnalyzer module to test for differences in more than two categories. For instance, to test accross three:

require 'abanalyzer' values = {} values[:agroup] = { :male => 200, :female => 250 } values[:bgroup] = { :male => 150, :female => 300} values[:cgroup] = { :male => 50, :female => 50 } tester = ABAnalyzer::ABTest.new values # Are the two different? Returns true or false (at 0.05 level of significance) puts tester.different?

## Tests Available¶ ↑

You can get the actual p-value for either a Chi-Square test for independence or a G-Test for independence.

... tester = ABAnalyzer::ABTest.new values puts tester.chisquare_p puts tester.gtest_p

You can additionally get the actual score for either a Chi-Square test for independence or a G-Test for independence.

... tester = ABAnalyzer::ABTest.new values puts tester.chisquare_score puts tester.gtest_score

## Sample Size Calculations¶ ↑

Let’s say you want to determine how large your sample size needs to be for an A/B test. Let’s say your baseline is 10%, and you want to be able to determine if there’s at least a 10% relative lift (1% absolute) to 11%. Let’s assume you want a power of 0.8 and a significance level of 0.05 (that is, an 80% chance of that you’ll succeed in recognizing a difference when there is one, and a 5% chance of a false negative).

... ABAnalyzer.calculate_size(0.1, 0.11, 0.05, 0.8) => 14751

This means that you will need at least 14,751 people in each group sample. You can see this same example with R at on the 37 signals blog.

## Confidence Intervals¶ ↑

You can also get a confidence interval. Let’s say you have the results of a test where there were 711 successes out of 4000 trials. To get a 95% confidence interval of the “true” value of the conversion rate, use:

... ABAnalyzer.confidence_interval(711, 4000, 0.95) => [0.1659025512617185, 0.1895974487382815]

This means (roughly) that if you ran this experiment over and over, 95% of the time the resulting proportion would be between 17% and 19%.

You can also determine what the relative confidence intervals would be. Let’s say that your old conversion rate was 13%, and you wanted to know what sort of relative lift you could get.

... ABAnalyzer.relative_confidence_interval(711, 4000, 0.13, 0.95) => [0.27617347124398833, 0.45844191337139606]

This means (roughly) that if you ran this experiment over and over, 95% of the time the resulting proportion would be a relative lift of between 28% and 46%. Go buy yourself a beer!

## Running Tests¶ ↑

Testing can be run by using:

bundle exec rake