The two purposes of `babbab` are:
- To be the simplest tool for Data Analysts/Statisticians to analyze A/B tests.
- To return the simplest results for Stakeholders/Non-Statisticians to understand.
`babbab` is an acronym of BAyesian Beta-Binomial A/B testing (BaBBAB), but it's spelled in lowercase (`babbab`) because it doesn't like shouting.
This should work in vanilla Python 3.8+.

```shell
pip install babbab
```
Let's say we sell subscriptions to a paper magazine and want to conduct a simple A/B test. We want to change the background color of our app from grey to green, because we want to know if changing the background color will increase sales. To do so, we assign 50% of our users to the new app design with a green background (the variant group), while the other 50% stay in the old grey design (the control group). We managed to pull these four numbers out of our tracking into Python:
```python
control_sold_subscriptions = 200
control_users = 40316
variant_sold_subscriptions = 250
variant_users = 40567
```
Because `babbab` is awesome, you can just run:

```python
import babbab as bab

plot, statement, trace = bab.quick_analysis(
    control_sold_subscriptions,
    control_users,
    variant_sold_subscriptions,
    variant_users,
)
```

And get everything you need.
- In `plot`, you will find a matplotlib figure. You can change the title and labels in the `quick_analysis` function.
- In `statement`, you will get a string that is intended to be repeated verbatim by Non-Statisticians.
- In `trace`, you will get an arviz InferenceData object, in case you want to explore the run further.
In the signature of `quick_analysis` you can configure the statistics and the aesthetics of most of this.
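For intuition, the Beta-Binomial idea behind the analysis can be sketched in a few lines of plain Python. This is only an illustration of the statistical model, not `babbab`'s actual implementation (which uses PyMC): with a flat Beta(1, 1) prior, each group's conversion rate has a Beta posterior, and sampling from both posteriors estimates the probability that the variant converts better than the control.

```python
# Sketch of the Beta-Binomial model, NOT babbab's real implementation.
# Uses the four numbers from the README's example.
import random

random.seed(0)

control_sold_subscriptions = 200
control_users = 40316
variant_sold_subscriptions = 250
variant_users = 40567

def posterior_sample(successes, trials):
    # Under a flat Beta(1, 1) prior, the posterior for the conversion
    # rate is Beta(1 + successes, 1 + failures).
    return random.betavariate(1 + successes, 1 + trials - successes)

draws = 10_000
wins = sum(
    posterior_sample(variant_sold_subscriptions, variant_users)
    > posterior_sample(control_sold_subscriptions, control_users)
    for _ in range(draws)
)
prob_variant_better = wins / draws
print(f"P(variant converts better than control) ~ {prob_variant_better:.2f}")
```

`quick_analysis` wraps this kind of result in a plot and a plain-language statement, so you never have to hand-roll (or explain) the sampling yourself.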
A/B tests (or controlled experiments) are an increasingly popular way of incrementally improving websites, desktop apps, and mobile apps. At Multilayer we have analyzed probably hundreds, with a myriad of different tools and statistical methodologies.
In our experience, when companies run A/B tests, the biggest problems they encounter are around interpreting the results and acting appropriately on them. There are plenty of statistical libraries out there that do A/B testing right (`babbab` actually uses PyMC in the background). However, sharing statistics (like p-values) with non-statisticians can lead to confusion and misuse of results.
What `babbab` tries to cover is the "last mile" of A/B test analysis: interpreting and communicating the results so that they are actionable.
- Get 4 numbers in, get a statistically valid statement that you can repeat to your manager verbatim, and a plot you can understand.
- Get 4 numbers in + some labels, and you will get the above, plus a plot you can share and a statement you can copy-paste into the company chat.
- Add a bit more work, and you have your own custom-built A/B testing dashboard/tool.
Stop worrying about your peers and yourself misinterpreting stats.
Still a lot of basic docs to do.
- Add example results (plot, statement) to the README
- Add example with labels to README
- Add docstrings
Maybe?
- Sphinx or RTD Documentation