tea-tasting

A Python package for the statistical analysis of A/B tests.

Analytics

statistics

delta-method

cuped

ab-testing

tea-tasting: statistical analysis of A/B tests

tea-tasting is a Python package for the statistical analysis of A/B tests featuring:

Student's t-test, Z-test, bootstrap, and quantile metrics out of the box.
Extensible API: define and use statistical tests of your choice.
Delta method for ratio metrics.
Variance reduction using CUPED/CUPAC (which can also be combined with the delta method for ratio metrics).
Confidence intervals for both absolute and percentage changes.
Sample ratio mismatch check.
Power analysis.
Multiple hypothesis testing (family-wise error rate and false discovery rate).

tea-tasting calculates statistics directly within data backends such as BigQuery, ClickHouse, DuckDB, PostgreSQL, Snowflake, Spark, and many other backends supported by Ibis. This approach eliminates the need to import granular data into a Python environment.

tea-tasting also accepts dataframes supported by Narwhals: cuDF, Dask, Modin, pandas, Polars, PyArrow.

Installation

pip install tea-tasting

Basic example

>>> import tea_tasting as tt

>>> data = tt.make_users_data(seed=42)
>>> experiment = tt.Experiment(
...     sessions_per_user=tt.Mean("sessions"),
...     orders_per_session=tt.RatioOfMeans("orders", "sessions"),
...     orders_per_user=tt.Mean("orders"),
...     revenue_per_user=tt.Mean("revenue"),
... )
>>> result = experiment.analyze(data)
>>> print(result)
            metric control treatment rel_effect_size rel_effect_size_ci pvalue
 sessions_per_user    2.00      1.98          -0.66%      [-3.7%, 2.5%]  0.674
orders_per_session   0.266     0.289            8.8%      [-0.89%, 19%] 0.0762
   orders_per_user   0.530     0.573            8.0%       [-2.0%, 19%]  0.118
  revenue_per_user    5.24      5.73            9.3%       [-2.4%, 22%]  0.123

Learn more in the detailed user guide. Additionally, see the guides on data backends, power analysis, multiple hypothesis testing, and custom metrics.

Roadmap

A/A tests and simulations.
More statistical tests:
- Asymptotic and exact tests for frequency data.
- Mann–Whitney U test.
Sequential testing.

Package name

The package name "tea-tasting" is a play on words that refers to two subjects:

Lady tasting tea is a famous experiment which was devised by Ronald Fisher. In this experiment, Fisher developed the null hypothesis significance testing framework to analyze a lady's claim that she could discern whether the tea or the milk was added first to the cup.
"tea-tasting" phonetically resembles "t-testing" or Student's t-test, a statistical test developed by William Gosset.