Experiment statistics overview

A working understanding of statistical methodology helps you interpret experiment results with confidence. For those without prior experience, this overview explains the essentials in layperson's terms. For those with some statistics experience, it documents our methodology and assumptions.

Experiments use Bayesian statistics to determine whether a given variant performs better than the control. This approach quantifies win probabilities and credible intervals, helps determine whether the experiment shows a statistically significant effect, and enables you to:

  • Check results at any time without statistical penalties.
  • Get direct probability statements about which variant is winning.
  • Make confident decisions earlier with accumulating evidence.

This contrasts with frequentist statistics, which typically requires you to fix a sample size in advance and does not let you update probabilities as new data arrives.

Example Bayesian analysis

Say you started an experiment a few hours ago and see these results:

  • 1 in 10 people in the control group complete the funnel = 10% success rate.
  • 1 in 9 people in the test variant group complete the funnel ≈ 11% success rate.
  • The control variant has a 46.7% probability of being better and the test variant has a 53.3% probability of being better.
  • The control variant shows a credible interval of [2.3%, 41.3%] and the test variant shows a credible interval of [2.5%, 44.5%].

The first two values are pure math: dividing the number of successes by the total number of users gives us the raw success rates. It's not enough to just compare these conversion rates, however.

The last two values are derived using Bayesian statistics and describe our confidence in the results. The win probability tells you how likely it is that a given variant has the highest conversion rate compared to all other variants in the experiment. The credible interval tells you the range where the true conversion rate lies with 95% probability.
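To make these two quantities concrete, here is a minimal sketch of how they could be estimated for the numbers above. It assumes a beta model with a uniform Beta(1, 1) prior and simple Monte Carlo sampling; the prior, the function names, and the sampling approach are illustrative assumptions, not a description of the production implementation.

```python
import random


def posterior_samples(successes, total, n_draws, rng):
    # Assumption: uniform Beta(1, 1) prior, so the posterior is
    # Beta(1 + successes, 1 + failures).
    alpha = 1 + successes
    beta = 1 + (total - successes)
    return [rng.betavariate(alpha, beta) for _ in range(n_draws)]


def credible_interval(samples, level=0.95):
    # Equal-tailed interval read off the sorted posterior draws.
    ordered = sorted(samples)
    tail = (1 - level) / 2
    lower = ordered[int(tail * len(ordered))]
    upper = ordered[int((1 - tail) * len(ordered)) - 1]
    return lower, upper


def win_probability(test_samples, control_samples):
    # Share of paired draws in which the test rate beats the control rate.
    wins = sum(t > c for t, c in zip(test_samples, control_samples))
    return wins / len(test_samples)


rng = random.Random(42)  # fixed seed so the sketch is reproducible
N_DRAWS = 100_000

control = posterior_samples(1, 10, N_DRAWS, rng)  # 1 in 10 converted
test = posterior_samples(1, 9, N_DRAWS, rng)      # 1 in 9 converted

p_test_wins = win_probability(test, control)
control_ci = credible_interval(control)
test_ci = credible_interval(test)

print(f"P(test is better): {p_test_wins:.1%}")
print(f"Control 95% interval: [{control_ci[0]:.1%}, {control_ci[1]:.1%}]")
print(f"Test 95% interval:    [{test_ci[0]:.1%}, {test_ci[1]:.1%}]")
```

Under these assumptions the estimates should land close to the figures quoted above, with small differences from run to run because the values come from random sampling.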

Importantly, even though the test variant is winning, it doesn't clear our threshold of 90% or greater win probability, so the result is not yet statistically significant. The large overlap between the two credible intervals reflects the same uncertainty.

As such, you decide to let the experiment run a bit longer and see these results:

  • 100 in 1000 people in the control group complete the funnel = 10% success rate.
  • 100 in 900 people in the test variant group complete the funnel ≈ 11% success rate.
  • The control variant has a 21.5% probability of being better and the test variant has a 78.5% probability of being better.
  • The control variant shows a credible interval of [8.3%, 12%] and the test variant shows a credible interval of [9.2%, 13.3%].

Et voilà! The additional data increased the win probability and narrowed the credible intervals. With 1,900 total users instead of just 19, random chance becomes a less likely explanation for the difference in conversion rates. Even though both variants maintained the same conversion rates (10% vs 11%), the larger sample size gives us more confidence in the experiment results.
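The narrowing can also be seen without any sampling. The sketch below compares the spread of the posterior at both sample sizes using the closed-form standard deviation of a beta distribution. As before, the uniform Beta(1, 1) prior and the helper name are assumptions made for illustration.

```python
import math


def posterior_mean_and_sd(successes, total):
    # Assumption: uniform Beta(1, 1) prior, giving a Beta(alpha, beta) posterior.
    alpha = 1 + successes
    beta = 1 + (total - successes)
    n = alpha + beta
    mean = alpha / n
    # Standard deviation of a Beta(alpha, beta) distribution.
    sd = math.sqrt(alpha * beta / (n * n * (n + 1)))
    return mean, sd


results = {
    "control, 1 in 10": posterior_mean_and_sd(1, 10),
    "control, 100 in 1000": posterior_mean_and_sd(100, 1000),
    "test, 1 in 9": posterior_mean_and_sd(1, 9),
    "test, 100 in 900": posterior_mean_and_sd(100, 900),
}

for label, (mean, sd) in results.items():
    print(f"{label}: posterior mean {mean:.1%}, standard deviation {sd:.2%}")
```

With 100 times more users, the posterior standard deviation drops from roughly ten percentage points to roughly one, which is why the credible intervals tighten even though the observed rates are unchanged.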

At this point, you could either declare the test variant as the winner (78.5% probability), or continue collecting data to reach the 90% statistical significance threshold. Bayesian statistics let you check your results whenever you'd like without worrying about increasing the chance of false positives from checking too frequently.

Supported metric types

Experiments support a few different types of metrics, and each metric type uses a model appropriate to the shape of its data.

For example, funnel conversions are always between 0% and 100%, pageview counts can be any positive number (0, 50, 280), and property values can vary widely and tend to be right-skewed.

The following explain how Bayesian statistics is applied to each type of metric:

If your experiment was created prior to January 2025, it is evaluated using the legacy methodology.
