A/B testing mistakes I learned the hard way
Running experiments is equal parts powerful and terrifying.
Powerful because you can validate changes that will transform your product for the better; terrifying because there are so many ways to mess them up.
I’ve run hundreds of A/B tests, both in my previous life as a growth engineer at Meta, and on my personal side project.
These are some classic mistakes I’ve learned the hard way and how to avoid them.
1. Including unaffected users in your experiment
The first common mistake in A/B testing is including users in your experiment who aren't actually affected by the change you're testing. It dilutes your experiment results, making it harder to determine the impact of your changes.
Say you're testing a new feature in your app that rewards users for completing a certain action. You mistakenly include users who have already completed the action in the experiment. Since the change cannot affect them, their metrics for this action stay the same in both variants. That shrinks the measured difference between variants, so the experiment may fail to show a statistically significant change even when the feature works.
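To make the dilution concrete, here is a rough back-of-the-envelope sketch. It assumes unaffected users show zero effect and that the metric's variance stays about the same, so treat the numbers as approximations. The function names and the example figures are made up for illustration.

```python
def diluted_effect(true_effect: float, affected_fraction: float) -> float:
    """Average effect measured across everyone in the experiment when only
    `affected_fraction` of them can respond to the change (the rest show 0)."""
    if not 0.0 <= affected_fraction <= 1.0:
        raise ValueError("affected_fraction must be between 0 and 1")
    return true_effect * affected_fraction


def sample_size_multiplier(affected_fraction: float) -> float:
    """Rough factor by which the required sample grows. Assumes variance is
    unchanged, so the needed sample size scales with 1 / effect**2."""
    if not 0.0 < affected_fraction <= 1.0:
        raise ValueError("affected_fraction must be in (0, 1]")
    return 1.0 / affected_fraction ** 2


# Hypothetical numbers: a 4 percentage point lift among eligible users,
# but only half of the enrolled users are actually eligible.
measured = diluted_effect(true_effect=0.04, affected_fraction=0.5)
extra = sample_size_multiplier(affected_fraction=0.5)
print(f"measured lift: {measured:.2%}, sample size needed: about {extra:.0f}x")
```

Under these assumptions, enrolling twice as many users as the change can affect halves the lift you measure and roughly quadruples the sample you need to detect it.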
To avoid this mistake, check eligibility in your code first, and only enroll users in the experiment once they pass that check.
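A minimal sketch of that ordering is shown here. The `User` class, `assign_variant`, and `get_variant_if_eligible` are hypothetical names for illustration; in a real app, the assignment step would be whatever call your experimentation tool uses to bucket a user.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    id: str
    has_completed_action: bool  # hypothetical eligibility signal


def assign_variant(user_id: str, experiment_key: str) -> str:
    """Stand-in for your experimentation tool: deterministic 50/50 bucketing."""
    digest = hashlib.sha256(f"{experiment_key}:{user_id}".encode()).hexdigest()
    return "test" if int(digest, 16) % 2 == 0 else "control"


def get_variant_if_eligible(user: User, experiment_key: str) -> Optional[str]:
    """Return the user's variant, or None if they should not be enrolled."""
    # Check eligibility BEFORE assignment, so users the change cannot
    # affect are never counted in the experiment.
    if user.has_completed_action:
        return None
    return assign_variant(user.id, experiment_key)


new_user = User(id="user-1", has_completed_action=False)
returning_user = User(id="user-2", has_completed_action=True)

print(get_variant_if_eligible(new_user, "reward-experiment"))
print(get_variant_if_eligible(returning_user, "reward-experiment"))  # None
```

The important part is the order of operations: if the assignment call runs first, the user is typically already counted as an experiment participant, and filtering afterwards no longer helps.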
2. Only viewing results in aggregate (aka Simpson's paradox)
An experiment can show one outcome when analyzed in aggregate and a different one when the same data is broken down by subgroup.
For example, suppose you are testing a change to your sign-up and onboarding flow. The change affects both desktop and mobile users. Your experiment results show the following:
Variant | Visitors | Conversions | Conversion Rate |
---|---|---|---|
Control | 5,000 | 500 | ✖ 10% |
Test | 5,000 | 1,000 | ✔ 20% |
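The aggregate rates in this table can be reproduced with a small helper that groups rows by any set of keys. This is only a sketch: `conversion_rates` and the row layout are assumptions for illustration, not the output format of any particular analytics tool.

```python
from collections import defaultdict


def conversion_rates(rows, keys):
    """Sum visitors and conversions per group, then return each group's rate.

    `keys` names the fields to group by, e.g. ("variant",).
    """
    totals = defaultdict(lambda: [0, 0])  # group -> [visitors, conversions]
    for row in rows:
        group = tuple(row[key] for key in keys)
        totals[group][0] += row["visitors"]
        totals[group][1] += row["conversions"]
    return {
        group: (conversions / visitors if visitors else 0.0)
        for group, (visitors, conversions) in totals.items()
    }


# Figures taken from the aggregate table above.
rows = [
    {"variant": "Control", "visitors": 5000, "conversions": 500},
    {"variant": "Test", "visitors": 5000, "conversions": 1000},
]

for group, rate in conversion_rates(rows, keys=("variant",)).items():
    print(group, f"{rate:.0%}")
```

If each row also carried a `device` field, calling the same helper with `keys=("device", "variant")` would return one rate per subgroup instead of one per variant.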
At first glance, the test variant seems to be the clear winner. However, breaking down the results into the desktop and mobile subgroups shows:
Device | Variant | Visitors | Conversions | Conversion Rate |
---|---|---|---|---|
💻 Desktop | Control | 2,000 | 400 |