Mutually exclusive experiments
This topic describes how to use mutually exclusive experiments to prevent interaction effects that could invalidate your results. Use exclusion groups to ensure that users do not see overlapping experiments that apply to the same flag.
How exclusion groups work
For experiments that are not mutually exclusive, Optimizely Feature Experimentation uses a unique value for each experiment to bucket the user. The unique value determines whether a user enters a particular experiment. Because each experiment's value is random and independent of the values used by other experiments, some users may enter multiple experiments. For more information, view the documentation on how bucketing works.
For example, imagine two experiments: A and B. Each receives 20% traffic allocation (the percentage of total traffic that is eligible for the experiment).
Here is the expected traffic allocation:
- 16% of traffic falls in experiment A only
- 16% of traffic falls in experiment B only
- 4% of traffic falls in both experiment A and experiment B
- 64% of traffic is not in any experiment
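The percentages above follow from treating the two bucketing decisions as independent events. As a minimal sketch of that arithmetic (the function name `overlap_allocation` is hypothetical and not part of any Optimizely SDK):

```python
def overlap_allocation(p_a: float, p_b: float) -> dict[str, float]:
    """Expected traffic shares for two independently bucketed experiments."""
    both = p_a * p_b  # independent events: the probabilities multiply
    return {
        "A only": p_a - both,
        "B only": p_b - both,
        "A and B": both,
        "neither": 1 - p_a - p_b + both,
    }


shares = overlap_allocation(0.20, 0.20)
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

With 20% allocated to each experiment, the overlap is 0.20 × 0.20 = 4%, which leaves 16% in each experiment alone and 64% in neither.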
In the example above, the results of experiments A and B may be skewed. If users who see both A and B behave differently from those who see only A or only B, the 4% overlap distorts the results of both experiments. This is called an interaction effect.
If experiments A and B are mutually exclusive, Optimizely Feature Experimentation uses a single random value (unique to the exclusion group) to bucket users in experiments A and B. This method ensures that no user enters more than one experiment in the group. In that case, the traffic allocation looks something like this:
- 20% of traffic falls in experiment A only
- 20% of traffic falls in experiment B only
- 60% of traffic is not in any experiment
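The difference between the two bucketing approaches can be sketched in a few lines. This is an illustration only, not Optimizely's implementation: the hash function (SHA-256), the bucket count, and the names `bucket`, `independent_assignment`, and `exclusive_assignment` are all assumptions chosen for the example.

```python
import hashlib

BUCKETS = 10_000  # assumed granularity for this sketch


def bucket(user_id: str, seed: str) -> int:
    """Deterministically map a (seed, user) pair to a bucket in [0, BUCKETS)."""
    digest = hashlib.sha256(f"{seed}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % BUCKETS


def independent_assignment(user_id: str, allocations: dict[str, float]) -> set[str]:
    """Each experiment hashes with its own seed, so a user can enter several."""
    return {
        exp for exp, share in allocations.items()
        if bucket(user_id, exp) < round(share * BUCKETS)
    }


def exclusive_assignment(user_id: str, group_id: str,
                         allocations: dict[str, float]) -> set[str]:
    """One bucket per user per group; each experiment owns a disjoint range."""
    value = bucket(user_id, group_id)
    lower = 0
    for exp, share in allocations.items():
        upper = lower + round(share * BUCKETS)
        if lower <= value < upper:
            return {exp}
        lower = upper
    return set()


allocations = {"A": 0.20, "B": 0.20}
users = [f"user-{i}" for i in range(10_000)]

overlap_independent = sum(
    len(independent_assignment(u, allocations)) == 2 for u in users
)
overlap_exclusive = sum(
    len(exclusive_assignment(u, "group-1", allocations)) == 2 for u in users
)
print("users in both experiments (independent):", overlap_independent)
print("users in both experiments (exclusive):", overlap_exclusive)
```

Because the exclusive version assigns each experiment a non-overlapping slice of the same bucket space, the second count is always zero, while the independent version lets roughly 4% of users fall into both experiments.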
Optimizely Feature Experimentation also ensures mutual exclusivity between experiments in an exclusion group that run at different times.
Best practices
To guard against any possibility of interaction effects, you might consider making all your experiments mutually exclusive. However, making every experiment in the project mutually exclusive sometimes requires more traffic than is available. Depending on the traffic levels you need to reach significance and which parts of your code base are being tested, we recommend that some experiments overlap and some experiments be mutually exclusive.
You are more likely to see interaction effects if:
- You are running two experiments on the same area of an application.
- You are running two experiments on the same flow where there will likely be a substantial overlap.
- You are running an experiment that may have a significant impact on a conversion metric that is shared with other experiments.
If none of these points apply, creating mutually exclusive experiments is usually unnecessary. Each variation of one experiment is exposed to the other experiment proportionally, so the overlap affects all variations equally.
However, there are a few scenarios in which creating mutually exclusive experiments or running sequential experiments (waiting for one to end before the next starts) is recommended. Even if you ensure that experiments are mutually exclusive, it is still possible to see interaction effects from experiments running at different times. After you have experimented on some population of users, it is never possible to get a truly unbiased population for future experiments. If you are concerned about interaction effects between experiments running at different times, finish all experiments in an exclusion group before creating new experiments.
For example, suppose you created an exclusion group with four experiments (A, B, C, and D), each running at 25% traffic allocation. If you stop experiment D and start another experiment, E, the results of E could be biased because all users in E previously received the treatment from D. To ensure that the traffic is evenly sampled across all previous experiments, wait for experiments A, B, and C to finish before starting experiment E.
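The carry-over in this example can be illustrated with a small simulation. It assumes that experiment E takes over exactly the bucket range that D vacated and that users keep the same bucket within the group; the hash function, bucket ranges, and names below are hypothetical and only meant to show the effect.

```python
import hashlib

BUCKETS = 10_000  # assumed granularity for this sketch


def group_bucket(user_id: str, group_id: str) -> int:
    """A user's bucket within an exclusion group; stable over time."""
    digest = hashlib.sha256(f"{group_id}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % BUCKETS


def assign(user_id: str, group_id: str, ranges: dict[str, tuple[int, int]]):
    """Return the experiment whose [start, end) range contains the user's bucket."""
    value = group_bucket(user_id, group_id)
    for experiment, (start, end) in ranges.items():
        if start <= value < end:
            return experiment
    return None


# Four experiments at 25% each, then D is stopped and E reuses D's range.
before = {"A": (0, 2500), "B": (2500, 5000), "C": (5000, 7500), "D": (7500, 10_000)}
after = {"A": (0, 2500), "B": (2500, 5000), "C": (5000, 7500), "E": (7500, 10_000)}

users = [f"user-{i}" for i in range(10_000)]
saw_d = {u for u in users if assign(u, "group-1", before) == "D"}
in_e = {u for u in users if assign(u, "group-1", after) == "E"}

share_from_d = len(in_e & saw_d) / len(in_e)
print(f"{share_from_d:.0%} of E's users previously saw D")
```

Under these assumptions, E's population is identical to D's former population, which is why its results cannot be treated as coming from an unbiased sample. If A, B, and C had also finished, E could draw from users who saw any of the earlier experiments.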
When making important decisions for your business, evaluate your risk tolerance for experiment overlap. Evaluate your prioritized roadmap to ensure that you are planning your variation designs, goals, and execution schedule to best meet your business needs.