Mutually exclusive experiments

How to use exclusion groups to prevent interaction effects that could invalidate your results by ensuring that users do not see overlapping experiments that apply to the same flag.


For experiments that are not mutually exclusive, Optimizely Feature Experimentation uses a unique value for each experiment to bucket the user. The unique value determines whether a user enters a particular experiment. Because this value is random and chosen independently for each experiment, some users may enter multiple experiments. For more information, see how bucketing works.

Example

For example, there are two experiments: experiment A and experiment B. Each experiment receives a 20% traffic allocation (the percentage of total traffic that is eligible for the experiment). The resulting traffic split would be:

16% of traffic falls in experiment A only.
16% of traffic falls in experiment B only.
4% of traffic falls in both experiment A and experiment B.
64% of traffic is not in any experiment.
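The percentages above follow from treating the two bucketing decisions as independent, so the overlap is the product of the two allocations (20% × 20% = 4%). The following is a minimal sketch of that arithmetic, not Optimizely code; the function name `overlap_breakdown` is hypothetical, and independence between the two experiments is an assumption of the example.

```python
from fractions import Fraction


def overlap_breakdown(p_a, p_b):
    """Split traffic into A-only, B-only, both, and neither.

    Assumes the two experiments bucket users independently, so the
    share of users in both experiments is the product p_a * p_b.
    """
    both = p_a * p_b
    return {
        "A only": p_a - both,
        "B only": p_b - both,
        "both": both,
        "neither": 1 - p_a - p_b + both,
    }


# Exact fractions avoid floating-point rounding noise in the shares.
shares = overlap_breakdown(Fraction(20, 100), Fraction(20, 100))
for label, share in shares.items():
    print(f"{label}: {float(share):.0%}")
```

With two 20% allocations, this reproduces the 16% / 16% / 4% / 64% split listed above.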

In the example above, experiment A's and experiment B's results may be skewed. If users who see both A and B behave differently from those who see just A or B, then the results for A and B are skewed by the overlap. This is called an interaction effect.

If experiments A and B are mutually exclusive, Optimizely Feature Experimentation chooses the same random value (unique to the exclusion group) to bucket users in experiments A and B. This method ensures that experiments cannot overlap for the same users. If experiments A and B are mutually exclusive, the traffic allocation looks something like this:

20% of traffic falls in experiment A only.
20% of traffic falls in experiment B only.
60% of traffic is not in any experiment.
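To illustrate why a single shared value prevents overlap, here is a rough sketch of group-level bucketing. It is not the Optimizely SDK implementation: the SHA-256 hash, the 10,000-bucket resolution, and the names `bucket_value` and `assign_in_group` are all assumptions made for the example. The point is only that one value per user, compared against non-overlapping ranges, can match at most one experiment.

```python
import hashlib

BUCKETS = 10_000  # assumed bucket resolution for this sketch


def bucket_value(user_id: str, group_id: str) -> int:
    """Map a user to one stable value in [0, BUCKETS) for the whole group."""
    digest = hashlib.sha256(f"{group_id}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % BUCKETS


def assign_in_group(user_id, group_id, allocations):
    """Return the single experiment the user falls into, or None.

    allocations is an ordered list of (experiment_key, traffic_share)
    pairs. Each experiment owns a contiguous, non-overlapping range of
    bucket values, so one value can match at most one experiment.
    """
    value = bucket_value(user_id, group_id)
    upper = 0
    for key, share in allocations:
        upper += round(share * BUCKETS)
        if value < upper:
            return key
    return None  # user is in the group but outside every experiment's range


group = [("A", 0.20), ("B", 0.20)]
counts = {"A": 0, "B": 0, None: 0}
for i in range(20_000):
    counts[assign_in_group(f"user-{i}", "group-1", group)] += 1
print(counts)
```

Over many simulated users, roughly 20% should land in A, 20% in B, and 60% in neither, and no user can be in both.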

Optimizely Feature Experimentation also ensures mutual exclusivity between experiments in an exclusion group that run at different times.

Interaction effect

To minimize the risk of interaction effects, you may consider making some or all of your experiments mutually exclusive. However, requiring all experiments to be mutually exclusive can sometimes require more traffic than is available. To balance traffic needs against statistical significance, it is often more practical to let certain experiments overlap while keeping others mutually exclusive.

You are more likely to see interaction effects when you are:

Testing the same area of the application – You are running two experiments targeting the same feature or section of your application.
Testing the same user flow – You are running experiments that affect the same flow or user journey, leading to significant overlap in user exposure.
Testing shared conversion metrics – One experiment has a large potential impact on a key conversion metric, which is also being tracked by other experiments.

When not to use mutual exclusion

If none of the preceding conditions apply, creating mutually exclusive experiments is usually unnecessary. In such cases, the variations of overlapping experiments are exposed to each other proportionally, with minimal interaction risk.

When to use mutual exclusion

In some situations, making experiments mutually exclusive or running them sequentially (waiting for one to conclude before starting another) is recommended. These include cases where there is a high risk of interaction effects because the experiments overlap in functionality or share metrics that are critical to decision-making.

When making important business decisions, evaluate your risk tolerance for experiment overlap. Evaluate your prioritized roadmap to ensure that you are planning your variation designs, goals, and execution schedule to best meet your business needs.

Best practices

Make experiments mutually exclusive only when required, so that you do not restrict traffic more than necessary.
Experiments within an exclusion group do not need to start or stop simultaneously, but they must all be part of the exclusion group from the moment the first experiment starts until the last experiment finishes to maintain a fixed traffic allocation and visitor bucketing.
No experiments should be added to the group after one of them has been started. If experiments are added to the group while any of the group's existing experiments are already running, traffic allocation to the experiments will be shifted, and visitors may be exposed to multiple experiments after all.
- For example, a visitor is initially bucketed into experiment A and has already been exposed to it. Then, experiment B gets added to the exclusion group. Traffic allocation will shift, and this visitor may now be assigned to experiment B while they no longer see experiment A.
Do not remove running experiments from an exclusion group or reallocate traffic in ways that could introduce bias or overlap between experiments.
- For example, an exclusion group contains three experiments. A visitor is initially bucketed into experiment B. Then, experiment C is removed from the group, leaving only experiments A and B. With the bucketing ranges now being shifted, the visitor may get bucketed from experiment B into experiment A.
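The two examples above can be illustrated with a small sketch of how changing group membership moves range boundaries. This is a simplified model, not Optimizely's actual allocation logic: the even split across experiments, the 10,000-bucket resolution, and the names `even_ranges` and `experiment_for` are assumptions. The visitor's bucket value stays fixed, and only the ranges around it change.

```python
BUCKETS = 10_000  # assumed bucket resolution for this sketch


def even_ranges(experiments):
    """Give each experiment an equal, contiguous slice of [0, BUCKETS).

    The even split is an assumption of this sketch, chosen only to make
    the boundary shift easy to see.
    """
    width = BUCKETS // len(experiments)
    return {key: (i * width, (i + 1) * width) for i, key in enumerate(experiments)}


def experiment_for(value, ranges):
    """Return the experiment whose range contains the bucket value, if any."""
    for key, (low, high) in ranges.items():
        if low <= value < high:
            return key
    return None


# Adding an experiment: a visitor with bucket value 7000 starts in A,
# then moves to B once B joins the group.
before_add = experiment_for(7000, even_ranges(["A"]))
after_add = experiment_for(7000, even_ranges(["A", "B"]))

# Removing an experiment: a visitor with bucket value 4000 starts in B,
# then moves to A once C leaves the group.
before_remove = experiment_for(4000, even_ranges(["A", "B", "C"]))
after_remove = experiment_for(4000, even_ranges(["A", "B"]))

print(before_add, "->", after_add)
print(before_remove, "->", after_remove)
```

In both cases the visitor has already been exposed to one experiment and is then reassigned to another, which is the kind of cross-exposure the exclusion group was meant to prevent.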

Implementation

See Use mutual exclusion.

Updated 9 months ago