Run a multi-armed bandit optimization

How to run a multi-armed bandit optimization in Optimizely Full Stack.

You might want to run a test that focuses on maximizing conversions from your variations instead of finding the variation most likely to consistently perform better than your baseline. A multi-armed bandit (MAB) optimization is a different type of experiment, compared to an A/B test, because it uses reinforcement learning to allocate traffic to variations that perform well while allocating less traffic to underperforming variations.

MAB and statistical significance

MAB optimizations do not generate statistical significance. Instead, the algorithm pushes traffic to variations that have the most conversions; the reason for a variation's performance is not important.

MAB tests are best suited for maximizing conversions during short, temporary experiences such as headline testing or a holiday weekend sale. You should never use MAB tests for exploratory hypotheses or for variation selection. MAB is for optimization, not experimentation.

MAB's primary goal is to answer the question "Which variation gets us the largest reward?", where the "largest reward" is the highest revenue or the most conversions. For more information on MABs, see how to Maximize lift with multi-armed bandit optimizations.
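To make the traffic-allocation idea concrete, the following is a minimal sketch of one common bandit strategy, Thompson sampling with Beta posteriors. It is an illustration only and is not Optimizely's algorithm, which is described in the support documentation. The function names, variation keys, and conversion rates are all made up for the example.

```python
import random


def thompson_allocate(stats, rng):
    """Pick a variation by sampling each one's Beta posterior.

    stats maps a variation key to (conversions, visitors).
    """
    best_key, best_draw = None, -1.0
    for key, (conversions, visitors) in stats.items():
        # Beta(1 + successes, 1 + failures) is the posterior under a uniform prior.
        draw = rng.betavariate(1 + conversions, 1 + visitors - conversions)
        if draw > best_draw:
            best_key, best_draw = key, draw
    return best_key


def simulate(true_rates, visitors, seed=0):
    """Simulate traffic against hypothetical true conversion rates."""
    rng = random.Random(seed)
    stats = {key: (0, 0) for key in true_rates}
    for _ in range(visitors):
        key = thompson_allocate(stats, rng)
        conversions, seen = stats[key]
        converted = rng.random() < true_rates[key]
        stats[key] = (conversions + int(converted), seen + 1)
    return stats


if __name__ == "__main__":
    # Hypothetical variations and rates, chosen only for illustration.
    result = simulate({"control": 0.05, "headline_b": 0.10}, visitors=5000)
    for key, (conversions, seen) in result.items():
        print(key, "visitors:", seen, "conversions:", conversions)
```

In a run like this, the variation with the higher true rate tends to receive most of the traffic as evidence accumulates, which is the behavior described above: the allocation follows conversions and never asks why a variation performs well.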

Best use cases

The following cases may be a better fit for a multi-armed bandit optimization than a traditional A/B experiment:

  • Promotions and offers. Users who sell consumer goods on their site often focus on driving higher conversion rates. One way to do this is to offer special promotions that run for a limited time. Because these changes are not permanent, a MAB optimization can send more traffic to the over-performing variations and less traffic to the underperforming variations for the duration of the promotion.
  • Headline testing. Headlines are short-lived content that loses relevance after a fixed time. If a headline experiment takes as long to reach statistical significance as the lifespan of the headline, then the insights gained from the experiment are irrelevant. A MAB optimization lets you maximize your impact without having to balance experiment runtime against the natural lifespan of a headline.
  • Webinars. You can boost registration for webinars or other events by experimenting with several versions of the call to action to sign up.

For algorithmic details of MABs at Optimizely, see the support documentation.

Setup overview

To configure a MAB:

  1. (Prerequisite) Create a flag.

  2. (Prerequisite) Handle user IDs.

  3. Create and configure a MAB rule in the Optimizely application.

  4. Implement the Optimizely SDK's Decide method in your application's codebase through a feature flag if you have not done so yet.

  5. Test your MAB rule in a development environment. See Test and troubleshoot.

  6. Discard any test user events and enable your MAB optimization rule in a production environment.

Create an optimization in the Optimizely application

To create a new optimization in the Optimizely app:

  1. Go to Flags, select your flag and select your environment (Development or Production).

  2. Click Add Rule and select Multi-Armed Bandit.

  3. Configure your MAB rule:
    1. (Optional) Search for and add audiences. To create an audience, see Target audiences. Audiences evaluate in the order in which you drag and drop them. You can choose whether to match each user on any or all of the audience conditions.
    2. Set the Percentage included slider to allocate the percentage of your audiences to bucket into the experiment.
    3. Add metrics based on tracked user events. See Create events to create and track events. For more information about selecting metrics, see Choose metrics.
    4. Choose the variations you want to optimize. Unlike A/B experiments, you do not need to compare to a baseline experiment because statistical significance is not calculated with MAB optimizations. See Why MABs do not use a baseline.
    5. (Optional) Add the MAB to an Exclusion Group.
    6. Click Save.



If you plan to change the traffic allocation after starting the experiment, implement a user profile service so that users keep the variations they were already bucketed into. See Ensure consistent user bucketing. Also implement a user profile service if you plan to use Stats Accelerator.

  4. Toggle the Flag On.
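As an illustration of the user profile service mentioned in the note above, here is a minimal in-memory sketch in Python. It assumes the service is an object with `lookup` and `save` methods and that a profile is a dictionary containing a `user_id` and an `experiment_bucket_map`. Check the user profile service reference for your SDK before relying on these details. The class name is hypothetical.

```python
class InMemoryUserProfileService:
    """Keeps each user's bucketing decisions so they stay sticky.

    Assumption: the SDK calls lookup(user_id) before bucketing and
    save(user_profile) afterwards, with profiles shaped like
    {"user_id": ..., "experiment_bucket_map": {...}}.
    """

    def __init__(self):
        self._profiles = {}

    def lookup(self, user_id):
        # Return the stored profile, or None if this user has not been seen.
        return self._profiles.get(user_id)

    def save(self, user_profile):
        # Keyed by user ID; a production service would persist to a
        # database or cache instead of process memory.
        self._profiles[user_profile["user_id"]] = user_profile


# Assumed wiring (not executed here): pass an instance to the client, e.g.
# optimizely.Optimizely(datafile, user_profile_service=InMemoryUserProfileService())
```

An in-memory store only survives for the life of the process, so it is suitable for local testing at most. Sticky bucketing across sessions or servers needs shared, durable storage.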

Implement the MAB

If you have already implemented the feature flag in your application's codebase, no further configuration is required for the flag delivery. If you have not, implement the Decide method call in your code to enable or disable the flag for a user:

Swift

// Decide if user sees a feature flag variation
let user = optimizely.createUserContext(userId: "user123", attributes: ["logged_in":true])
let decision = user.decide(key: "flag_1")
let enabled = decision.enabled

Go

// Decide if user sees a feature flag variation
user := optimizely.CreateUserContext("user123", map[string]interface{}{"logged_in": true})
decision := user.Decide("flag_1", nil)
enabled := decision.Enabled

Python

# Decide if user sees a feature flag variation
user = optimizely.create_user_context("user123", {'logged_in': True})
decision = user.decide("flag_1")
enabled = decision.enabled

PHP

// Decide if user sees a feature flag variation
$user = $optimizely_client->createUserContext('user123', ['logged_in' => true]);
$decision = $user->decide('flag_1');
$enabled = $decision->getEnabled();

Ruby

# Decide if user sees a feature flag variation
user = optimizely_client.create_user_context('user123', {'logged_in' => true})
decision = user.decide('flag_1')
enabled = decision.enabled

C#

// Decide if user sees a feature flag variation
var user = optimizely.CreateUserContext("user123", new UserAttributes { { "logged_in", true } });
var decision = user.Decide(key: "flag_1");
var enabled = decision.Enabled;

Java

// Decide if user sees a feature flag variation
OptimizelyUserContext user = optimizely.createUserContext("user123", new HashMap<String, Object>() { { put("logged_in", true); } });
OptimizelyDecision decision = user.decide("flag_1");
Boolean enabled = decision.getEnabled();

JavaScript

// Decide if user sees a feature flag variation
var user = optimizelyClient.createUserContext('user123', { logged_in: true });
var decision = user.decide('flag_1');
var enabled = decision.enabled;

React

// Decide if user sees a feature flag variation
// useDecision returns an array; the decision object is its first element
var [decision] = useDecision('flag_1', null, { overrideUserAttributes: { logged_in: true }});
var enabled = decision.enabled;

For more detailed examples, see the Decide method documentation for your SDK.

Optimizely uses the Decide method call to determine whether a user qualifies for the MAB rule and which variation they receive. Optimizely SDKs let you reuse the exact same flag implementation for different flag rules.

Remember, a user is evaluated against the rules in a ruleset, in order, before being bucketed into a rule's variation. See Create feature flags.
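Because a MAB shifts traffic based on the metrics you attached to the rule, the Decide call is normally paired with event tracking. The sketch below assumes a Python user context that exposes `decide()` and `track_event()`, and a decision object with `enabled` and `variation_key` attributes. The helper name, the `converted` parameter, and the event key are hypothetical and only serve the example.

```python
def decide_and_track(user, flag_key, event_key, converted):
    """Decide a flag for one user and record a conversion if one occurred.

    `user` is assumed to be a user context like the one created in the
    Python snippet above; `converted` stands in for whatever signal your
    application uses to detect a conversion.
    """
    decision = user.decide(flag_key)
    if converted:
        # The event key should match an event behind one of the rule's metrics,
        # otherwise the optimization has nothing to learn from.
        user.track_event(event_key)
    return decision.enabled, decision.variation_key
```

In a real application the conversion usually happens some time after the decision, so the tracking call would live in the code path that handles the conversion instead of sitting next to `decide()`.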
