Optimizing Performance on Optimizely DXP

CDN settings

The CDN servers are designed to cache content according to the cache rules you define in the HTTP headers from your application. You must set these caching rules correctly for your solution to scale properly. See CDN recommendations.
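As a rough illustration of cache rules expressed through HTTP headers, the following sketch maps a content class to a Cache-Control value. The content classes, the lifetimes, and the `cache_control` helper are hypothetical and only show the shape of such rules; base real values on the CDN recommendations.

```python
# Hypothetical content classes and lifetimes, chosen only for illustration.
CACHE_RULES = {
    "static": {"max_age": 31536000, "public": True},   # versioned CSS, JS, images
    "page": {"max_age": 300, "public": True},          # anonymous HTML pages
    "personal": {"max_age": 0, "public": False},       # per-user responses
}


def cache_control(content_class: str) -> str:
    """Return a Cache-Control header value for the given content class."""
    rule = CACHE_RULES[content_class]
    if not rule["public"]:
        # Keep per-user responses out of shared caches such as a CDN.
        return "private, no-store"
    return f"public, max-age={rule['max_age']}"
```

The application would attach the returned value to each response, so that the CDN caches long-lived assets aggressively and never stores personalized output.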

Site startup

Proper site warm-up is crucial in cloud-based environments. A node may be taken out of rotation for maintenance at any time and put back during peak hours. If that node receives its full share of traffic before it is warmed up, response times spike and the risk of outages increases.

The warm-up feature automatically starts and initializes a web application to prepare the server and data caches. See Initialization.
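The platform's warm-up feature is configured separately (see Initialization). The sketch below only illustrates the underlying idea: request a set of important pages once so that caches are populated before real traffic arrives. The `warm_up` function, its parameters, and the example paths are assumptions made for illustration.

```python
from typing import Callable, Iterable, List
from urllib.request import urlopen


def http_status(url: str) -> int:
    """Fetch a URL and return its HTTP status code."""
    with urlopen(url, timeout=30) as response:
        return response.status


def warm_up(base_url: str, paths: Iterable[str],
            fetch: Callable[[str], int] = http_status) -> List[str]:
    """Request each path once and return the paths that did not answer 200."""
    failed = []
    for path in paths:
        try:
            status = fetch(base_url.rstrip("/") + path)
        except OSError:
            # URLError and HTTPError are OSError subclasses; treat as not warmed.
            status = 0
        if status != 200:
            failed.append(path)
    return failed
```

A call such as `warm_up("https://www.example.com", ["/", "/products"])` (hypothetical URLs) returns an empty list when every page responded successfully.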

Limit the number of content types. Startup scans assemblies and caches views, so a large number of content types (200 or more) significantly increases startup time. Also keep the web app below 1 GB. This limit includes binaries but excludes media assets and logs, which should be written to a BLOB storage container.

Auto-scaling and instance health

Optimizely CMS (PaaS) manages auto-scaling and instance health independently through two built-in mechanisms.

Auto-scaling

Resource thresholds, such as CPU usage, trigger auto-scaling. When traffic increases, Optimizely adds instances to handle the load. When traffic subsides, Optimizely scales back down to the configured default instance count.

Health monitoring

Health monitoring runs in parallel. Optimizely continuously checks instance health by sending HTTP requests to a designated health endpoint. Instances that fail these checks are temporarily removed from the load balancer for up to one hour. During that time, Optimizely attempts an automatic replacement. A platform-enforced daily limit caps how many instances Optimizely can replace each day.

Auto-scaling and health checks operate independently. Scaling decisions are not influenced by how many instances are currently unhealthy.

The health check detects infrastructure-level failures. Application-level errors, such as a 404 on a specific path, are not treated as instance health signals and do not trigger replacement. To extend the health check with custom logic, see Customize health checks.
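To make the idea of custom health logic more concrete, here is a minimal sketch of an endpoint handler that aggregates several checks into one HTTP status. It does not reflect the platform's actual health check API; the `health_response` function and the check names are hypothetical. See Customize health checks for the supported approach.

```python
from typing import Callable, Dict, Tuple


def health_response(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, Dict[str, str]]:
    """Run each named check and return (HTTP status, per-check result)."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "failed"
        except Exception:
            # A check that raises is reported as failed instead of crashing the endpoint.
            results[name] = "failed"
    status = 200 if all(value == "ok" for value in results.values()) else 503
    return status, results
```

Because a failing health check can take an instance out of the load balancer, keep custom checks limited to conditions that a replacement instance would actually fix.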

When persistent issues occur

Persistent issues that automatic instance replacement does not resolve usually indicate an application-level problem. These require investigation rather than an infrastructure response.

Output cache

Cloud-based solutions are more likely to scale out the web servers than to scale them up. Each front-end node adds its own constant load on the database. In other words, if you go from two front-end servers (a typical on-premises setup) to four while keeping the total throughput the same, the load on the database server still increases.

When scaling out, make sure that the front-end servers are the machines that do most of the work of building a page. Caching in multiple layers (object caches, partial HTML caches such as for complex menus, or full output cache) helps avoid a "cache stampede," especially when combined with a warm-up.

By default, when a page is published, the output caches for the sites are invalidated immediately. Output-cached pages are then re-rendered using the lower-level caches. Most of these lower-level caches remain valid after a publish, except for the caches of the published page itself. Implement proper multi-layer or partial caching for rendered pages with heavy data processing. See Caching.
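The cache stampede mentioned above occurs when many requests miss the same cache entry at once and all rebuild it simultaneously. The following sketch shows one common mitigation, a per-key lock so that only one thread rebuilds a missing entry while the others wait. It is a generic in-memory illustration, not the CMS caching API; the `StampedeSafeCache` class is hypothetical.

```python
import threading
from typing import Callable, Dict

_MISSING = object()  # Sentinel so that None can be cached as a real value.


class StampedeSafeCache:
    """In-memory cache where only one thread rebuilds a missing key."""

    def __init__(self) -> None:
        self._values: Dict[str, object] = {}
        self._locks: Dict[str, threading.Lock] = {}
        self._guard = threading.Lock()

    def get_or_build(self, key: str, build: Callable[[], object]) -> object:
        value = self._values.get(key, _MISSING)
        if value is not _MISSING:
            return value
        with self._guard:
            lock = self._locks.setdefault(key, threading.Lock())
        with lock:
            # Re-check: another thread may have built the value while we waited.
            value = self._values.get(key, _MISSING)
            if value is _MISSING:
                value = build()
                self._values[key] = value
            return value

    def invalidate(self, key: str) -> None:
        """Drop a cached entry, for example after a publish."""
        self._values.pop(key, None)
```

With this pattern, an expensive fragment such as a complex menu is rendered once after invalidation, even if many requests arrive at the same moment.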

Entity tags (ETags)

The ETag (entity tag) is an HTTP response header used for web cache validation. It lets a client or cache check whether its stored copy of a resource is still current. See Configure cache headers for information about using ETags.
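As a minimal sketch of ETag-based validation, the code below derives an ETag from the response body and answers 304 Not Modified when the client's If-None-Match header still matches. The function names are hypothetical, hashing the body is just one possible way to produce an ETag, and splitting the header on commas is a simplification.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a strong ETag from the response body (quoted, per HTTP syntax)."""
    return '"' + hashlib.sha256(body).hexdigest()[:32] + '"'


def conditional_get(body: bytes,
                    if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return 304 with an empty body when the client's cached ETag still matches."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so ignore a leading W/ prefix.
        candidates = [tag[2:] if tag.startswith("W/") else tag for tag in candidates]
        if "*" in candidates or etag in candidates:
            return 304, headers, b""
    return 200, headers, body
```

The first request receives the full body and the ETag. A later request that sends the same ETag back receives only the 304 status, which saves bandwidth and rendering work.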

Resilience

In a cloud environment, retry policies become increasingly important. Transient errors may occur due to network issues or maintenance of infrastructure elements, and retry policies let the application gracefully recover from such errors without propagating the error to the end user.

Retry mechanisms for Azure services differ because each service has its own requirements and characteristics. Each retry mechanism is tuned to a specific service. See Microsoft documentation for guidelines.
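The general shape of a retry policy can be sketched as follows: retry only errors considered transient, wait longer after each failure (exponential backoff), and add random jitter so that many clients do not retry in lockstep. This is a generic illustration, not the retry mechanism of any specific Azure service or SDK; the `retry` function, its defaults, and the choice of transient exception types are assumptions.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(operation: Callable[[], T],
          attempts: int = 4,
          base_delay: float = 0.5,
          transient: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Call operation, retrying transient errors with exponential backoff and jitter."""
    for attempt in range(attempts):
        try:
            return operation()
        except transient:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            # Wait base_delay, then 2x, 4x, ... plus up to 100% random jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay))
    raise ValueError("attempts must be at least 1")
```

Where a service SDK already ships a tuned retry policy, prefer configuring that policy over wrapping calls in a generic helper like this one.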

Storage

Because the virtual machines hosting a web app may be restarted at any time, you risk losing any information stored in the file system. Also, if you have large media volumes, store the assets in BLOB storage rather than in the Web App, because storing them in the Web App limits scalability. Optimizely provides access to BLOB storage through a BlobProvider interface.

Note
Some third-party components, such as Lucene.NET, that use file shares or files local to the web server may have problems with high traffic in a cloud environment and are therefore not supported.
