Environment Management

Describes how to improve Optimizely Configured Commerce performance.

Optimizely Configured Commerce is a large and complex set of modules and APIs offering a tremendous depth of capabilities for business-to-business distributors and manufacturers. While we strive to ensure performance is one of the key abilities we focus on, it is also balanced against maintainability of the code, modularity of our tooling, testability and having the ability to move quickly. Software always includes some set of tradeoffs to deliver maximum value. The intent of this document is to outline some best practices and some pitfalls to watch out for when developing your Configured Commerce solution. This is by no means an exhaustive list of things to watch out for but should provide a good baseline and approach for improving overall performance of your site.

Types of performance bottlenecks

Many different areas can cause performance issues, but they tend to fall into one of several general categories:

Front end performance – these are the types of problems that come from having bad JavaScript code, too many libraries to download, and very large images. These problems tend to hit every user the same but will be exacerbated by slow connections such as on a mobile device running a responsive website.
API performance – these are problems that can crop up due to coding issues in the APIs, looping issues (that is, running the same code over and over), or database issues. These can crop up either for a single user or get worse as load increases. These problems may be in base code or in custom code.
3rd Party Interactions – this class of problem typically stems from using an external provider such as in the case of real-time calls to the ERP for pricing. These are often the most difficult to manage since they tend to be out of Insite's or the Developer's control. There are, however, some strategies to employ.

General approach

The approach for performance falls into three broad categories:

Planning – It is never too early to think about performance. Consider performance impacts when thinking through the approach to a given customization or when weighing implementation options. For example, while one could leverage Promotions to handle some specific type of pricing issue, if there will end up being a promotion for every product (resulting in 10,000+ promotions), then it will be far better to determine how to resolve the issue within the pricing pipeline and NOT leverage the promotions engine. It is always going to be more effective to anticipate and plan for a bottleneck than to resolve it once it is discovered.
Detection – The next level is to identify where the problems exist. There are many tools and approaches to determining where a problem exists. Most of these are generic and not specific to Optimizely so we will not dive too deeply into the exact tooling.
Resolution – Finally, implementing a change to the approach or code will be the ultimate way to resolve a performance problem. It is important to establish a baseline for a particular problem so you can see whether the proposed resolution actually addresses the problem detected.

When developers do their work, they are typically working with small datasets, which can hide performance problems that will be exposed in a production environment. Internally, we use a Performance database that has significantly more data and introduces some better edge cases for testing. We have a set of tests, enhanced over time, that exercise the system by using synthetic transactions against this Performance database. See the chart at the end of this document for the number of records we use for testing. If your specific instance has a significantly larger number of records than we test with, review these areas for performance. For example, if we test with 50 active promotions and you have 1,000 active promotions, there could be a significant performance issue.

The rest of this document will focus on common areas where performance issues can crop up and how to attempt to identify and/or resolve them.

Front-end design

The following are areas that we have found can be tuned for performance. We use Chrome tools and Web Page Test (http://www.webpagetest.org/) for testing the theme and front end. Since our application is a single page app (SPA), we load all the JavaScript libraries on the first page load. This can slow the site down, especially if a lot of additional JS libraries are added to the project. The items below are not in any specific order of importance.

Too much on the home page - Remembering that the home page takes a big hit on the JavaScript front and is the first page that users see, it is important to have the best imagery and loading strategy on that page specifically. Adding too many large images or data controls will slow that page down.
Large Images - It is important to optimize the images to the size being displayed. A large source file is fine as long as you resize that image to the optimal display size for the actual site rather than letting the browser scale it.
Large JavaScript Libraries - Try to leverage the existing libraries whenever you can. The current site uses libraries which will typically have the capabilities you want. Make sure you are intentional about adding additional libraries and try not to add large libraries to perform a single function.
Minimize Fonts - Font files are quite large. If you want an icon from a font, consider isolating that component rather than downloading the entire font.

Caching

Caching is the single biggest tool you can leverage to improve performance but it must be used judiciously. Configured Commerce makes heavy use of caching throughout the application but you must opt into the various settings and then implement caching strategies for custom code.

eTag Caching - This option is used to determine the state of an API request on the server so that, if a subsequent request is made and the data has not changed, the server returns a status 304 (Not Modified). This prevents the server from having to recalculate the data or marshal it back to the web client, which improves performance.
Cache Manager - Developers can make use of the cache manager to cache important data that does not change with great frequency. Consideration must be given to how much data is stored in the cache to prevent automatic cache eviction with too much data. Additionally, the developer needs to determine how long the cache should live against the cost of reloading that cache.
Shared (distributed) Cache - Configured Commerce also has a distributed cache capability through either Redis or SQL Server. This is currently used primarily for real-time inventory and pricing calls to minimize the number of calls made to the ERP. Since cache is stored in memory on each web server, using the distributed cache will first try to load the data from server memory and, if not available, will attempt to load it from the shared cache and cache it locally. This layered cache approach can make very significant improvements in overall performance.
Cache Settings - Make sure to turn on the cache settings (CMS Content, Category Menu) and set the refresh minutes as high as is reasonable. Typically caching is disabled or a low value put in place during testing so that changes are seen more quickly.
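
The layered lookup described for the shared cache can be sketched as follows. This is a minimal TypeScript illustration of the idea, not Configured Commerce's implementation: `SharedStore`, `InMemorySharedStore`, `LayeredCache`, and the `load` callback are hypothetical names, and a `Map` stands in for Redis or SQL Server. The shared store has no expiry in this sketch.

```typescript
// Hypothetical stand-in for a shared store such as Redis or SQL Server.
interface SharedStore<V> {
  get(key: string): V | undefined;
  set(key: string, value: V): void;
}

class InMemorySharedStore<V> implements SharedStore<V> {
  private readonly data = new Map<string, V>();
  get(key: string): V | undefined {
    return this.data.get(key);
  }
  set(key: string, value: V): void {
    this.data.set(key, value);
  }
}

// Per-server cache that falls back to the shared store, then to the loader.
class LayeredCache<V> {
  private readonly local = new Map<string, { value: V; expiresAt: number }>();
  private readonly shared: SharedStore<V>;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(shared: SharedStore<V>, ttlMs: number, now: () => number = () => Date.now()) {
    this.shared = shared;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string, load: (key: string) => V): V {
    const entry = this.local.get(key);
    if (entry !== undefined && entry.expiresAt > this.now()) {
      return entry.value; // 1. Served from this server's memory.
    }
    let value = this.shared.get(key); // 2. Try the shared cache.
    if (value === undefined) {
      value = load(key); // 3. Fall back to the expensive source, e.g. an ERP call.
      this.shared.set(key, value);
    }
    // Cache locally so the next request on this server skips steps 2 and 3.
    this.local.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }
}
```

With two `LayeredCache` instances sharing one store (two web servers in this analogy), the loader runs once. The second server finds the value in the shared store and then keeps its own local copy until the time-to-live expires.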

Database Access

The largest area for performance issues will typically come from calls to the database. This can be either too many calls, retrieving too much data or poor queries. The shape of the data will often impact the general performance of the database server and since each site is different, it is difficult to give specific direction as to how to detect and resolve these items.

General Approach – The best way to determine performance issues is to use a tool like dotTrace to see the specific calls and how long they take, coupled with monitoring a local copy of the database using SQL Server Profiler. Any calls that are taking longer than, say, 300 ms are targets for additional research.
Query Format – Take special care when looking at LINQ queries to make sure they are just getting the data you want.
Index Hits – As you profile the site locally, you may notice specific queries are table scanning. This happens more frequently with a large number of joins. There are a number of techniques that can be deployed, including writing direct SQL queries (instead of LINQ), creating a stored procedure, or even creating a supporting table with the data you need. If there is an index missing on standard Configured Commerce data, feel free to submit that as a ticket to support and Engineering will consider adding indexes for performance. Sometimes, as we have found, adding an index fixes a problem in one site and makes it much worse on another site. This all depends on the specific data.
Expands – Each of the APIs generally has options to expand and retrieve related data. Since these always represent additional database calls, only ask for those things out of the API you actually need.
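
As a rough illustration of asking only for the expands you need, the helper below builds a request URL from an explicit list. It is a sketch under assumptions: it assumes related data is requested through a comma-separated `expand` query parameter, and the function name, path, and expand names used with it are illustrative only. Check the API reference for the endpoints you call.

```typescript
// Builds an API URL that requests only the related data a page needs.
// Assumption: expands are passed as a comma-separated `expand` query parameter.
function buildApiUrl(path: string, expands: readonly string[]): string {
  // Trim, drop blanks, and de-duplicate so each expand is requested once.
  const unique = Array.from(
    new Set(expands.map((name) => name.trim()).filter((name) => name.length > 0)),
  );
  if (unique.length === 0) {
    return path; // No expands: the cheapest request.
  }
  const separator = path.includes("?") ? "&" : "?";
  const list = unique.map((name) => encodeURIComponent(name)).join(",");
  return `${path}${separator}expand=${list}`;
}
```

Keeping the expand list explicit at each call site makes it easier to spot a page that requests related data it never renders.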

Areas to Monitor

The following are a series of specific areas you should look at or watch out for relative to performance.

Products – Products are the largest and most important construct in the system and are the most expensive resource to retrieve. As you design, make sure to get the products you need but realize that the more products you retrieve, the slower the system will be.
Try to limit the number of entries in any given cross-sell/accessory/related product widget since each must retrieve the product and often needs to calculate inventory and pricing as well.
Keep the default page limit on the product list and search results page as low as feasible (default = 8) so that too many products do not have to be retrieved concurrently. This, of course, should be balanced against the overall needs of the site. The idea here is not to design the system to retrieve, say, 50 or 100 products concurrently.
Variant Products – When variant products were designed, we expected them to have 2-4 traits, which the standard design supports. Having, say, 10+ traits could present some display and interaction impacts on the product detail page. Additionally, if you have more than 40 variants on a product, there may be issues in pricing and inventory that must be addressed.
Custom Properties – Custom Properties are a very handy way to extend the data model for a given entity since they are generic and configurable. Each one is represented in the database as a name/value pair so the records are compact. Specifically when related to the Product entity, you must weigh the convenience of custom properties against performance concerns. For most entities, the system is not constantly reading different data so retrieving, say, 50 custom properties for a customer is likely not going to be a problem. Apply those 50 properties to the Product or Category tables, however, and you will likely have some performance concerns. One option is to alter the ProductCollection pipeline to only retrieve those properties you actually need; another approach is to use a custom table for the product instead of custom properties.
Categories – The category structure should generally not be more than about 3 levels deep. Studies show that most people will begin with search and not navigate categories or, if they do, using filters such as the attribute filters is a better experience than going through endless levels of category taxonomy.
Customer Segments – These are another powerful capability of the system and we encourage you to use segmentation to best present your merchandising message. We would only caution that having too many segments can cause management of the system and content to be confusing. When you are on a page in the site and show page source, you will be able to see the list of assigned segments to the user (label = Personas).
Attribute Types/Values – Having an excess of attributes can make search and faceting take a bit longer and can render a results page with a very long list of attributes, making the user experience confusing. Make sure to think through what attributes you want to expose for faceting and comparing, in particular.
Calculating Tax/Shipping in Cart – Tax is often calculated externally by doing an API call to a tax service or the ERP to calculate the tax on the cart. This is another expensive operation. We have designed the system to not have to recalculate when going to the checkout page, but people tend to go to the cart far more frequently than they go to the checkout page. Shipping is the same: we call out to each of the enabled services, and that can be a hit to performance, so wait to calculate it until you need to. This is controlled by a system setting.
Promotions – Promotions are a very powerful capability of the system. The way they are calculated, however, is that every active promotion is run, typically on the checkout page. While powerful given its metadata configuration approach, it also can be a bit slow. Having 10 or 15 concurrent promotions should present no significant problem but having 500+ will. Setting allow multiple promotions to NO is only of limited help since it only applies to the order; the system must still check for all line level promotions.
Languages – The number of languages is not a significant issue in the system EXCEPT when building the Elasticsearch index. The system effectively builds a separate index for each active language so only activate the languages you require.
RealTime Services – It is common practice to use APIs to retrieve pricing and inventory from the ERP which is perfectly fine to do. If the ERP allows for multiple products to be priced and inventory comes back with the same API call, this is ideal. If the ERP requires a separate call per product and, additionally, requires an additional call for each inventory item, the latency will make the system very slow. In this case, we would suggest using a Refresh job for inventory and only relying on real-time calls for pricing.
Translations – Try to create translation records only for the translations you need. If you have multiple languages and use the Generate Records option in the Admin Console, the system will create many empty records which can slow down the system. If you do not use translations at all, for even better performance, disable the Enable Translation Properties option.
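
To make the RealTime Services point concrete, the arithmetic below compares sequential per-product ERP calls with a single batched call. It is a back-of-the-envelope sketch: the function name is hypothetical and the 150 ms latency is an assumed figure, not a measurement. The page size of 8 is the default mentioned above.

```typescript
// Rough estimate of time spent on sequential real-time ERP calls for one page.
// The inputs are assumptions; measure your own ERP latency before relying on them.
function estimateSequentialLatencyMs(
  productCount: number,
  callsPerProduct: number,
  latencyPerCallMs: number,
): number {
  return productCount * callsPerProduct * latencyPerCallMs;
}

// A page of 8 products, with one pricing call and one inventory call per product.
const perProductMs = estimateSequentialLatencyMs(8, 2, 150); // 8 * 2 * 150 = 2400

// One batched call returning pricing and inventory for all 8 products.
const batchedMs = estimateSequentialLatencyMs(1, 1, 150); // 150
```

Under these assumed numbers, the per-product approach spends sixteen times as long waiting on the ERP. This is why a batched API, or a Refresh job for inventory, is the preferred design.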

Code-Specific Things to Monitor

.ToList – On LINQ queries, anything in a filter that can be converted to SQL must be included BEFORE you enumerate the query (for example, with .ToList()) so you are retrieving only the data you need and do not overload the Entity Framework context. You do not want to filter data after the data is marshalled to the client.
Projections – Use projections if you don't need every field. For example, instead of First().Id, use .Select(o => o.Id).First(). You can project parts of objects using .Select(o => new { o.Id, o.CreateBy }) to extract just the Id and CreateBy columns. This reduces database load, network bandwidth, and EF processing time.
.Any – Use .Any() instead of .Count() >= 1. .Any() stops after the first hit, while .Count() processes every candidate.
Only Return Required Data – All of the Storefront APIs have the ability to designate the specific fields to bring back in the result set by adding a query string of &select=field1,field2. Remember that this reduces the payload but does not lighten the query load on the servers.
Repeating Queries – Do not run repeating queries inside a loop. Scaling is very poor; each query requires a network round trip, causing a linear growth in run-time with more loop iterations. Instead, query for all needed data outside the loop. This will perform better even if some of this data is not used.
Stored Procedures – In general, Entity Framework will run fine and allows for ease of upgrading, simplified code, and so on, but there are times where performance is paramount. A stored procedure can outperform EF with the same queries, support multiple queries in a single network request, and efficiently process large lists of input data via Table Valued Parameters. If you are unable to tune EF to deliver acceptable performance, even after following all of the above advice, stored procedures are a possible solution.
Entity Framework tracking – EF tracks the state of all of its objects to ensure that, if something is changed, it will be committed back to the database. This is very convenient and a key feature of an ORM but takes additional overhead. If you are retrieving data that you know does not need to be modified, use GetTableAsNoTracking() when retrieving that data. If you are not using that, be mindful of how many entities you are loading into the context. You should limit it to no more than 100 and then call UnitOfWork.Save() and UnitOfWork.Clear() to limit the overhead of EF change tracking.
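
The Repeating Queries advice can be illustrated with a small simulation. The sketch below is in TypeScript rather than C#/EF, and `FakeOrderLineTable` is a hypothetical in-memory stand-in that only counts simulated round trips. It shows the shape of the fix and is not actual data-access code.

```typescript
interface OrderLine {
  orderId: number;
  productId: string;
}

// Hypothetical stand-in for a database table; each call counts as one round trip.
class FakeOrderLineTable {
  roundTrips = 0;
  private readonly rows: OrderLine[];

  constructor(rows: OrderLine[]) {
    this.rows = rows;
  }

  findByOrderIds(orderIds: readonly number[]): OrderLine[] {
    this.roundTrips += 1;
    const wanted = new Set(orderIds);
    return this.rows.filter((row) => wanted.has(row.orderId));
  }
}

// Anti-pattern: one query per loop iteration, so round trips grow with the list.
function loadLinesPerOrder(
  table: FakeOrderLineTable,
  orderIds: readonly number[],
): Map<number, OrderLine[]> {
  const result = new Map<number, OrderLine[]>();
  for (const id of orderIds) {
    result.set(id, table.findByOrderIds([id]));
  }
  return result;
}

// Preferred: one query outside the loop, then group the rows in memory.
function loadLinesBatched(
  table: FakeOrderLineTable,
  orderIds: readonly number[],
): Map<number, OrderLine[]> {
  const result = new Map<number, OrderLine[]>();
  for (const id of orderIds) {
    result.set(id, []);
  }
  for (const line of table.findByOrderIds(orderIds)) {
    result.get(line.orderId)?.push(line);
  }
  return result;
}
```

Both functions return the same grouping. For three orders the first makes three simulated round trips and the second makes one, and the gap widens linearly as the list grows.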

Performance Database

The following lists the record counts in the current database we use for performance testing. Use this as a guideline for the number of items in given tables that we test with. If your actual count is significantly greater, it is an area you should pay particular attention to relative to performance. There is no guarantee that the shape of your data will be similar enough to this database that things that are fast for our database are fast for yours, or that having significantly more records in a given table would necessarily be a problem. This is provided simply as a guideline.

Keep in mind that these target data sizes are intended to exercise the system with reasonable loads as a baseline; they are not intended to exercise the system at its limits. These numbers are based on existing sites and performance areas we want to be sure to test.

Table	# of Records	Notes
Attribute Types	1,000
Attribute Values	50,000
Carriers	50
Categories	1,000
Content	10 variant homepages
Content Item	50,000 records
Customer	52,000	Ship-tos: 100 assigned to a single customer
Custom Property	720,000	10 each for customer and products
Customers - Salesperson	1,000 customers to a salesperson
Dealer	50 in the same geographical location
Document	25 assigned to a product
Experiment	10
Global Synonyms	250
HTML Redirects	250
Images	25 Images assigned to a product
Invoice History	100 for a single customer
Invoice History Line	100 for a single order
Language	3
Order History	100 for a single customer
Order History Line	100 for a single order
Persona	10	Customer Segments
Product	165,000
Product Unit of Measure	10 UoM assigned to a product
Product Attribute Values	25 attributes for a single product
Product Cross Sells	100 for a product
Promotions	75
Restriction Groups	5,000
Restriction Group Customer	6,900
Restriction Group Product	190,000
Section	10 sections with 10 options per section	Used for configured products
Specifications	10 for a product
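
One way to apply this table is to compare your own record counts against the baseline and flag the tables that are far above it. The sketch below is illustrative only: the baseline values are copied from a few rows of the table above, the function name is hypothetical, and the default factor of 2 is an arbitrary threshold rather than an Optimizely recommendation.

```typescript
// Baseline record counts copied from a subset of the table above.
const baselineCounts: Record<string, number> = {
  "Attribute Types": 1000,
  "Categories": 1000,
  "Product": 165000,
  "Promotions": 75,
};

// Returns the tables whose actual count exceeds the baseline by more than `factor`.
// Tables with no baseline entry are skipped because there is nothing to compare.
function tablesToReview(
  actual: Record<string, number>,
  baseline: Record<string, number>,
  factor: number = 2,
): string[] {
  return Object.keys(actual)
    .filter((table) => {
      const tested = baseline[table];
      const count = actual[table];
      return tested !== undefined && count !== undefined && count > tested * factor;
    })
    .sort();
}
```

A flagged table is a prompt to profile that area, not proof of a problem, since the shape of the data matters as much as the count.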
