

Note

Semantic search is an experimental feature and may be subject to change.

Optimizely Graph uses artificial intelligence (AI) to match and rank content beyond standard lexical (literal keyword) matching. Extending a user query with context, including the meaning of its words, helps capture the user's intent. Pre-trained language models were created for this purpose, and Graph uses them to extend keyword matching. This approach is commonly called **semantic search**; related (though not strictly synonymous) names include _neural search_, _vector search_, and _dense neural retrieval_. For more technical background, see the blog post [Do you know what I mean? Introducing Semantic Search in Optimizely Graph](🔗).

Semantic search addresses the so-called _vocabulary mismatch problem_: the keywords that a user, such as a site visitor, enters may not match the words that the content marketer used. For example, a site visitor may enter "non-alcoholic cold beverage" in the search box and expect content items about "cola" to be returned. Standard search does not return content if the keywords in the query do not appear in it, but semantic search can, because it "knows" about the context of the query.

One way to solve the _vocabulary mismatch problem_ is to create [synonyms](🔗), which Optimizely Graph supports. This is very effective but can be time-consuming. Semantic search is a good way to automate the expansion of queries with synonyms and to improve the ranking of results. Another important use case is conversational AI (chatbots): feeding the chatbot relevant results helps reduce hallucination, a technique called Retrieval Augmented Generation. The vector search technology that drives semantic search can also be used to cluster or de-duplicate content and to detect anomalies.

This search technology has proven to work best in combination with traditional keyword search: together, they let the system return the most relevant content in the order that users expect. Optimizely Graph supports this combination and refers to it as semantic search, although it is effectively a hybrid of standard keyword search and pure vector search.
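As a rough illustration of what a hybrid of keyword and vector search can mean, the sketch below blends a keyword score with a cosine similarity using a weight `alpha`. This is a generic toy technique, not a description of how Optimizely Graph combines the two signals internally; the function names, the weight, the scores, and the two-dimensional vectors are all made up for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_score(keyword_score, query_vec, doc_vec, alpha=0.5):
    """Weighted blend of a keyword score in [0, 1] and vector similarity."""
    return alpha * keyword_score + (1 - alpha) * cosine(query_vec, doc_vec)


# Hypothetical toy data for the query "non-alcoholic cold beverage".
query_vec = [1.0, 0.0]
docs = {
    # No keyword overlap with the query, but close in meaning.
    "cola": (0.0, [0.8, 0.6]),
    # Shares the keyword "beverage", but unrelated in meaning.
    "beverage coaster": (0.5, [0.0, 1.0]),
}

ranked = sorted(
    docs,
    key=lambda name: hybrid_score(docs[name][0], query_vec, docs[name][1]),
    reverse=True,
)
print(ranked)
```

With these toy numbers the "cola" item ranks first even though it shares no keyword with the query, which is the behavior that pure keyword matching cannot provide.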

Note

To use semantic search with an existing account, you must first reset that account. See [the reset account functionality instructions](🔗).

## How does Semantic Search work in Optimizely Graph?

Semantic search is supported on _searchable_ string fields, and for the full-text search operators `contains` and `match`. It is recommended to set fields that have a lot of content (such as the `MainBody` in the Optimizely CMS) as searchable to unlock the full-text search capabilities. Optimizely Graph uses a pre-trained model for semantic search.

To enable semantic search, set the value of `_ranking` in the [OrderBy](🔗) to `SEMANTIC`. By default, Optimizely Graph uses `RELEVANCE`, which is standard keyword search with [BM25 relevance](🔗) ranking. You can combine `SEMANTIC` with other ranking criteria; for instance, you can use the semantic search capabilities but rank the results by a field value instead.
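The following is a minimal sketch, in Python, of how a request body that switches ranking to `SEMANTIC` might be assembled. Only `_ranking`, `SEMANTIC`, `RELEVANCE`, and the `match` operator come from the text above; the `Content` type, the `_fulltext` field, the `where` and `orderBy` argument names, and the `items { Name }` selection are assumptions about the schema, and `build_search_payload` is a hypothetical helper. Check the names against your own schema before using them.

```python
import json

# Ranking modes named in the text above.
RANKING_MODES = {"RELEVANCE", "SEMANTIC"}


def build_search_payload(phrase, ranking="SEMANTIC", content_type="Content"):
    """Return a GraphQL request body (query plus variables) as a dict."""
    if ranking not in RANKING_MODES:
        raise ValueError(f"unknown ranking mode: {ranking!r}")
    if not content_type.isidentifier():
        raise ValueError(f"invalid content type name: {content_type!r}")
    query = (
        "query Search($phrase: String) {\n"
        f"  {content_type}(\n"
        # Assumed schema: full-text `match` on a `_fulltext` field.
        "    where: { _fulltext: { match: $phrase } }\n"
        f"    orderBy: {{ _ranking: {ranking} }}\n"
        "  ) {\n"
        "    items { Name }\n"
        "  }\n"
        "}"
    )
    # The search phrase travels as a variable, not spliced into the query text.
    return {"query": query, "variables": {"phrase": phrase}}


payload = build_search_payload("action movie")
print(json.dumps(payload, indent=2))
```

Passing `ranking="RELEVANCE"` produces the same request with the default keyword ranking, which makes it easy to compare the two result orders for one phrase.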

## Supported languages

The models for semantic search are trained on English and multilingual datasets, so Graph supports semantic search for the languages with the following locale values.

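| Language | 2-letter Locale |
| --- | --- |
| Arabic | ar |
| Bulgarian | bg |
| Catalan | ca |
| CJK (special) | cjk |
| German | de |
| Greek | el |
| English | en |
| Spanish | es |
| Farsi | fa |
| Finnish | fi |
| French | fr |
| Galician | gl |
| Hindi | hi |
| Hungarian | hu |
| Armenian | hy |
| Indonesian | id |
| Italian | it |
| Japanese | ja |
| Korean | ko |
| Kurdish | ku |
| Latvian | lv |
| Dutch | nl |
| Norwegian | no |
| Polish | pl |
| Romanian | ro |
| Russian | ru |
| Swedish | sv |
| Thai | th |
| Turkish | tr |
| Ukrainian | uk |
| Chinese | zh |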
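If client code needs to decide whether to request semantic ranking for a given locale, the list above can be kept as a small lookup. This is only a sketch: `SEMANTIC_LOCALES` and `supports_semantic_search` are hypothetical names, and the check is an exact match on the locale values listed here, with no handling of region suffixes.

```python
# Locale values copied from the supported-languages list above.
SEMANTIC_LOCALES = frozenset(
    "ar bg ca cjk de el en es fa fi fr gl hi hu hy id it ja ko ku "
    "lv nl no pl ro ru sv th tr uk zh".split()
)


def supports_semantic_search(locale):
    """Return True if the locale value appears in the supported list."""
    return locale.strip().lower() in SEMANTIC_LOCALES


print(supports_semantic_search("sv"))
print(supports_semantic_search("da"))
```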

## Examples

Given the following query, with which the user wants content about `action movie`:



Optimizely Graph correctly ranks the most relevant content at the top. Note that neither "action" nor "movie" appears in the content.



Similarly, when a site visitor queries for `californian governor`:



The most relevant and expected content is returned at the top:



We can also combine different ranking criteria: rank primarily by date in ascending order, and then use the relevance score computed with semantic search as a tie-breaker.
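The effect of such a combined ordering can be illustrated locally with a small Python sketch. The items, dates, and scores below are invented for the example, and `published` and `semantic_score` are hypothetical field names. The code only mimics the described ordering (date ascending, then semantic score as a tie-breaker) and does not call Optimizely Graph.

```python
from datetime import date

# Hypothetical search hits; two of them share the same date.
hits = [
    {"name": "A", "published": date(2023, 5, 1), "semantic_score": 0.42},
    {"name": "B", "published": date(2023, 1, 15), "semantic_score": 0.31},
    {"name": "C", "published": date(2023, 5, 1), "semantic_score": 0.87},
]


def rank(items):
    """Sort by date ascending; break ties by semantic score, highest first."""
    return sorted(items, key=lambda h: (h["published"], -h["semantic_score"]))


order = [h["name"] for h in rank(hits)]
print(order)  # ['B', 'C', 'A']
```

Item B comes first because it has the earliest date. A and C share a date, so the higher semantic score places C ahead of A.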