More like
Describes how to build functionality to search for related objects using the MoreLike method in Optimizely Search & Navigation.
Use the MoreLike method to find documents whose text content is "like" a given string. Use this functionality to find related documents or objects.
Examples
A simple example can look like this:
var searchResult = client.Search<BlogPost>()
    .MoreLike("guitar")
    .GetResult();
After invoking the MoreLike method, you can customize the search query with several methods. For instance, if the index does not contain many documents with similar content, you probably want to lower the minimum document frequency. This setting makes the search ignore words that do not occur in at least that many documents, and it defaults to 5.
var searchResult = client.Search<BlogPost>()
    .MoreLike("guitar")
    .MinimumDocumentFrequency(1)
    .GetResult();
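To see why the default of 5 can leave nothing to match on in a small index, the following is a minimal, self-contained sketch of the filtering idea only. It does not use or reproduce the product's implementation; the class name MinDocFrequencySketch, the SelectTerms helper, and the sample data are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative model only: not the Search & Navigation implementation.
public static class MinDocFrequencySketch
{
    // Keeps only the terms that occur in at least minDocFrequency documents.
    public static List<string> SelectTerms(
        IEnumerable<string> queryTerms,
        IReadOnlyList<HashSet<string>> documents,
        int minDocFrequency)
    {
        return queryTerms
            .Distinct()
            .Where(term => documents.Count(doc => doc.Contains(term)) >= minDocFrequency)
            .ToList();
    }

    public static void Main()
    {
        // A tiny "index" of two documents, each reduced to its set of words.
        var documents = new List<HashSet<string>>
        {
            new HashSet<string> { "guitar", "strings", "music" },
            new HashSet<string> { "guitar", "amplifier", "music" },
        };
        var terms = new[] { "guitar", "amplifier" };

        // With a minimum of 5, no term occurs in enough documents.
        Console.WriteLine(SelectTerms(terms, documents, 5).Count); // prints 0
        // With a minimum of 1, both terms are kept.
        Console.WriteLine(SelectTerms(terms, documents, 1).Count); // prints 2
    }
}
```

With only two indexed documents, every term falls below a threshold of 5, which is why the query above lowers the value to 1.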
A full list of extension methods for customizing the query follows below. First, consider an example of finding documents "related" to a given document. Assuming you indexed two BlogPosts with similar content, you can search for documents similar to the first and expect the second in the result, using a query such as this:
var firstBlogPost = //Some indexed blog post about guitars
var secondBlogPost = //Another blog post about guitars
var searchResult = client.Search<BlogPost>()
    .MoreLike(firstBlogPost.Content)
    .MinimumDocumentFrequency(1)
    .Filter(x => !x.Id.Match(firstBlogPost.Id))
    .GetResult();
Note
When you issue these types of queries, use some caching because the result is not likely to change very often. Even if it does, a few minutes' delay might not matter.
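One possible way to apply such caching is sketched below. It assumes that the StaticallyCacheFor extension method from the product's caching feature is available in your installed version and lives in the EPiServer.Find namespace; this page does not confirm either point. The BlogPost class, the RelatedPosts helper, and the five-minute duration are hypothetical choices for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using EPiServer.Find;

// Hypothetical document type; this page does not show how BlogPost is defined.
public class BlogPost
{
    public int Id { get; set; }
    public string Content { get; set; }
}

public static class RelatedPosts
{
    // Returns posts similar to the source post, excluding the source itself.
    public static List<BlogPost> FindRelated(IClient client, BlogPost source)
    {
        return client.Search<BlogPost>()
            .MoreLike(source.Content)
            .MinimumDocumentFrequency(1)
            .Filter(x => !x.Id.Match(source.Id))
            // Assumption: StaticallyCacheFor caches the result for the given duration.
            .StaticallyCacheFor(TimeSpan.FromMinutes(5))
            .GetResult()
            .ToList();
    }
}
```

A duration of a few minutes matches the delay the note describes as acceptable; adjust it to how often related content changes on your site.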
Customize the query
Because the nature of content can differ greatly between indexes and types, it is often a good idea to experiment with the available settings after invoking the MoreLike method. You can call the following methods to customize the query. See also the Elasticsearch guide.
MinimumDocumentFrequency – The search ignores words that do not occur in at least this many documents. The default is 5.
MaximumDocumentFrequency – The search ignores words that occur in more than this many documents. The default is unbounded.
PercentTermsToMatch – The percentage of terms to match on. The default is 30 (percent).
MinimumTermFrequency – The search ignores terms that occur fewer than this many times in the source document. The default is 2.
MinimumWordLength – The search ignores words shorter than this length. The default is 0.
MaximumWordLength – The search ignores words longer than this length. The default is unbounded (0).
MaximumQueryTerms – The maximum number of query terms that the search includes in any generated query. The default is 25.
StopWords – A list of words that the search ignores as "uninteresting."
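Several of these methods can be chained in one query. The sketch below is a hedged illustration: the method names come from the list above, but the argument types (integers, and a string collection for StopWords) and the chosen values are assumptions, as are the BlogPost class and the TunedRelatedPosts helper.

```csharp
using System.Collections.Generic;
using System.Linq;
using EPiServer.Find;

// Hypothetical document type; this page does not show how BlogPost is defined.
public class BlogPost
{
    public int Id { get; set; }
    public string Content { get; set; }
}

public static class TunedRelatedPosts
{
    public static List<BlogPost> FindRelated(IClient client, BlogPost source)
    {
        return client.Search<BlogPost>()
            .MoreLike(source.Content)
            .MinimumDocumentFrequency(1)  // small index: keep rare words
            .MinimumTermFrequency(1)      // keep words that occur once in the source text
            .MinimumWordLength(3)         // skip very short words
            .MaximumQueryTerms(10)        // limit the size of the generated query
            .StopWords(new[] { "the", "and", "with" }) // assumed to accept a string collection
            .Filter(x => !x.Id.Match(source.Id))
            .GetResult()
            .ToList();
    }
}
```

The values shown are starting points for experimentation rather than recommendations; suitable settings depend on the size and nature of the index.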