Prevent indexing of PII data
Describes how to filter out personally identifiable information (PII) so that it is not indexed in Optimizely Search & Navigation.
This is an important part of managing GDPR compliance.
IGDPRConventions and ITrackSanitizerPatternRepository are used to add the filtering.
Conventions
IGDPRConventions has these methods.
Description | Sample |
---|---|
Set patterns to remove GDPR data from a search query. | public virtual void SetGDPRPatterns(List gdprPatterns) |
Get the GDPR patterns to be removed from a search query. | public virtual IEnumerable GetGDPRPatterns() |
Delete the GDPR data in the search query that matches the patterns. | public string RemoveGDPRDataInQuery(string queryStringQuery) |
ITrackSanitizerPatternRepository
ITrackSanitizerPatternRepository has these methods.
Method description | Sample |
---|---|
Add patterns to remove PII data from search query | |
Add single pattern | public string Add(TrackSanitizerPattern pattern) |
Add multiple patterns | public void Add(IEnumerable<TrackSanitizerPattern> patterns) |
Update patterns to remove PII data from search query | |
Update single pattern | public string Update(TrackSanitizerPattern pattern) |
Update multiple patterns | public bool Update(IEnumerable<TrackSanitizerPattern> patterns) |
Get patterns to remove PII in search query | |
Get all patterns | public IEnumerable<TrackSanitizerPattern> GetAll() |
Get pattern by Id | public TrackSanitizerPattern Get(string patternId) |
Delete PII data in the search query that matched the patterns | |
Delete pattern by Id | public void Delete(string patternId) |
Delete all patterns | public void DeleteAll() |
Example
Patterns can be plain text, wildcard, or regex. Here are some example filters.
- Full name – “John Smith”, “Steven” …
- Keyword contains email – “*@gmail.com”, “*@yahoo.com” …
- Regex string – “\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*” …
```csharp
using System.Collections.Generic;
// Also requires the Optimizely Search & Navigation (EPiServer.Find) namespaces.

public class Sample
{
    protected IClient _client;
    protected IStatisticsClient _statisticsClient;
    protected ITrackSanitizerPatternRepository _trackSanitizerRepository;

    public Sample(IClient client)
    {
        _client = client;
        _statisticsClient = client.Statistics();
        _trackSanitizerRepository = client.TrackSanitizer().TrackSaniziterRepository;
    }

    public void Test()
    {
        // Register the sanitizer patterns: plain text, wildcard, and regex.
        _trackSanitizerRepository.Add(new List<TrackSanitizerPattern>
        {
            new TrackSanitizerPattern
            {
                PatternString = "admin",
                PatternType = TrackSanitizerFilterType.PlainText
            },
            new TrackSanitizerPattern
            {
                PatternString = "email",
                PatternType = TrackSanitizerFilterType.PlainText
            },
            new TrackSanitizerPattern
            {
                PatternString = "*@mail.com",
                PatternType = TrackSanitizerFilterType.Wildcard
            },
            new TrackSanitizerPattern
            {
                PatternString = "1#1",
                PatternType = TrackSanitizerFilterType.Wildcard
            },
            new TrackSanitizerPattern
            {
                PatternString = "c[a-e]ll",
                PatternType = TrackSanitizerFilterType.Wildcard
            },
            new TrackSanitizerPattern
            {
                PatternString = @"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*",
                PatternType = TrackSanitizerFilterType.Regex
            }
        });

        // Run a tracked search so that the query passes through the sanitizer.
        var result = _client
            .UnifiedSearchFor(@"[email protected]")
            .StatisticsTrack()
            .GetResult();

        // Look up GDPR data for the exact term that matches a sanitizer pattern.
        var response = _statisticsClient.GetGDPR("[email protected]", x => { });
    }
}
```
The _statisticsClient.GetGDPR() API only supports exact term search, due to limitations of the statistics indexes.
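To make the difference between the three pattern types more concrete, here is a minimal, self-contained sketch that does not use the Optimizely API at all. FilterKind, PatternSketch, WildcardToRegex, and Matches are hypothetical names introduced only for illustration. The wildcard semantics are an assumption inferred from the sample patterns (`*` as any run of characters, `#` as a single digit, `[a-e]` as a character range), as is the choice to compare plain text against the whole term case-insensitively. The service may match differently.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the three pattern types; not the Optimizely enum.
public enum FilterKind { PlainText, Wildcard, Regex }

public static class PatternSketch
{
    // Translates a wildcard pattern into an anchored regular expression.
    // Assumed semantics: '*' = any characters, '#' = one digit,
    // '[...]' = character class copied unchanged.
    public static string WildcardToRegex(string wildcard)
    {
        var sb = new StringBuilder("^");
        for (int i = 0; i < wildcard.Length; i++)
        {
            char c = wildcard[i];
            if (c == '*')
            {
                sb.Append(".*");
            }
            else if (c == '#')
            {
                sb.Append(@"\d");
            }
            else if (c == '[')
            {
                int close = wildcard.IndexOf(']', i);
                if (close < 0)
                {
                    // No closing bracket: treat '[' as a literal character.
                    sb.Append(@"\[");
                    continue;
                }
                sb.Append(wildcard, i, close - i + 1);
                i = close;
            }
            else
            {
                // Escape everything else so that '.' and similar stay literal.
                sb.Append(Regex.Escape(c.ToString()));
            }
        }
        return sb.Append('$').ToString();
    }

    // Returns true when the search term matches the pattern.
    public static bool Matches(FilterKind kind, string pattern, string term)
    {
        switch (kind)
        {
            case FilterKind.PlainText:
                // Assumption: whole-term, case-insensitive comparison.
                return string.Equals(pattern, term, StringComparison.OrdinalIgnoreCase);
            case FilterKind.Wildcard:
                return Regex.IsMatch(term, WildcardToRegex(pattern), RegexOptions.IgnoreCase);
            case FilterKind.Regex:
                return Regex.IsMatch(term, pattern);
            default:
                return false;
        }
    }
}
```

Under these assumptions, `PatternSketch.Matches(FilterKind.Wildcard, "c[a-e]ll", "cell")` is true and the same call with "cull" is false. A sketch like this can be handy for checking candidate patterns against sample queries before registering them through the repository.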
Install and verify
The following steps describe how to implement and verify PII filtering. They assume the following prerequisites:
- CMS Alloy sample site (for CMS 11 and Commerce 13) installed from the Visual Studio Extension. See also Installing Optimizely .NET5 for CMS 12 and Commerce 14.
- Optimizely Search & Navigation service URL and default index name, for example http://es-api-test01.episerver.com/<PRIVATE_KEY>.
- Optimizely Search & Navigation client-side resource base URL, for example https://dl.episerver.net/13.2.0.
- Optimizely Search & Navigation 13.2.0
Install packages
- In Visual Studio, set the default project to Templates.Alloy.
- Install the following NuGet packages (use the “-pre” option to get the latest development package).
  - Find.Cms
  - Find.Statistics
- Open the Alloy web.config file and update the following entries:
  - In the <episerver.find> tag, set serviceUrl and defaultIndex.
  - In the <episerver.find.ui> tag, set clientSideResourceBaseUrl.
  - In the tag, add an item with key episerver:Find.TrackingSanitizerEnabled and value true.
- Access Admin Mode and add a GDPR test page.
  a. Go to CMS > Admin > Content Type tab > Page Types > [Specialized] Start Page > Settings.
  b. Click Available Page Types, check [Specialized] Find GDPR API Demo Page, and click Save.

- Go to the CMS Edit > navigation pane > Pages tab > Start branch of the tree structure.
- Create a GDPR Search page and publish it.

- Return to CMS > Admin view.
- Under Scheduled jobs, click Optimizely Find Content Indexing Job and start that job manually.
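As a rough illustration of the web.config entries mentioned in the installation steps, the fragment below shows how they might look together. The service URL key and index name are placeholders to replace with your own values. Placing the sanitizer setting in appSettings is an assumption based on its key and value form, because the step does not name the tag. All other sections of the file are omitted.

```xml
<configuration>
  <!-- Placeholders: use the service URL and index name of your own index. -->
  <episerver.find serviceUrl="http://es-api-test01.episerver.com/PRIVATE_KEY/"
                  defaultIndex="YOUR_INDEX_NAME" />

  <!-- Client-side resource base URL, as in the example under the prerequisites. -->
  <episerver.find.ui clientSideResourceBaseUrl="https://dl.episerver.net/13.2.0" />

  <!-- Assumption: the key/value item belongs in appSettings. -->
  <appSettings>
    <add key="episerver:Find.TrackingSanitizerEnabled" value="true" />
  </appSettings>
</configuration>
```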
Verify
In these steps we perform a search, delete the GDPR-related data, and add a filtering pattern to prevent it from being indexed.
- Open the GDPR Demo page created in the previous steps. Clear the GDPR pattern settings to verify that tracking works.

- Go to the search page and execute a search with some keywords.

- Go to the GDPR Demo page and review the displayed data.


- Delete the existing GDPR data and set patterns to prevent it.

- Search again and check for the GDPR data, which should now be filtered out.

