Prevent indexing of PII data
Describes how to filter out personally identifiable information (PII) so that it is not indexed in Optimizely Search & Navigation.
Filtering PII is an important part of managing GDPR compliance.
Use IGDPRConventions and ITrackSanitizerPatternRepository to add the filtering.
IGDPRConventions
IGDPRConventions has these methods.
- Set patterns to remove GDPR data from a search query.
public virtual void SetGDPRPatterns(List<GDPRPattern> gdprPatterns)
- Get the GDPR patterns to be removed from a search query.
public virtual IEnumerable<GDPRPattern> GetGDPRPatterns()
- Delete the GDPR data in the search query that matches the patterns.
public string RemoveGDPRDataInQuery(string queryStringQuery)
ITrackSanitizerPatternRepository
ITrackSanitizerPatternRepository has these methods.
Add patterns to remove PII data from the search query
- Add single pattern
public string Add(TrackSanitizerPattern pattern)
- Add multiple patterns
public void Add(IEnumerable<TrackSanitizerPattern> patterns)
Update patterns to remove PII data from the search query
- Update single pattern
public string Update(TrackSanitizerPattern pattern)
- Update multiple patterns
public void Update(IEnumerable<TrackSanitizerPattern> patterns)
Get patterns to remove PII data from the search query
- Get a pattern by ID
public TrackSanitizerPattern Get(string patternId)
- Get all patterns
public IEnumerable<TrackSanitizerPattern> GetAll()
Delete patterns used to remove PII data from the search query
- Delete a pattern by ID
public void Delete(string patternId)
- Delete all patterns
public void DeleteAll()
Example filters
The patterns support plain text, wildcards, and regex. Here are some example filters.
- Full name (plain text): “John Smith,” “Steven,” and so on.
- Keyword containing an email address (wildcard): *@gmail.com, *@yahoo.com, and so on.
- Regex string: \w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)* and so on.
    using System.Collections.Generic;
    // The using directives for the EPiServer.Find types below are omitted here.

    public class Sample {
        protected IClient _client;
        protected IStatisticsClient _statisticsClient;
        protected ITrackSanitizerPatternRepository _trackSanitizerRepository;

        public Sample(IClient client) {
            _client = client;
            _statisticsClient = client.Statistics();
            // The API property name is spelled TrackSaniziterRepository.
            _trackSanitizerRepository = client.TrackSanitizer().TrackSaniziterRepository;
        }

        public void Test() {
            // Add sanitizer patterns: plain text, wildcard, and regex.
            _trackSanitizerRepository.Add(new List<TrackSanitizerPattern> {
                new TrackSanitizerPattern {
                    PatternString = "admin",
                    PatternType = TrackSanitizerFilterType.PlainText
                },
                new TrackSanitizerPattern {
                    PatternString = "email",
                    PatternType = TrackSanitizerFilterType.PlainText
                },
                new TrackSanitizerPattern {
                    PatternString = "*@mail.com",
                    PatternType = TrackSanitizerFilterType.Wildcard
                },
                new TrackSanitizerPattern {
                    PatternString = "1#1",
                    PatternType = TrackSanitizerFilterType.Wildcard
                },
                new TrackSanitizerPattern {
                    PatternString = "c[a-e]ll",
                    PatternType = TrackSanitizerFilterType.Wildcard
                },
                new TrackSanitizerPattern {
                    // Verbatim string, so the backslashes need no extra escaping.
                    PatternString = @"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*",
                    PatternType = TrackSanitizerFilterType.Regex
                }
            });

            // Run a tracked search. "[email protected]" is a redacted email
            // address; substitute a test address that matches a pattern above.
            var result = _client
                .UnifiedSearchFor("[email protected]")
                .StatisticsTrack()
                .GetResult();

            // Look up GDPR data for the exact term that matched a sanitizer pattern.
            var response = _statisticsClient.GetGDPR("[email protected]", x => {});
        }
    }
The _statisticsClient.GetGDPR() API supports only exact-term search because of limitations in the statistics indexes.
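To see what the regex filter in the sample matches, the pattern can be tried in isolation with the standard .NET Regex class. This is a minimal sketch, not part of the Search & Navigation API: the class name EmailPatternDemo, the method ContainsEmail, and the test queries are hypothetical, and the sketch only shows which queries the pattern matches, not how the sanitizer treats a matching query.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmailPatternDemo
{
    // Pattern from the sample: the dot before the top-level domain is escaped.
    public const string Escaped = @"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*";

    // Variant with an unescaped dot, which matches any character in that position.
    public const string Unescaped = @"\w+([-+.]\w+)*@\w+([-.]\w+)*.\w+([-.]\w+)*";

    // Returns true if any part of the query matches the pattern.
    public static bool ContainsEmail(string query, string pattern)
    {
        return Regex.IsMatch(query, pattern);
    }

    public static void Main()
    {
        // Hypothetical search queries used only for illustration.
        var queries = new[] { "john.smith@example.com", "john@localhost", "john smith" };
        foreach (var query in queries)
        {
            Console.WriteLine(
                $"{query}: escaped={ContainsEmail(query, Escaped)}, " +
                $"unescaped={ContainsEmail(query, Unescaped)}");
        }
    }
}
```

With these inputs, both variants match the full email address and neither matches the plain name, but only the unescaped variant matches john@localhost. Escaping the dot keeps the filter from matching terms that merely contain an @ sign.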
Install and verify PII filtering
The steps below describe how to implement and verify the PII filtering. You need the following:
- CMS Alloy sample site (for CMS 11 and Commerce 13) installed from the Visual Studio Extension. See also Install Optimizely .NET5 for CMS 12 and Commerce 14.
- Optimizely Search & Navigation service URL and default index name, for example http://es-api-test01.episerver.com/<PRIVATE_KEY>.
- Optimizely Search & Navigation client-side resource base URL, for example https://dl.episerver.net/13.2.0.
- Optimizely Search & Navigation 13.2.0.
Install PII filtering packages
- In Visual Studio, set the default project to Templates.Alloy.
- Install the following NuGet packages (use the “-pre” option to get the latest development package):
  - Find.Cms
  - Find.Statistics
- Open the Alloy web.config file and update the following entries:
  - In the <episerver.find> tag: serviceUrl and defaultIndex.
  - In the <episerver.find.ui> tag: clientSideResourceBaseUrl.
  - In the <appSettings> tag: add an item with key episerver:Find.TrackingSanitizerEnabled and value true.
- Access Admin Mode and add a GDPR test page.
  a. Go to CMS > Admin > Content Type tab > Page Types > [Specialized] Start Page > Settings.
  b. Click Available Page Types, check [Specialized] Find GDPR API Demo Page, and click Save.
- Go to the CMS Edit > navigation panel > Pages tab > Start branch of the tree structure.
- Create a GDPR Search page and publish it.
- Return to CMS > Admin view.
- Under Scheduled jobs, click Optimizely Find Content Indexing Job and start that job manually.
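The web.config entries listed above might look like the following fragment. This is a sketch only: YOUR_SERVICE_URL and YOUR_INDEX_NAME are placeholders for the values of your own index, the client-side resource URL is the example value from the prerequisites, and all other sections of the Alloy web.config are omitted.

```xml
<configuration>
  <!-- Placeholders: replace with your service URL and default index name. -->
  <episerver.find serviceUrl="YOUR_SERVICE_URL" defaultIndex="YOUR_INDEX_NAME" />

  <!-- Example client-side resource base URL from the prerequisites. -->
  <episerver.find.ui clientSideResourceBaseUrl="https://dl.episerver.net/13.2.0" />

  <appSettings>
    <!-- Enables the tracking sanitizer so that the patterns are applied. -->
    <add key="episerver:Find.TrackingSanitizerEnabled" value="true" />
  </appSettings>
</configuration>
```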
Verify PII filtering
In these steps, you perform a search, delete the GDPR-related data, and add a filtering pattern to prevent it from being indexed.
- Open the GDPR Demo page created in the previous steps. Clear the GDPR pattern settings to verify that the tracking function runs well.
- Go to the search page and execute a search with some keywords.
- Go to the GDPR Demo page and review the displayed data.
- Delete the existing GDPR data and set patterns to prevent it.
- Search again and recheck for the GDPR data, which should now be filtered out.