Real-time segments definition
This topic describes the major concepts and building blocks of a real-time segment in Optimizely Data Platform (ODP).
Real-time segments are constructed from an arbitrary logical tree of conditions and an optional set of collections. The conditions determine which users belong to a segment and allow extraction of context from the interactions that led them to belong.
Note
An example segment definition is presented and explained in detail at the end of this topic.
Conditions
Membership in a segment is decided by evaluating each user against a condition tree. At the core of a segment definition is its root condition. Conditions currently come in three varieties: combination conditions, customer conditions, and sequence conditions.
Combination conditions
Combination conditions provide support for logical trees of other conditions. These allow you to combine a set of conditions using AND or OR.
Customer conditions
A customer condition continuously evaluates a customer filter expression. All fields referenced by this expression must be rooted at the customer but may include join-through values (for example, customer.email or customer.last_email_metadata.consent). The expression as a whole must produce a boolean output type. It is always evaluated against the current customer record (and its latest join-through values). The condition is matched if (and only if) the customer filter expression evaluates to true. As soon as this is no longer true, the condition returns to an unmatched state. Be aware that expressions follow SQL-like null rules, so the output can be null (neither true nor false) in the presence of nulls. A null filter output results in an unmatched condition.
Sequence conditions
A sequence condition continuously evaluates whether a given sequence meets some criteria, which currently support a minimum count (and optional maximum count) of qualified sequence instances.
Sequences
Sequences are the core value proposition of real-time segments, as they enable detection of behavioral patterns over time. At its simplest, a sequence consists of an entry_event_filter expression (which must evaluate to a boolean) and a maximum age (between 1 second and 28 days). You can then optionally extend a sequence with a goal and/or disqualifier continuation. You can also make it unique (for counting purposes) and keep a set of tracked data. The entry event filter expression may reference fields rooted at either the event or the customer (including all available join-through values), but it implicitly only reacts to events.
Sequence lifecycle
Before getting into all of the other details, it is important to understand the lifecycle of a sequence. Within the context of evaluating a given user's stream of events, any event that matches the entry event filter of a "root" sequence (more on that distinction later) spawns a new instance of that sequence. Each instance of a sequence has its own lifecycle that starts at the entry event and ends no later than max_age_seconds after that entry event.
If the sequence does not have a goal, this sequence instance is automatically "qualified", meaning it will count toward the criteria of its associated sequence condition. If there is a goal defined, then this instance does not become qualified until that goal is met. This qualified status is not permanent, however. If at any time the goal is no longer met, the instance is no longer qualified. Such an instance can gain or lose its qualified status any number of times throughout its lifecycle.
If the sequence has a disqualifier, when that disqualifier is met, the instance is immediately destroyed and no longer counts toward the criteria of its sequence condition. The instance is also destroyed as soon as max_age_seconds have elapsed since its entry event (regardless of disqualifier existence or status).
Note
The goal and disqualifier are not evaluated within a brand new sequence instance (that is, at the moment of its creation). Once an instance exists, these are evaluated with each new stimulus (event, customer-related update, or timeout).
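The lifecycle rules above can be modeled as a small state machine. The following Python sketch is illustrative only: the SequenceInstance class, its fields, and the on_stimulus method are hypothetical names and not part of ODP, and it assumes the goal and disqualifier have already been reduced to booleans for each stimulus.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SequenceInstance:
    """Hypothetical model of one sequence instance (not an ODP API)."""
    entry_ts: int            # epoch seconds of the entry event
    max_age_seconds: int
    has_goal: bool
    qualified: bool = False
    destroyed: bool = False

    def __post_init__(self) -> None:
        # Without a goal, a new instance is qualified immediately.
        # The goal and disqualifier are not evaluated at creation time.
        self.qualified = not self.has_goal

    def on_stimulus(self, now: int, goal_met: Optional[bool] = None,
                    disqualifier_met: bool = False) -> None:
        """Apply one stimulus (event, customer-related update, or timeout)."""
        if self.destroyed:
            return
        # Age-off applies regardless of disqualifier existence or status.
        if disqualifier_met or now - self.entry_ts >= self.max_age_seconds:
            self.destroyed = True
            self.qualified = False
            return
        if self.has_goal:
            # Qualified status can be gained or lost any number of times.
            self.qualified = bool(goal_met)
```

In this model, an instance with a goal starts unqualified, flips each time goal_met changes, and stops counting permanently once it is destroyed.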
Root and child sequences
Any sequence defined by a condition that is immediately within the root_condition tree is a root sequence. Any sequence defined within a goal or disqualifier continuation tree is a child sequence, belonging to its parent sequence. A child sequence may in turn be the parent of other child sequences, and so on. The lifecycle of a child sequence is tied to the lifecycle of its parent instance. So a child sequence cannot spawn new instances until after its parent has been spawned. And child instances are automatically destroyed when their parent instance is destroyed.
Unique sequences
There are many scenarios in which a condition should be based on the number of unique interactions, according to some definition of uniqueness. Sequences can define a set of unique_key_sources, which are expressions that may evaluate to any type. This does not affect the lifecycle of a sequence, but when evaluating the count for a sequence condition, only one instance is counted per unique set of values.
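As a rough illustration of how unique keys affect counting, the Python sketch below counts qualified instances once per distinct tuple of key values and then applies a minimum (and optional maximum) count. The function names and the (qualified, key values) input shape are assumptions made for this example, as is treating the maximum count as inclusive.

```python
from typing import Hashable, Iterable, Optional, Sequence, Tuple


def count_qualified(instances: Iterable[Tuple[bool, Sequence[Hashable]]],
                    unique: bool) -> int:
    """Count qualified instances, optionally once per unique key tuple.

    Each instance is given as (qualified, unique_key_values); this shape
    is hypothetical and chosen only for the example.
    """
    qualified_keys = [tuple(keys) for is_q, keys in instances if is_q]
    return len(set(qualified_keys)) if unique else len(qualified_keys)


def condition_matched(count: int, min_count: int,
                      max_count: Optional[int] = None) -> bool:
    """Sequence condition criteria: minimum count and optional maximum.

    The maximum is assumed to be inclusive.
    """
    return count >= min_count and (max_count is None or count <= max_count)
```

For example, three qualified views of which two share the same product_id count as 3 without unique keys but as 2 with event.product_id as the unique key source.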
Tracked data
Sometimes it is useful to keep track of some data from the entry event that spawned a sequence instance. To do this, a sequence may define a set of tracked_data. Each tracked data entry must have a key (unique within a segment definition) and extracts a set of field_sub_paths from a source_path available during evaluation of the entry event. You can then reference the tracked data (like key.sub.path) in expressions within continuations (and their continuations, and so on) and/or extract it via collections.
Continuations
A sequence may define a goal and/or disqualifier continuation tree. There are currently four types of continuations: combination continuations, event filter continuations, sequence condition continuations, and timeout source continuations.
Combination continuations
Combination continuations provide support for logical trees of other continuations. These allow you to combine a set of continuations with AND or OR.
Event filter continuations
An event filter continuation is satisfied as soon as any event matches the event_filter expression (which must evaluate to a boolean). This is useful when the final step of a sequence is a singular event, and there is no need to track its data or age it off before its parent sequence. The event filter expression has access to all tracked data defined by any of its ancestor sequences.
Sequence condition continuations
A sequence condition continuation enables tracking and counting of a sequence that cannot begin until after its parent sequence has been spawned. This is useful (compared to an event filter continuation) when the continuation sequence has its own continuations, needs to track its own data, needs to age off before its parent, or has specific counting requirements. Every expression within a continuation sequence has access to all tracked data defined by any of its ancestor sequences.
Timeout source continuations
A timeout source continuation allows you to trigger a goal or disqualifier after some amount of time has elapsed. More precisely, the timeout_source expression must evaluate to a number representing (in epoch seconds) the moment in time that the continuation is satisfied. Since the timeout source expression is evaluated at the moment of entry (and outside of the newly spawned instance), this expression does not have access to the tracked data defined by its immediate parent sequence. It can, however, still access those same fields directly from the entry event, just not via the tracked data key. It also has access to all tracked data defined by any of its parent sequence's ancestors.
Caution
You must take care when designing timeout source expressions to ensure that the largest possible look-forward does not create an end-to-end time exceeding the maximum age of its root sequence. Failure to do so may produce unexpected results.
Timeout source continuations can be useful in a goal when the sequence should not be considered until some minimum amount of time has elapsed. For example, added to cart in the last 28 days (sequence max_age_seconds) but at least 4 hours ago (goal timeout_source).
They can also be useful in a disqualifier when the sequence's max age should be dynamic based on input. In this case, set max_age_seconds to some reasonable upper bound, then provide a disqualifier timeout source to compute the actual value. One common scenario where you might encounter this is tracking the number of days on which something occurred. You can achieve this by using the day-truncation of event timestamp as the unique key source, then shifting that forward by some number of days for the timeout source.
The end result is that each instance ages off N days after the start of the day in which it occurred (as opposed to the constant N * 86400 seconds after the event, which would not achieve the desired behavior).
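The difference between the two age-off strategies can be seen with plain arithmetic. The Python sketch below is a simplified illustration that assumes UTC days (so truncation reduces to a modulo); the function names are hypothetical and stand in for the truncate-then-shift timeout source described above.

```python
SECONDS_PER_DAY = 86400


def day_based_timeout(event_ts: int, n_days: int) -> int:
    """Truncate to the start of the (UTC) day, then shift forward N days."""
    day_start = event_ts - event_ts % SECONDS_PER_DAY
    return day_start + n_days * SECONDS_PER_DAY


def constant_timeout(event_ts: int, n_days: int) -> int:
    """Fixed N * 86400 seconds after the event itself."""
    return event_ts + n_days * SECONDS_PER_DAY


# Two events on the same UTC day, at 09:00 and 19:00.
midnight = 1_700_000_000 - 1_700_000_000 % SECONDS_PER_DAY
morning = midnight + 9 * 3600
evening = midnight + 19 * 3600
```

With the day-based timeout, both events age off at the same moment, so "days on which something occurred" stay aligned to day boundaries. With the constant timeout, the evening event outlives the morning event by 10 hours.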
Collections
When working with sequences, it is often useful to extract context about how a user qualified for the segment. For example, in a segment that finds users who have viewed at least three unique products more than once over the last week, it would likely be desirable to know which products they viewed in such a way.
Collections allow a segment to export the tracked data from qualified instances of a sequence. A collection can optionally reduce the fields exported from the tracked data to a set of narrowed_sub_paths and/or be configured to only produce the unique rows.
Important
You cannot collect tracked data from any sequence underneath a disqualifier continuation tree.
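To make the narrowing and uniqueness options concrete, here is a minimal Python sketch of what a collection conceptually does with the tracked data of qualified instances. The collect function and the flat dictionary row shape are assumptions for illustration, not the ODP implementation.

```python
from typing import Any, Dict, Iterable, List, Optional, Sequence

Row = Dict[str, Any]


def collect(qualified_rows: Iterable[Row],
            narrowed_sub_paths: Optional[Sequence[str]] = None,
            unique: bool = False) -> List[Row]:
    """Export tracked data rows, optionally narrowed and de-duplicated.

    Rows are assumed to be flat dicts of hashable scalar values.
    """
    rows: List[Row] = []
    seen = set()
    for row in qualified_rows:
        if narrowed_sub_paths is not None:
            # Keep only the requested fields from the tracked data.
            row = {path: row.get(path) for path in narrowed_sub_paths}
        fingerprint = tuple(sorted(row.items()))
        if unique and fingerprint in seen:
            continue  # skip rows that repeat an earlier row exactly
        seen.add(fingerprint)
        rows.append(row)
    return rows
```

Narrowing before de-duplicating matters: two views of the same product at different timestamps are distinct rows until the timestamp is dropped.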
Expressions
Expressions are the heart of interaction with data. Any given expression produces a specific output type (which may or may not be dependent on an input type). The possible types are string, number, or boolean (timestamps are treated as numbers containing unix epoch seconds). Expressions currently support 13 different varieties, detailed below.
Note
Expressions follow SQL-like null rules, so the output of a boolean-producing expression can be null (neither true nor false) in the presence of null inputs.
Combination expressions
Combination expressions provide support for logical trees of other expressions. These allow you to combine a set of expressions with AND or OR. This expression outputs a boolean, and all input expressions must output a boolean. There must be at least one input expression. Combinations with no input expressions will be rejected.
Behavior in the face of null inputs depends on the conjunction. For AND, if any expression evaluates to false, the output is false. Otherwise, if any expression evaluates to null, the output is null. For OR, if any expression evaluates to true, the output is true. Otherwise, if any expression evaluates to null, the output is null.
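These null rules are the usual three-valued logic. A minimal Python sketch, using None to stand in for null (the combine function is a hypothetical name for this example):

```python
from typing import Iterable, Optional


def combine(conjunction: str,
            values: Iterable[Optional[bool]]) -> Optional[bool]:
    """Three-valued AND/OR over already-evaluated input expressions."""
    values = list(values)
    if not values:
        raise ValueError("a combination needs at least one input expression")
    if conjunction == "AND":
        if any(v is False for v in values):
            return False  # a single false decides the result
        return None if any(v is None for v in values) else True
    if conjunction == "OR":
        if any(v is True for v in values):
            return True   # a single true decides the result
        return None if any(v is None for v in values) else False
    raise ValueError(f"unknown conjunction: {conjunction}")
```

Note that AND over (false, null) is false and OR over (true, null) is true, because the known value alone decides the outcome. Only when the known values cannot decide does null propagate.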
comparison expressions
A comparison expression evaluates an lhs expression against an rhs expression according to the rules of a comparator. In general, the lhs and rhs expressions may evaluate to any type, but those types must be compatible, both with each other and with the comparator. A comparison always produces a boolean output type. If either input expression evaluates to null, the output is null.
General comparators
You can use the following comparators with any input type:
- EQUAL – determines exact equality (case sensitive for strings)
- NOT_EQUAL – determines exact inequality (case sensitive for strings)
Note
As noted previously, but worth calling out again: a null value is neither equal to, nor not-equal to, another null value. If either input is null, so is the output.
Numeric comparators
You can only use the following comparators on number inputs:
- LESS_THAN – numeric <
- LESS_THAN_OR_EQUAL – numeric <=
- GREATER_THAN – numeric >
- GREATER_THAN_OR_EQUAL – numeric >=
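The general and numeric comparators, together with the null rule, can be sketched in a few lines of Python. This is an illustrative model only: the compare function is a hypothetical name, None stands in for null, and the type check is a simplification of the compatibility rules described above.

```python
import operator
from typing import Optional, Union

Value = Union[str, int, float, bool]

GENERAL = {"EQUAL": operator.eq, "NOT_EQUAL": operator.ne}
NUMERIC = {
    "LESS_THAN": operator.lt,
    "LESS_THAN_OR_EQUAL": operator.le,
    "GREATER_THAN": operator.gt,
    "GREATER_THAN_OR_EQUAL": operator.ge,
}


def compare(lhs: Optional[Value], comparator: str,
            rhs: Optional[Value]) -> Optional[bool]:
    """Evaluate a comparison; a null on either side yields null."""
    if lhs is None or rhs is None:
        return None
    if comparator in GENERAL:
        return GENERAL[comparator](lhs, rhs)
    if comparator in NUMERIC:
        for side in (lhs, rhs):
            # Numeric comparators accept number inputs only.
            if isinstance(side, bool) or not isinstance(side, (int, float)):
                raise TypeError(f"{comparator} requires number inputs")
        return NUMERIC[comparator](lhs, rhs)
    raise ValueError(f"unknown comparator: {comparator}")
```

In this model, comparing null with null returns None for both EQUAL and NOT_EQUAL, which is why the is-missing family of expressions exists for explicit null checks.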
String comparators
You can only use the following comparators on string inputs:
- LIKE – case-insensitive fuzzy match
- NOT_LIKE – case-insensitive fuzzy non-match
Note
These comparators require a rigid input format. Specifically, the rhs expression must be a string literal (not just any expression that evaluates to a string output type). The value of this string literal defines the fuzzy-matching pattern according to the following rules (in order of precedence):
- [_] – matches a literal _ character
- * – matches zero or more characters
- Any other sequence of characters matches that sequence of characters (case-insensitive)
- The beginning and end of the pattern are anchored to the beginning and end of the lhs string value
For example, to match any Optimizely email address, the pattern would simply be [email protected]. To match any string that contains a literal underscore, the pattern would be *[_]*. To match any string that contains the literal sequence [_], the pattern would be *[[_]]*. To match any string that begins with the letter "a" (or "A"), the pattern would simply be a*.
Set-membership comparators
The following comparators evaluate to the presence or absence of some item in a set of strings or numbers.
- any_of_string is true if the supplied string is in the set of candidate strings
- none_of_string is false if the supplied string is in the set of candidate strings
- any_of_number is true if the supplied number is in the set of candidate numbers
- none_of_number is false if the supplied number is in the set of candidate numbers
- like_any_of_string is true if the supplied string is LIKE any of the candidate strings
- like_none_of_string is false if the supplied string is LIKE any of the candidate strings
path-reference expressions
A path_reference expression simply pulls the datum stored at the given path value. Depending on context, a path may be rooted at the event, the customer, or an available tracked data key. For example: event.event_type, event.product.price, customer.email, customer.last_email_metadata.consent, my_tracked_data.product.parent_product.brand.
The input path must resolve to an available datum within the expression's current context. All others are rejected. A path reference expression outputs the same type as its source datum.
string-literal expressions
A string_literal expression always produces a string output type with the specified value.
number-literal expressions
A number_literal expression always produces a number output type with the specified value. The value is given as a string to support arbitrary precision, but it must contain a numeric value (whether integer or decimal).
boolean-literal expressions
A boolean_literal expression always produces a boolean output type with the specified value.
is-missing expressions
An is_missing expression always produces a boolean output. The input expression may produce any output type. If the source expression evaluates to null, then it outputs true. For any other source value, it outputs false. It will never output null.
is-not-missing expressions
An is_not_missing expression always produces a boolean output. The input expression may produce any output type. If the source expression evaluates to null, then it outputs false. For any other source value, it outputs true. It will never output null.
false-or-missing expressions
A false_or_missing expression always produces a boolean output. The input expression must produce a boolean output type. If the source expression evaluates to either false or null, then it outputs true; otherwise false. It will never output null.
true-or-missing expressions
A true_or_missing expression always produces a boolean output. The input expression must produce a boolean output type. If the source expression evaluates to either true or null, then it outputs true; otherwise false. It will never output null.
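The four null-handling expressions above can be summarized as a Python sketch, with None standing in for null. The function names mirror the expression names, but the functions themselves are only an illustration of the stated truth tables.

```python
from typing import Any, Optional


def is_missing(source: Any) -> bool:
    """True only when the source evaluates to null."""
    return source is None


def is_not_missing(source: Any) -> bool:
    """True for any non-null source value."""
    return source is not None


def false_or_missing(source: Optional[bool]) -> bool:
    """True when the boolean source is false or null."""
    return source is None or source is False


def true_or_missing(source: Optional[bool]) -> bool:
    """True when the boolean source is true or null."""
    return source is None or source is True
```

Because none of these ever returns null, wrapping a nullable comparison in one of them gives a definite true or false, which is useful when a null result should count as a match (or as a non-match) rather than leaving a condition unmatched.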
coalesce expressions
A coalesce expression takes between 2 and 5 (inclusive) ordered inputs and outputs the first one that is not null. If all of the inputs are null, it outputs null. The sources may evaluate to any output type, but the types must all be the same. This expression will output the same type as its inputs.
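As a quick illustration, coalesce behaves like the following Python sketch, where None stands in for null and the function is a hypothetical model rather than an ODP API:

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def coalesce(*sources: Optional[T]) -> Optional[T]:
    """Return the first non-null input, or null if every input is null."""
    if not 2 <= len(sources) <= 5:
        raise ValueError("coalesce takes between 2 and 5 inputs")
    for value in sources:
        if value is not None:
            return value
    return None
```

A typical use is supplying a default, such as falling back to a fixed timezone name when the customer's timezone is not known.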
truncate-time expressions
A truncate_time expression takes a source expression (that must produce a number output type) containing a timestamp in unix epoch seconds and truncates it to a given time unit (MINUTE, HOUR, or DAY). For DAY, the computation is performed in the given timezone expression, which must evaluate to a string containing a valid IANA timezone name. If the timezone name is not recognized, the computation is performed in UTC. This expression produces a number output type containing a timestamp in unix epoch seconds. If any of the input expressions evaluate to null, the output is null.
shift-time expressions
A shift_time expression takes a source expression (that must produce a number output type) containing a timestamp in unix epoch seconds and shifts the timestamp by amount increments of the given time unit (SECOND, MINUTE, HOUR, or DAY). For DAY, the computation is performed in the given timezone expression, which must evaluate to a string containing a valid IANA timezone name. If the timezone name is not recognized, the computation is performed in UTC. This expression produces a number output type containing a timestamp in unix epoch seconds. If any of the input expressions evaluate to null, the output is null.
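The two time expressions can be approximated with Python's standard library. The sketch below is an assumption-laden model, not the ODP implementation: the function signatures are hypothetical, None stands in for null, and DAY shifts are interpreted as calendar days in the resolved timezone.

```python
from datetime import datetime, timedelta, timezone, tzinfo
from typing import Optional
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

FIXED_UNITS = {"SECOND": 1, "MINUTE": 60, "HOUR": 3600}


def _zone(name: str) -> tzinfo:
    """Resolve an IANA name, falling back to UTC when it is not recognized."""
    try:
        return ZoneInfo(name)
    except (ZoneInfoNotFoundError, ValueError):
        return timezone.utc


def truncate_time(source: Optional[int], unit: str,
                  tz_name: Optional[str] = "UTC") -> Optional[int]:
    """Truncate epoch seconds to MINUTE, HOUR, or DAY; null in, null out."""
    if source is None or tz_name is None:
        return None
    if unit in ("MINUTE", "HOUR"):
        return source - source % FIXED_UNITS[unit]
    if unit == "DAY":
        local = datetime.fromtimestamp(source, tz=_zone(tz_name))
        midnight = local.replace(hour=0, minute=0, second=0, microsecond=0)
        return int(midnight.timestamp())
    raise ValueError(f"unsupported unit: {unit}")


def shift_time(source: Optional[int], amount: Optional[int], unit: str,
               tz_name: Optional[str] = "UTC") -> Optional[int]:
    """Shift epoch seconds by amount units; null in, null out."""
    if source is None or amount is None or tz_name is None:
        return None
    if unit in FIXED_UNITS:
        return source + amount * FIXED_UNITS[unit]
    if unit == "DAY":
        local = datetime.fromtimestamp(source, tz=_zone(tz_name))
        # Aware datetime arithmetic keeps the wall-clock time of day.
        return int((local + timedelta(days=amount)).timestamp())
    raise ValueError(f"unsupported unit: {unit}")
```

Performing DAY computations in a timezone matters around daylight saving transitions, where a calendar day is not always 86400 seconds long.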
is_list_member expressions
The is_list_member expression evaluates whether a customer is subscribed to a particular list. As input, it takes the list_id of the list whose subscriptions you are checking. You can use it as part of a customer_condition.
The following example shows a segment definition that contains all customers who are subscribed to the list with list_id = test_list:
{
  "definition": {
    "root_condition": {
      "customer_condition": {
        "customer_filter": {
          "is_list_member": {
            "list": "test_list"
          }
        }
      }
    }
  },
  "description": "Test segment for subscription to test_list"
}
is_not_list_member expressions
The is_not_list_member expression evaluates whether a customer is not subscribed to a particular list. As input, it takes the list_id of the list whose subscriptions you want to check. You can use it as part of a customer_condition. When the expression returns true, either the customer is not subscribed to the list, or the list does not exist.
The following example shows a segment definition that contains all customers who are not subscribed to the list with list_id = test_list (if the list exists):
{
  "definition": {
    "root_condition": {
      "customer_condition": {
        "customer_filter": {
          "is_not_list_member": {
            "list": "test_list"
          }
        }
      }
    }
  },
  "description": "Segment contains all customers who are not subscribed to test_list"
}
Example segment definition
You can view the complete SegmentDefinition protobuf for this example below. For the purpose of exposition, this document will just pull out relevant snippets as needed.
{
"description": "high value customers who have abandoned valuable products",
"definition": {
"root_condition": {
"combination": {
"conjunction": "AND",
"conditions": [
{
"customer_condition": {
"customer_filter": {
"comparison": {
"lhs": {"path_reference": {"value": "customer.observations.total_revenue"}},
"comparator": "GREATER_THAN_OR_EQUAL",
"rhs": {"number_literal": {"value": "250"}}
}
}
}
},
{
"sequence_condition": {
"sequence": {
"entry_event_filter": {
"combination": {
"conjunction": "AND",
"expressions": [
{
"comparison": {
"lhs": {"path_reference": {"value": "event.event_type"}},
"comparator": "EQUAL",
"rhs": {"string_literal": {"value": "product"}}
}
},
{
"comparison": {
"lhs": {"path_reference": {"value": "event.action"}},
"comparator": "EQUAL",
"rhs": {"string_literal": {"value": "detail"}}
}
},
{
"comparison": {
"lhs": {"path_reference": {"value": "event.product.price"}},
"comparator": "GREATER_THAN_OR_EQUAL",
"rhs": {"number_literal": {"value": "100"}}
}
}
]
}
},
"unique_key_sources": [{"path_reference": {"value": "event.product_id"}}],
"tracked_data": [
{
"key": {"key": "pdp"},
"source_path": "event",
"field_sub_paths": ["product_id", "ts"]
}
],
"max_age_seconds": 2419200,
"goal": {
"event_filter": {
"combination": {
"conjunction": "AND",
"expressions": [
{
"comparison": {
"lhs": {"path_reference": {"value": "event.event_type"}},
"comparator": "EQUAL",
"rhs": {"string_literal": {"value": "product"}}
}
},
{
"comparison": {
"lhs": {"path_reference": {"value": "event.action"}},
"comparator": "EQUAL",
"rhs": {"string_literal": {"value": "detail"}}
}
},
{
"comparison": {
"lhs": {"path_reference": {"value": "event.product_id"}},
"comparator": "EQUAL",
"rhs": {"path_reference": {"value": "pdp.product_id"}}
}
},
{
"comparison": {
"lhs": {
"truncate_time": {
"source": {"path_reference": {"value": "event.ts"}},
"unit": "DAY",
"timezone": {
"coalesce": {
"sources": [
{"path_reference": {"value": "customer.timezone"}},
{"string_literal": {"value": "America/New_York"}}
]
}
}
}
},
"comparator": "NOT_EQUAL",
"rhs": {
"truncate_time": {
"source": {"path_reference": {"value": "pdp.ts"}},
"unit": "DAY",
"timezone": {
"coalesce": {
"sources": [
{"path_reference": {"value": "customer.timezone"}},
{"string_literal": {"value": "America/New_York"}}
]
}
}
}
}
}
}
]
}
}
},
"disqualifier": {
"event_filter": {
"combination": {
"conjunction": "OR",
"expressions": [
{
"combination": {
"conjunction": "AND",
"expressions": [
{
"comparison": {
"lhs": {"path_reference": {"value": "event.event_type"}},
"comparator": "EQUAL",
"rhs": {"string_literal": {"value": "product"}}
}
},
{
"comparison": {
"lhs": {"path_reference": {"value": "event.action"}},
"comparator": "EQUAL",
"rhs": {"string_literal": {"value": "add_to_cart"}}
}
},
{
"comparison": {
"lhs": {"path_reference": {"value": "event.product_id"}},
"comparator": "EQUAL",
"rhs": {"path_reference": {"value": "pdp.product_id"}}
}
}
]
}
},
{
"combination": {
"conjunction": "AND",
"expressions": [
{
"comparison": {
"lhs": {"path_reference": {"value": "event.event_type"}},
"comparator": "EQUAL",
"rhs": {"string_literal": {"value": "order"}}
}
},
{
"comparison": {
"lhs": {"path_reference": {"value": "event.action"}},
"comparator": "EQUAL",
"rhs": {"string_literal": {"value": "purchase"}}
}
}
]
}
}
]
}
}
}
},
"min_count": 3
}
}
]
}
},
"collections": [
{
"key": "high_interest_products",
"source": {"key": "pdp"},
"narrowed_sub_paths": ["product_id"],
"unique": true
}
]
}
}
Target audience
This segment is targeting customers who meet the following criteria:
- Have a lifetime revenue of at least $250
- Are exhibiting prolonged interest (product detail page (PDP) views on 2+ days) in at least 3 different high-value products ($100+)
- Have not added those products to cart since showing interest
- Have not made any purchase since showing interest
From a fictitious marketer's perspective, this could be a chance to convert by offering some sort of incentive, or perhaps by helping them decide among the products that they have been considering.
Condition breakdown
The four criteria above really map to two core conditions:
- A customer_condition against the lifetime revenue observation
- A sequence_condition to detect the prolonged interest (disqualified by adding it to cart or making any purchase)
Since these both must be met, the segment definition's root_condition must be a combination using AND:
{
  "root_condition": {
    "combination": {
      "conjunction": "AND",
      "conditions": []
    }
  }
}
We can then add the customer and sequence conditions into the conditions array.
High-value customer
Narrowing in on customers with lifetime revenue of at least $250 is quite simple with customer observations.
{
  "customer_condition": {
    "customer_filter": {
      "comparison": {
        "lhs": {"path_reference": {"value": "customer.observations.total_revenue"}},
        "comparator": "GREATER_THAN_OR_EQUAL",
        "rhs": {"number_literal": {"value": "250"}}
      }
    }
  }
}
This checks whether customer.observations.total_revenue (path reference) is at least 250 (number literal).
Prolonged high-value interest
We are looking for prolonged interest in at least three different high-value products. Before thinking about how to detect prolonged interest, note a few key details from the previous statement. First, we are looking for "at least 3" occurrences, so that implies a root sequence_condition with min_count of 3. Second, we are looking for unique occurrences per product, which would translate to use of event.product_id as a unique key. And third, there is no specific time-limit mentioned, so we will just start with the largest allowable max_age_seconds, which is 28 days (in seconds). With these pieces of information, we can start to outline the root sequence condition:
{
  "sequence_condition": {
    "sequence": {
      "entry_event_filter": ...,
      "unique_key_sources": [{"path_reference": {"value": "event.product_id"}}],
      "max_age_seconds": 2419200
    },
    "min_count": 3
  }
}
Now fill in more of the sequence. We are defining "prolonged interest" as PDP on 2+ days. So we will start with a "high-value" PDP entry_event_filter. In SQL-like syntax this is:
event.event_type = 'product' and event.vdl_action = 'detail' and event.product.price >= 100
We also need to hold onto which event.product_id was being viewed and when (event.ts), so we will need a tracked_data. The upgraded sequence is now:
{
"entry_event_filter": {
"combination": {
"conjunction": "AND",
"expressions": [
{
"comparison": {
"lhs": {"path_reference": {"value": "event.event_type"}},
"comparator": "EQUAL",
"rhs": {"string_literal": {"value": "product"}}
}
},
{
"comparison": {
"lhs": {"path_reference": {"value": "event.vdl_action"}},
"comparator": "EQUAL",
"rhs": {"string_literal": {"value": "detail"}}
}
},
{
"comparison": {
"lhs": {"path_reference": {"value": "event.product.price"}},
"comparator": "GREATER_THAN_OR_EQUAL",
"rhs": {"number_literal": {"value": "100"}}
}
}
]
}
},
"unique_key_sources": [{"path_reference": {"value": "event.product_id"}}],
"tracked_data": [
{
"key": {"key": "pdp"},
"source_path": "event",
"field_sub_paths": ["product_id", "ts"]
}
],
"max_age_seconds": 2419200
}
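To see how the JSON expression tree corresponds to the SQL-like filter, the following Python sketch evaluates the entry event filter against a sample event. It is a toy evaluator covering only the expression kinds used here. The evaluate and resolve_path functions, the comparison helper, and the sample event shape are all assumptions for illustration, not ODP code.

```python
from decimal import Decimal
from typing import Any, Dict

COMPARATORS = {
    "EQUAL": lambda a, b: a == b,
    "NOT_EQUAL": lambda a, b: a != b,
    "GREATER_THAN_OR_EQUAL": lambda a, b: a >= b,
}


def resolve_path(path: str, context: Dict[str, Any]) -> Any:
    """Walk a dotted path such as event.product.price; None when absent."""
    value: Any = context
    for part in path.split("."):
        if not isinstance(value, dict) or value.get(part) is None:
            return None
        value = value[part]
    return value


def evaluate(expr: Dict[str, Any], context: Dict[str, Any]) -> Any:
    """Evaluate one expression node; None stands in for null."""
    kind, body = next(iter(expr.items()))
    if kind == "path_reference":
        return resolve_path(body["value"], context)
    if kind == "string_literal":
        return body["value"]
    if kind == "number_literal":
        return Decimal(body["value"])  # the value is given as a string
    if kind == "comparison":
        lhs = evaluate(body["lhs"], context)
        rhs = evaluate(body["rhs"], context)
        if lhs is None or rhs is None:
            return None
        return COMPARATORS[body["comparator"]](lhs, rhs)
    if kind == "combination" and body["conjunction"] == "AND":
        results = [evaluate(e, context) for e in body["expressions"]]
        if any(r is False for r in results):
            return False
        return None if any(r is None for r in results) else True
    raise NotImplementedError(kind)


def comparison(path: str, comparator: str, rhs: Dict[str, Any]) -> Dict[str, Any]:
    """Build a comparison whose lhs is a path reference."""
    return {"comparison": {"lhs": {"path_reference": {"value": path}},
                           "comparator": comparator, "rhs": rhs}}


ENTRY_EVENT_FILTER = {"combination": {"conjunction": "AND", "expressions": [
    comparison("event.event_type", "EQUAL", {"string_literal": {"value": "product"}}),
    comparison("event.vdl_action", "EQUAL", {"string_literal": {"value": "detail"}}),
    comparison("event.product.price", "GREATER_THAN_OR_EQUAL",
               {"number_literal": {"value": "100"}}),
]}}

# Hypothetical sample event for a product detail page view.
pdp_view = {"event": {"event_type": "product", "vdl_action": "detail",
                      "product_id": "p1", "product": {"price": 120}}}
```

Evaluating ENTRY_EVENT_FILTER against pdp_view yields true, so that event would spawn a sequence instance. An event with no product price yields null, which does not match.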
We now have a sequence that will detect interest in a product, but not prolonged interest (yet). Before moving on to the prolonged aspect, there are a couple of "have not" statements in our target audience (have not added those products to cart since showing interest and have not made any purchase since showing interest). You can satisfy these by adding a disqualifier continuation to our root sequence.
These criteria are stated as two separate things, but you can combine them into a single event_filter continuation. In SQL-like syntax, we want to disqualify if any event matches the following filter:
(event.event_type = 'product' and event.vdl_action = 'add_to_cart' and event.product_id = pdp.product_id) or
(event.event_type = 'order' and event.vdl_action = 'purchase')
Note the reference to pdp.product_id. This uses the sequence's tracked data with key pdp to home in on the specific product that spawned the current sequence instance.
So the translated disqualifier is:
{
"event_filter": {
"combination": {
"conjunction": "OR",
"expressions": [
{
"combination": {
"conjunction": "AND",
"expressions": [
{
"comparison": {
"lhs": {"path_reference": {"value": "event.event_type"}},
"comparator": "EQUAL",
"rhs": {"string_literal": {"value": "product"}}
}
},
{
"comparison": {
"lhs": {"path_reference": {"value": "event.vdl_action"}},
"comparator": "EQUAL",
"rhs": {"string_literal": {"value": "add_to_cart"}}
}
},
{
"comparison": {
"lhs": {"path_reference": {"value": "event.product_id"}},
"comparator": "EQUAL",
"rhs": {"path_reference": {"value": "pdp.product_id"}}
}
}
]
}
},
{
"combination": {
"conjunction": "AND",
"expressions": [
{
"comparison": {
"lhs": {"path_reference": {"value": "event.event_type"}},
"comparator": "EQUAL",
"rhs": {"string_literal": {"value": "order"}}
}
},
{
"comparison": {
"lhs": {"path_reference": {"value": "event.vdl_action"}},
"comparator": "EQUAL",
"rhs": {"string_literal": {"value": "purchase"}}
}
}
]
}
}
]
}
}
}
And now we can move on to detecting prolonged interest. We have decided that the threshold here is viewing the PDP on at least 2 different days. Something to consider here is the definition of day. In an ideal world, this would be based on the user's perspective. If that is not available, we could default to the shop owner's perspective (and our fictitious shop owner operates out of New York). So we can compute the "day" of an event by truncating the timestamp to the day level in a timezone coalesced from the customer's timezone (which may not be known) and a default (America/New_York).
With that in mind, we can detect prolonged interest by adding a goal continuation that is an event_filter looking for a PDP event with the same product but on a different day (a continuation implicitly happens after the parent's entry event, so a simple not-equal comparison of the days is sufficient).
{
"event_filter": {
"combination": {
"conjunction": "AND",
"expressions": [
{
"comparison": {
"lhs": {"path_reference": {"value": "event.event_type"}},
"comparator": "EQUAL",
"rhs": {"string_literal": {"value": "product"}}
}
},
{
"comparison": {
"lhs": {"path_reference": {"value": "event.vdl_action"}},
"comparator": "EQUAL",
"rhs": {"string_literal": {"value": "detail"}}
}
},
{
"comparison": {
"lhs": {"path_reference": {"value": "event.product_id"}},
"comparator": "EQUAL",
"rhs": {"path_reference": {"value": "pdp.product_id"}}
}
},
{
"comparison": {
"lhs": {
"truncate_time": {
"source": {"path_reference": {"value": "event.ts"}},
"unit": "DAY",
"timezone": {
"coalesce": {
"sources": [
{"path_reference": {"value": "customer.timezone"}},
{"string_literal": {"value": "America/New_York"}}
]
}
}
}
},
"comparator": "NOT_EQUAL",
"rhs": {
"truncate_time": {
"source": {"path_reference": {"value": "pdp.ts"}},
"unit": "DAY",
"timezone": {
"coalesce": {
"sources": [
{"path_reference": {"value": "customer.timezone"}},
{"string_literal": {"value": "America/New_York"}}
]
}
}
}
}
}
}
]
}
}
}
Extraction of products
With the audience fully in place, we now would like to know the high-value products in which a given user showed prolonged interest. For this, we need to add the collections list to the root of the segment definition. In our case, we only need one entry in this list. The products have already been captured via tracked_data keyed as pdp, but it also contains the timestamp of the event. We only care about the unique set of products, not the individual interactions with those products, so we can first narrow down to just the product_id field, and then request unique values.
{
  "key": "high_interest_products",
  "source": {"key": "pdp"},
  "narrowed_sub_paths": ["product_id"],
  "unique": true
}
A GraphQL consumer can now access the high_interest_products via qualifications in the member audience, which also supports enrichment with additional product fields, including join-through fields from attached dimensions.