Real-time segments definition

This topic describes the major concepts and building blocks of a real-time segment in Optimizely Data Platform (ODP).

Real-time segments are constructed from an arbitrary logical tree of conditions and an optional set of collections. The conditions determine which users belong to a segment and allow extraction of context from the interactions that led them to belong.

📘

Note

An example segment definition is presented and explained step by step at the end of this topic.

Conditions

Membership in a segment is decided by evaluating each user against a condition tree. At the core of a segment definition is its root condition. Conditions currently come in three varieties: combination conditions, customer conditions, and sequence conditions.

Combination conditions

Combination conditions provide support for logical trees of other conditions. These allow you to combine a set of conditions using AND or OR.

Customer conditions

A customer condition continuously evaluates a customer filter expression. All fields referenced by this expression must be rooted at the customer but may include join-through values (for example, customer.email or customer.last_email_metadata.consent). The expression as a whole must produce a boolean output type. It is always evaluated against the current customer record (and its latest join-through values). The condition is matched if (and only if) the customer filter expression evaluates to true. As soon as this is no longer true, the condition returns to an unmatched state. Be aware that expressions follow SQL-like null rules: in the presence of nulls, the output can be null (neither true nor false). A null filter output results in an unmatched condition.

Sequence conditions

A sequence condition continuously evaluates whether a given sequence meets some criteria, which currently support a minimum count (and optional maximum count) of qualified sequence instances.

Sequences

Sequences are the core value proposition of real-time segments, as they enable detection of behavioral patterns over time. At its simplest, a sequence consists of an entry_event_filter expression (which must evaluate to a boolean) and a maximum age (between 1 second and 28 days). You can then optionally extend a sequence with a goal and/or disqualifier continuation. You can also make it unique (for counting purposes) and keep a set of tracked data. The entry event filter expression may reference fields rooted at either the event or the customer (including all available join-through values), but it implicitly only reacts to events.

Sequence lifecycle

Before getting into all of the other details, it is important to understand the lifecycle of a sequence. Within the context of evaluating a given user's stream of events, any event that matches the entry event filter of a "root" sequence (more on that distinction later) spawns a new instance of that sequence. Each instance of a sequence has its own lifecycle that starts at the entry event and ends no later than max_age_seconds after that entry event.

If the sequence does not have a goal, this sequence instance is automatically "qualified", meaning it will count toward the criteria of its associated sequence condition. If there is a goal defined, then this instance does not become qualified until that goal is met. This qualified status is not permanent, however. If at any time the goal is no longer met, the instance is no longer qualified. Such an instance can gain or lose its qualified status any number of times throughout its lifecycle.

If the sequence has a disqualifier, when that disqualifier is met, the instance is immediately destroyed and no longer counts toward the criteria of its sequence condition. The instance is also destroyed as soon as max_age_seconds have elapsed since its entry event (regardless of disqualifier existence or status).

📘

Note

The goal and disqualifier are not evaluated within a brand new sequence instance (that is, at the moment of its creation). Once an instance exists, these are evaluated with each new stimulus (event, customer-related update, or timeout).
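
The lifecycle rules above can be summarized in a short Python sketch. The class name SequenceInstance and the method on_stimulus are hypothetical and not part of ODP. The sketch assumes the goal and disqualifier have already been evaluated elsewhere and are passed in as plain booleans.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SequenceInstance:
    """Hypothetical model of a single sequence instance (not an ODP API)."""

    entry_ts: int          # epoch seconds of the entry event
    max_age_seconds: int   # lifetime limit from the sequence definition
    has_goal: bool = False
    qualified: bool = field(init=False, default=False)
    destroyed: bool = field(init=False, default=False)

    def __post_init__(self) -> None:
        # The goal is not evaluated at creation, so an instance starts
        # qualified only when the sequence defines no goal at all.
        self.qualified = not self.has_goal

    def on_stimulus(self, now: int, goal_met: Optional[bool] = None,
                    disqualifier_met: bool = False) -> None:
        """Apply one stimulus (event, customer-related update, or timeout)."""
        if self.destroyed:
            return
        # A met disqualifier or reaching max age destroys the instance.
        if disqualifier_met or now - self.entry_ts >= self.max_age_seconds:
            self.destroyed = True
            self.qualified = False
            return
        # With a goal, qualified status follows the goal's current state
        # and can flip back and forth any number of times.
        if self.has_goal and goal_met is not None:
            self.qualified = goal_met


instance = SequenceInstance(entry_ts=0, max_age_seconds=3600, has_goal=True)
instance.on_stimulus(now=60, goal_met=True)    # becomes qualified
instance.on_stimulus(now=120, goal_met=False)  # loses qualified status
instance.on_stimulus(now=3600)                 # ages off and is destroyed
```

Only instances that are qualified and not destroyed would count toward the criteria of the sequence condition.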

Root and child sequences

Any sequence defined by a condition that is immediately within the root_condition tree is a root sequence. Any sequence defined within a goal or disqualifier continuation tree is a child sequence, belonging to its parent sequence. A child sequence may in turn be the parent of other child sequences, and so on. The lifecycle of a child sequence is tied to the lifecycle of its parent instance. So a child sequence cannot spawn new instances until after its parent has been spawned. And child instances are automatically destroyed when their parent instance is destroyed.

Unique sequences

There are many scenarios in which a condition should be based on the number of unique interactions, according to some definition of uniqueness. Sequences can define a set of unique_key_sources, which are expressions that may evaluate to any type. This does not affect the lifecycle of a sequence, but when evaluating the count for a sequence condition, only one instance is counted per unique set of values.
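
As a rough illustration of unique counting, the following Python sketch counts one instance per distinct tuple of unique-key values. The function name count_qualified is hypothetical, and the sketch assumes each qualified instance has already had its unique_key_sources evaluated into a tuple.

```python
from typing import Hashable, Iterable, Tuple


def count_qualified(unique_keys: Iterable[Tuple[Hashable, ...]]) -> int:
    """Count qualified instances, one per distinct tuple of unique-key values."""
    return len(set(unique_keys))


# Three qualified instances, two of which were spawned for the same product.
keys = [("sku-1",), ("sku-2",), ("sku-1",)]
print(count_qualified(keys))  # prints 2
```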

Tracked data

Sometimes it is useful to keep track of some data from the entry event that spawned a sequence instance. To do this, a sequence may define a set of tracked_data. Each tracked data must have a key (unique within a segment definition) and extracts a set of field_sub_paths from a source_path available during evaluation of the entry event. You can then reference the tracked data (like key.sub.path) by expressions within continuations (and their continuations, and so on) and/or extract it via collections.
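
The following Python sketch shows one way to picture tracked data: the requested sub paths are copied from the source at entry time and later looked up by key. The helper names (resolve, extract_tracked_data, lookup_tracked) and the sample event are hypothetical. How ODP stores tracked data internally is not described here.

```python
from typing import Any, Dict, List


def resolve(record: Any, dotted_path: str) -> Any:
    """Walk a dotted path through nested dicts; a missing segment yields None."""
    current = record
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def extract_tracked_data(context: Dict[str, Any], source_path: str,
                         field_sub_paths: List[str]) -> Dict[str, Any]:
    """Copy each requested sub path from the source available at entry time."""
    source = resolve(context, source_path)
    return {sub: resolve(source, sub) for sub in field_sub_paths}


def lookup_tracked(tracked: Dict[str, Dict[str, Any]], reference: str) -> Any:
    """Resolve a reference such as 'pdp.product_id' (key, then sub path)."""
    key, _, sub_path = reference.partition(".")
    return tracked.get(key, {}).get(sub_path)


entry_context = {"event": {"product_id": "sku-1", "ts": 1700000000}}
tracked = {"pdp": extract_tracked_data(entry_context, "event", ["product_id", "ts"])}
print(lookup_tracked(tracked, "pdp.product_id"))  # prints sku-1
```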

Continuations

A sequence may define a goal and/or disqualifier continuation tree. There are currently four types of continuations: combination continuations, event filter continuations, sequence condition continuations, and timeout source continuations.

Combination continuations

Combination continuations provide support for logical trees of other continuations. These allow you to combine a set of continuations with AND or OR.

Event filter continuations

An event filter continuation is satisfied as soon as any event matches the event_filter expression (which must evaluate to a boolean). This is useful when the final step of a sequence is a singular event, and there is no need to track its data or age it off before its parent sequence. The event filter expression has access to all tracked data defined by any of its ancestor sequences.

Sequence condition continuations

A sequence condition continuation enables tracking and counting of a sequence that cannot begin until after its parent sequence has been spawned. This is useful (compared to an event filter continuation) when the continuation sequence has its own continuations, needs to track its own data, needs to age off before its parent, or has specific counting requirements. Every expression within a continuation sequence has access to all tracked data defined by any of its ancestor sequences.

Timeout source continuations

A timeout source continuation allows you to trigger a goal or disqualifier after some amount of time has elapsed. More precisely, the timeout_source expression must evaluate to a number representing (in epoch seconds) the moment in time that the continuation is satisfied. Since the timeout source expression is evaluated at the moment of entry (and outside of the newly spawned instance), this expression does not have access to the tracked data defined by its immediate parent sequence. It can, however, still access those same fields directly from the entry event, just not via tracked data key. It also has access to all tracked data defined by any of its parent sequence's ancestors.

🚧

Caution

You must take care when designing timeout source expressions to ensure that the largest possible look-forward does not create an end-to-end time exceeding the maximum age of its root sequence. Failure to do so may produce unexpected results.

Timeout source continuations can be useful in a goal when the sequence should not be considered until some minimum amount of time has elapsed. For example, a segment might target users who added to cart in the last 28 days (sequence max_age_seconds) but at least 4 hours ago (goal timeout_source).

They can also be useful in a disqualifier when the sequence's max age should be dynamic based on input. In this case, set max_age_seconds to some reasonable upper bound, then provide a disqualifier timeout source to compute the actual value. One common scenario where you might encounter this is tracking the number of days on which something occurred. You can achieve this by using the day-truncation of event timestamp as the unique key source, then shifting that forward by some number of days for the timeout source.

The end result is that each instance ages off N days after the start of the day in which it occurred (as opposed to the constant N * 86400 seconds after the event, which would not achieve the desired behavior).
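
To make the difference concrete, here is a small Python sketch of the arithmetic. The function name day_based_timeout is hypothetical, and the sketch uses UTC for simplicity. A real timeout source would be built from expressions and could use the customer's timezone.

```python
from datetime import datetime, timedelta, timezone


def day_based_timeout(event_ts: int, days: int) -> int:
    """Epoch seconds for the start of the event's UTC day, shifted forward by `days`."""
    moment = datetime.fromtimestamp(event_ts, tz=timezone.utc)
    start_of_day = moment.replace(hour=0, minute=0, second=0, microsecond=0)
    return int((start_of_day + timedelta(days=days)).timestamp())


morning = int(datetime(2024, 3, 1, 8, 0, tzinfo=timezone.utc).timestamp())
evening = int(datetime(2024, 3, 1, 22, 0, tzinfo=timezone.utc).timestamp())

# Both events occurred on the same day, so both instances age off together.
print(day_based_timeout(morning, 7) == day_based_timeout(evening, 7))  # prints True

# A constant offset keeps the 14-hour gap between the two events.
print((evening + 7 * 86400) - (morning + 7 * 86400))  # prints 50400
```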

Collections

When working with sequences, it is often useful to extract context about how a user qualified for the segment. For example, in a segment that finds users who have viewed at least three unique products more than once over the last week, it would likely be desirable to know which products they viewed in such a way.

Collections allow a segment to export the tracked data from qualified instances of a sequence. A collection can optionally reduce the fields exported from the tracked data to a set of narrowed_sub_paths and/or be configured to only produce the unique rows.
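
The narrowing and uniqueness options can be pictured with the following Python sketch. The function name collect and the sample rows are hypothetical. The sketch assumes each qualified instance contributes one flat row of tracked data with scalar values.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def collect(rows: List[Row], narrowed_sub_paths: Optional[List[str]] = None,
            unique: bool = False) -> List[Row]:
    """Export tracked-data rows, optionally narrowed and de-duplicated."""
    exported: List[Row] = []
    seen = set()
    for row in rows:
        if narrowed_sub_paths is not None:
            # Keep only the requested fields.
            row = {path: row.get(path) for path in narrowed_sub_paths}
        if unique:
            fingerprint = tuple(sorted(row.items()))
            if fingerprint in seen:
                continue
            seen.add(fingerprint)
        exported.append(row)
    return exported


qualified_rows = [
    {"product_id": "sku-1", "ts": 1},
    {"product_id": "sku-1", "ts": 2},
    {"product_id": "sku-2", "ts": 3},
]
print(collect(qualified_rows, ["product_id"], unique=True))
# prints [{'product_id': 'sku-1'}, {'product_id': 'sku-2'}]
```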

🚧

Important

You cannot collect tracked data from any sequence underneath a disqualifier continuation tree.

Expressions

Expressions are the heart of interaction with data. Any given expression produces a specific output type (which may or may not be dependent on an input type). The possible types are string, number, or boolean (timestamps are treated as numbers containing unix epoch seconds). Expressions currently come in several varieties, detailed below.

📘

Note

Expressions follow SQL-like null rules, such that it is possible for the output of a boolean-producing expression to be null (neither true nor false) in the presence of null inputs.

Combination expressions

Combination expressions provide support for logical trees of other expressions. These allow you to combine a set of expressions with AND or OR. This expression outputs a boolean, and all input expressions must output a boolean. There must be at least one input expression. Combinations with no input expressions will be rejected.

Behavior in the face of null inputs depends on the conjunction. For AND, if any expression evaluates to false, the output is false. Otherwise, if any expression evaluates to null, the output is null. Otherwise, the output is true. For OR, if any expression evaluates to true, the output is true. Otherwise, if any expression evaluates to null, the output is null. Otherwise, the output is false.
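
These null rules can be written out as a minimal Python sketch, using None to stand in for null. The function names and3 and or3 are hypothetical.

```python
from typing import List, Optional


def and3(values: List[Optional[bool]]) -> Optional[bool]:
    """AND with null handling: false wins, then null, otherwise true."""
    if not values:
        raise ValueError("a combination needs at least one input expression")
    if any(value is False for value in values):
        return False
    if any(value is None for value in values):
        return None
    return True


def or3(values: List[Optional[bool]]) -> Optional[bool]:
    """OR with null handling: true wins, then null, otherwise false."""
    if not values:
        raise ValueError("a combination needs at least one input expression")
    if any(value is True for value in values):
        return True
    if any(value is None for value in values):
        return None
    return False


print(and3([True, None]))   # prints None
print(and3([False, None]))  # prints False
print(or3([True, None]))    # prints True
print(or3([False, None]))   # prints None
```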

comparison expressions

A comparison expression evaluates an lhs expression against an rhs expression according to the rules of a comparator. In general, the lhs and rhs expressions may evaluate to any type, but those types must be compatible, both with each other, and with the comparator. A comparison always produces a boolean output type. If either input expression evaluates to null, the output is null.

General comparators

You can use the following comparators with any input type:

  • EQUAL – determines exact equality (case sensitive for strings)
  • NOT_EQUAL – determines exact inequality (case sensitive for strings)

📘

Note

As noted previously, but worth calling out again: a null value is neither equal to, nor not-equal to, another null value. If either input is null, so is the output.

Numeric comparators

You can only use the following comparators on number inputs:

  • LESS_THAN – numeric <
  • LESS_THAN_OR_EQUAL – numeric <=
  • GREATER_THAN – numeric >
  • GREATER_THAN_OR_EQUAL – numeric >=

String comparators

You can only use the following comparators on string inputs:

  • LIKE – case-insensitive fuzzy match
  • NOT_LIKE – case-insensitive fuzzy non-match

📘

Note

These comparators require a rigid input format. Specifically, the rhs expression must be a string literal (not just any expression that evaluates to a string output type). The value of this string literal defines the fuzzy-matching pattern according to the following rules (in order of precedence):

  • [_] – matches a literal _ character
  • * – matches zero or more characters
  • Any other sequence of characters matches that sequence of characters (case-insensitive)
  • The beginning and end of the pattern are anchored to the beginning and end of the lhs string value

For example, to match any Optimizely email address, the pattern would be *@optimizely.com. To match any string that contains a literal underscore, the pattern would be *[_]*. To match any string that contains the literal sequence [_], the pattern would be *[[_]]*. To match any string that begins with the letter "a" (or "A"), the pattern would be a*.
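
One way to reason about these pattern rules is to translate a pattern into a regular expression. The Python sketch below does that under the rules as listed: [_] is a literal underscore, * matches zero or more characters, everything else is literal, matching is case-insensitive, and the whole value must match. The function names are hypothetical, and treating a bare _ as a literal character is an assumption based on the listed rules.

```python
import re
from typing import Optional


def like_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a LIKE pattern into an equivalent compiled regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("[_]", i):
            parts.append(re.escape("_"))  # escaped literal underscore
            i += 3
        elif pattern[i] == "*":
            parts.append(".*")            # zero or more characters
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def like(value: Optional[str], pattern: str) -> Optional[bool]:
    """LIKE comparison; a null lhs produces a null result."""
    if value is None:
        return None
    # fullmatch anchors the pattern to both ends of the value.
    return like_to_regex(pattern).fullmatch(value) is not None


print(like("Apple", "a*"))        # prints True
print(like("snake_case", "*[_]*"))  # prints True
print(like("x[_]y", "*[[_]]*"))   # prints True
print(like("banana", "a*"))       # prints False
```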

Set-membership comparators

The following comparators evaluate to the presence or absence of some item in a set of strings or numbers.

  • any_of_string is true if the supplied string is in the set of candidate strings
  • none_of_string is false if the supplied string is in the set of candidate strings
  • any_of_number is true if the supplied number is in the set of candidate numbers
  • none_of_number is false if the supplied number is in the set of candidate numbers
  • like_any_of_string is true if the supplied string is LIKE any of the candidate strings
  • like_none_of_string is false if the supplied string is LIKE any of the candidate strings

path-reference expressions

A path_reference expression simply pulls the datum stored at the given path value. Depending on context, a path may either be rooted at the event, the customer, or an available tracked data key. For example: event.event_type, event.product.price, customer.email, customer.last_email_metadata.consent, my_tracked_data.product.parent_product.brand.

The input path must resolve to an available datum within the expression's current context. All others are rejected. A path reference expression outputs the same type as its source datum.

string-literal expressions

A string_literal expression always produces a string output type with the specified value.

number-literal expressions

A number_literal expression always produces a number output type with the specified value. The value is given as a string to support arbitrary precision, but it must contain a numeric value (whether integer or decimal).

boolean-literal expressions

A boolean_literal expression always produces a boolean output type with the specified value.

is-missing expressions

An is_missing expression always produces a boolean output. The input expression may produce any output type. If the source expression evaluates to null, then it outputs true. For any other source value, it outputs false. It will never output null.

is-not-missing expressions

An is_not_missing expression always produces a boolean output. The input expression may produce any output type. If the source expression evaluates to null, then it outputs false. For any other source value, it outputs true. It will never output null.

false-or-missing expressions

A false_or_missing expression always produces a boolean output. The input expression must produce a boolean output type. If the source expression evaluates to either false or null, then it outputs true; otherwise false. It will never output null.

true-or-missing expressions

A true_or_missing expression always produces a boolean output. The input expression must produce a boolean output type. If the source expression evaluates to either true or null, then it outputs true; otherwise false. It will never output null.
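
The four null-handling expressions above can be summarized in a few lines of Python, with None standing in for null. This is only a sketch of the described behavior. The function names mirror the expression names but are not an ODP API.

```python
from typing import Any, Optional


def is_missing(value: Any) -> bool:
    return value is None


def is_not_missing(value: Any) -> bool:
    return value is not None


def false_or_missing(value: Optional[bool]) -> bool:
    return value is None or value is False


def true_or_missing(value: Optional[bool]) -> bool:
    return value is None or value is True


# Each function returns a plain boolean for every input, including null.
for candidate in (True, False, None):
    print(candidate, false_or_missing(candidate), true_or_missing(candidate))
```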

coalesce expressions

A coalesce expression takes between 2 and 5 (inclusive) ordered inputs and outputs the first one that is not null. If all of the inputs are null, it outputs null. The sources may evaluate to any output type, but the types must all be the same. This expression will output the same type as its inputs.
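
A minimal Python sketch of this behavior follows, with None standing in for null. The function is illustrative only and does not check that all inputs share one type.

```python
from typing import Any


def coalesce(*sources: Any) -> Any:
    """Return the first non-null input, or None when every input is null."""
    if not 2 <= len(sources) <= 5:
        raise ValueError("coalesce takes between 2 and 5 inputs")
    for value in sources:
        if value is not None:
            return value
    return None


# Fall back to a default timezone when the customer has none set.
print(coalesce(None, "America/New_York"))  # prints America/New_York
```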

truncate-time expressions

A truncate_time expression takes a source expression (that must produce a number output type) containing a timestamp in unix epoch seconds and truncates it to a given time unit (MINUTE, HOUR, or DAY). For DAY, the computation is performed in the given timezone expression, which must evaluate to a string containing a valid IANA timezone name. If the timezone name is not recognized, the computation is performed in UTC. This expression produces a number output type containing a timestamp in unix epoch seconds. If any of the input expressions evaluate to null, the output is null.

shift-time expressions

A shift_time expression takes a source expression (that must produce a number output type) containing a timestamp in unix epoch seconds and shifts it by amount increments of the given time unit (SECOND, MINUTE, HOUR, or DAY). For DAY, the computation is performed in the given timezone expression, which must evaluate to a string containing a valid IANA timezone name. If the timezone name is not recognized, the computation is performed in UTC. This expression produces a number output type containing a timestamp in unix epoch seconds. If any of the input expressions evaluate to null, the output is null.
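
The two time expressions can be approximated in Python with the standard zoneinfo module. The sketch below is an interpretation of the description above, not ODP's implementation. In particular, shifting by DAY as whole calendar days in the given timezone is an assumption, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone, tzinfo
from typing import Optional
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

UNIT_SECONDS = {"SECOND": 1, "MINUTE": 60, "HOUR": 3600}


def resolve_zone(name: str) -> tzinfo:
    """Look up an IANA timezone, falling back to UTC when it is not recognized."""
    try:
        return ZoneInfo(name)
    except (ZoneInfoNotFoundError, ValueError):
        return timezone.utc


def truncate_time(source: Optional[int], unit: str,
                  tz_name: Optional[str] = "UTC") -> Optional[int]:
    if source is None or tz_name is None:
        return None  # null in, null out
    if unit == "DAY":
        local = datetime.fromtimestamp(source, tz=resolve_zone(tz_name))
        start = local.replace(hour=0, minute=0, second=0, microsecond=0)
        return int(start.timestamp())
    if unit not in ("MINUTE", "HOUR"):
        raise ValueError(f"unsupported unit: {unit}")
    return source - source % UNIT_SECONDS[unit]


def shift_time(source: Optional[int], amount: Optional[int], unit: str,
               tz_name: Optional[str] = "UTC") -> Optional[int]:
    if source is None or amount is None or tz_name is None:
        return None  # null in, null out
    if unit == "DAY":
        # Wall-clock arithmetic: the same local time, `amount` days later.
        local = datetime.fromtimestamp(source, tz=resolve_zone(tz_name))
        return int((local + timedelta(days=amount)).timestamp())
    return source + amount * UNIT_SECONDS[unit]


noon = int(datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc).timestamp())
start_of_day = truncate_time(noon, "DAY", "UTC")
print(shift_time(start_of_day, 7, "DAY", "UTC") - start_of_day)  # prints 604800
```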

is_list_member expressions

The is_list_member expression evaluates whether a customer is subscribed to a particular list. As an input, it takes the list_id of the list whose subscriptions you want to check. You can use it as part of a customer_condition.

The following example shows a segment definition that contains all customers who are subscribed to the list with list_id = test_list:

{
  "definition": {
    "root_condition": {
      "customer_condition": {
        "customer_filter": {
          "is_list_member": {
            "list": "test_list"
          }
        }
      }
    }
  },
  "description": "Test segment for subscription to test_list"
}

is_not_list_member expressions

The is_not_list_member expression evaluates whether a customer is not subscribed to a particular list. As an input, it takes the list_id of the list whose subscriptions you want to check. You can use it as part of a customer_condition. When the expression returns true, either the customer is not subscribed to the list, or the list does not exist.

The following example shows a segment definition that contains all customers who are not subscribed to the list with list_id = test_list (if the list exists):

{
  "definition": {
    "root_condition": {
      "customer_condition": {
        "customer_filter": {
          "is_not_list_member": {
            "list": "test_list"
          }
        }
      }
    }
  },
  "description": "Segment contains all customers who are not subscribed to test_list"
}

Example segment definition

You can view the complete SegmentDefinition protobuf for this example below. For the purpose of exposition, this document will just pull out relevant snippets as needed.

{
  "description": "high value customers who have abandoned valuable products",
  "definition": {
    "root_condition": {
      "combination": {
        "conjunction": "AND",
        "conditions": [
          {
            "customer_condition": {
              "customer_filter": {
                "comparison": {
                  "lhs": {"path_reference": {"value": "customer.observations.total_revenue"}},
                  "comparator": "GREATER_THAN_OR_EQUAL",
                  "rhs": {"number_literal": {"value": "250"}}
                }
              }
            }
          },
          {
            "sequence_condition": {
              "sequence": {
                "entry_event_filter": {
                  "combination": {
                    "conjunction": "AND",
                    "expressions": [
                      {
                        "comparison": {
                          "lhs": {"path_reference": {"value": "event.event_type"}},
                          "comparator": "EQUAL",
                          "rhs": {"string_literal": {"value": "product"}}
                        }
                      },
                      {
                        "comparison": {
                          "lhs": {"path_reference": {"value": "event.action"}},
                          "comparator": "EQUAL",
                          "rhs": {"string_literal": {"value": "detail"}}
                        }
                      },
                      {
                        "comparison": {
                          "lhs": {"path_reference": {"value": "event.product.price"}},
                          "comparator": "GREATER_THAN_OR_EQUAL",
                          "rhs": {"number_literal": {"value": "100"}}
                        }
                      }
                    ]
                  }
                },
                "unique_key_sources": [{"path_reference": {"value": "event.product_id"}}],
                "tracked_data": [
                  {
                    "key": {"key": "pdp"},
                    "source_path": "event",
                    "field_sub_paths": ["product_id", "ts"]
                  }
                ],
                "max_age_seconds": 2419200,
                "goal": {
                  "event_filter": {
                    "combination": {
                      "conjunction": "AND",
                      "expressions": [
                        {
                          "comparison": {
                            "lhs": {"path_reference": {"value": "event.event_type"}},
                            "comparator": "EQUAL",
                            "rhs": {"string_literal": {"value": "product"}}
                          }
                        },
                        {
                          "comparison": {
                            "lhs": {"path_reference": {"value": "event.action"}},
                            "comparator": "EQUAL",
                            "rhs": {"string_literal": {"value": "detail"}}
                          }
                        },
                        {
                          "comparison": {
                            "lhs": {"path_reference": {"value": "event.product_id"}},
                            "comparator": "EQUAL",
                            "rhs": {"path_reference": {"value": "pdp.product_id"}}
                          }
                        },
                        {
                          "comparison": {
                            "lhs": {
                              "truncate_time": {
                                "source": {"path_reference": {"value": "event.ts"}},
                                "unit": "DAY",
                                "timezone": {
                                  "coalesce": {
                                    "sources": [
                                      {"path_reference": {"value": "customer.timezone"}},
                                      {"string_literal": {"value": "America/New_York"}}
                                    ]
                                  }
                                }
                              }
                            },
                            "comparator": "NOT_EQUAL",
                            "rhs": {
                              "truncate_time": {
                                "source": {"path_reference": {"value": "pdp.ts"}},
                                "unit": "DAY",
                                "timezone": {
                                  "coalesce": {
                                    "sources": [
                                      {"path_reference": {"value": "customer.timezone"}},
                                      {"string_literal": {"value": "America/New_York"}}
                                    ]
                                  }
                                }
                              }
                            }
                          }
                        }
                      ]
                    }
                  }
                },
                "disqualifier": {
                  "event_filter": {
                    "combination": {
                      "conjunction": "OR",
                      "expressions": [
                        {
                          "combination": {
                            "conjunction": "AND",
                            "expressions": [
                              {
                                "comparison": {
                                  "lhs": {"path_reference": {"value": "event.event_type"}},
                                  "comparator": "EQUAL",
                                  "rhs": {"string_literal": {"value": "product"}}
                                }
                              },
                              {
                                "comparison": {
                                  "lhs": {"path_reference": {"value": "event.action"}},
                                  "comparator": "EQUAL",
                                  "rhs": {"string_literal": {"value": "add_to_cart"}}
                                }
                              },
                              {
                                "comparison": {
                                  "lhs": {"path_reference": {"value": "event.product_id"}},
                                  "comparator": "EQUAL",
                                  "rhs": {"path_reference": {"value": "pdp.product_id"}}
                                }
                              }
                            ]
                          }
                        },
                        {
                          "combination": {
                            "conjunction": "AND",
                            "expressions": [
                              {
                                "comparison": {
                                  "lhs": {"path_reference": {"value": "event.event_type"}},
                                  "comparator": "EQUAL",
                                  "rhs": {"string_literal": {"value": "order"}}
                                }
                              },
                              {
                                "comparison": {
                                  "lhs": {"path_reference": {"value": "event.action"}},
                                  "comparator": "EQUAL",
                                  "rhs": {"string_literal": {"value": "purchase"}}
                                }
                              }
                            ]
                          }
                        }
                      ]
                    }
                  }
                }
              },
              "min_count": 3
            }
          }
        ]
      }
    },
    "collections": [
      {
        "key": "high_interest_products",
        "source": {"key": "pdp"},
        "narrowed_sub_paths": ["product_id"],
        "unique": true
      }
    ]
  }
}

Target audience

This segment is targeting customers who meet the following criteria:

  1. Have a lifetime revenue of at least $250
  2. Are exhibiting prolonged interest (product detail page, or PDP, views on 2+ days) in at least 3 different high-value products ($100+)
  3. Have not added those products to cart since showing interest
  4. Have not made any purchase since showing interest

From a fictitious marketer's perspective, this could be a chance to convert by offering some sort of incentive, or perhaps by helping them decide among the products that they have been considering.

Condition breakdown

The four criteria above really map to two core conditions:

  1. A customer_condition against the lifetime revenue observation
  2. A sequence_condition to detect the prolonged interest (disqualified by adding it to cart or making any purchase)

Since these both must be met, the segment definition's root_condition must be a combination using AND:

{
  "root_condition": {
    "combination": {
      "conjunction": "AND",
      "conditions": []
    }
  }
}

We can then add the customer and sequence conditions into the conditions array.

High-value customer

Narrowing in on customers with lifetime revenue of at least $250 is quite simple with customer observations.

{
  "customer_condition": {
    "customer_filter": {
      "comparison": {
        "lhs": {"path_reference": {"value": "customer.observations.total_revenue"}},
        "comparator": "GREATER_THAN_OR_EQUAL",
        "rhs": {"number_literal": {"value": "250"}}
      }
    }
  }
}

This compares to see if customer.observations.total_revenue (path reference) is at least 250 (number literal).
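
To see how such a filter behaves against different customer records, the following Python sketch evaluates the small subset of expression kinds used here. It is a toy evaluator written for illustration, not how ODP evaluates segments. The function names and sample customer records are hypothetical.

```python
import operator
from decimal import Decimal
from typing import Any, Dict

COMPARATORS = {
    "EQUAL": operator.eq,
    "NOT_EQUAL": operator.ne,
    "LESS_THAN": operator.lt,
    "LESS_THAN_OR_EQUAL": operator.le,
    "GREATER_THAN": operator.gt,
    "GREATER_THAN_OR_EQUAL": operator.ge,
}


def resolve_path(context: Dict[str, Any], path: str) -> Any:
    """Follow a dotted path through nested dicts; a missing segment is null."""
    current: Any = context
    for part in path.split("."):
        if not isinstance(current, dict):
            return None
        current = current.get(part)
    return current


def evaluate(expr: Dict[str, Any], context: Dict[str, Any]) -> Any:
    """Evaluate path_reference, literal, and comparison expressions only."""
    (kind, body), = expr.items()
    if kind == "path_reference":
        return resolve_path(context, body["value"])
    if kind == "number_literal":
        return Decimal(body["value"])  # value is a string for precision
    if kind == "string_literal":
        return body["value"]
    if kind == "comparison":
        lhs = evaluate(body["lhs"], context)
        rhs = evaluate(body["rhs"], context)
        if lhs is None or rhs is None:
            return None  # a null input produces a null output
        return COMPARATORS[body["comparator"]](lhs, rhs)
    raise ValueError(f"unsupported expression kind: {kind}")


def condition_matched(customer_filter: Dict[str, Any], customer: Dict[str, Any]) -> bool:
    """A customer condition matches only when the filter is exactly true."""
    return evaluate(customer_filter, {"customer": customer}) is True


customer_filter = {
    "comparison": {
        "lhs": {"path_reference": {"value": "customer.observations.total_revenue"}},
        "comparator": "GREATER_THAN_OR_EQUAL",
        "rhs": {"number_literal": {"value": "250"}},
    }
}
print(condition_matched(customer_filter, {"observations": {"total_revenue": 300}}))  # prints True
print(condition_matched(customer_filter, {"observations": {}}))                      # prints False
```

The second call prints False because the missing observation makes the comparison null, and a null filter output leaves the condition unmatched.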

Prolonged high-value interest

We are looking for prolonged interest in at least three different high-value products. Before thinking about how to detect prolonged interest, note a few key details from the previous statement. First, we are looking for "at least 3" occurrences, so that implies a root sequence_condition with min_count of 3. Second, we are looking for unique occurrences per product, which would translate to use of event.product_id as a unique key. And third, there is no specific time limit mentioned, so we will just start with the largest allowable max_age_seconds, which is 28 days (in seconds). With these pieces of information, we can start to outline the root sequence condition:

{
  "sequence_condition": {
    "sequence": {
      "entry_event_filter": ...,
      "unique_key_sources": [{"path_reference": {"value": "event.product_id"}}],
      "max_age_seconds": 2419200
    },
    "min_count": 3
  }
}

Now fill in more of the sequence. We are defining "prolonged interest" as PDP on 2+ days. So we will start with a "high-value" PDP entry_event_filter. In SQL-like syntax this is:

event.event_type = 'product' and event.vdl_action = 'detail' and event.product.price >= 100

We also need to hold onto which event.product_id was being viewed and when (event.ts), so we will need a tracked_data. The upgraded sequence is now:

{
  "entry_event_filter": {
    "combination": {
      "conjunction": "AND",
      "expressions": [
        {
          "comparison": {
            "lhs": {"path_reference": {"value": "event.event_type"}},
            "comparator": "EQUAL",
            "rhs": {"string_literal": {"value": "product"}}
          }
        },
        {
          "comparison": {
            "lhs": {"path_reference": {"value": "event.vdl_action"}},
            "comparator": "EQUAL",
            "rhs": {"string_literal": {"value": "detail"}}
          }
        },
        {
          "comparison": {
            "lhs": {"path_reference": {"value": "event.product.price"}},
            "comparator": "GREATER_THAN_OR_EQUAL",
            "rhs": {"number_literal": {"value": "100"}}
          }
        }
      ]
    }
  },
  "unique_key_sources": [{"path_reference": {"value": "event.product_id"}}],
  "tracked_data": [
    {
      "key": {"key": "pdp"},
      "source_path": "event",
      "field_sub_paths": ["product_id", "ts"]
    }
  ],
  "max_age_seconds": 2419200
}
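Filter expressions like the one above are verbose to write by hand. One way to keep them readable is to generate the JSON from small helper functions. The following is a minimal sketch under that assumption: the helper names (path, string, number, compare, combine) are hypothetical and not part of any ODP SDK. They only emit the JSON shapes shown in this topic.

```python
import json

def path(value):
    # A reference to a field, such as "event.product_id".
    return {"path_reference": {"value": value}}

def string(value):
    return {"string_literal": {"value": value}}

def number(value):
    # The JSON above carries the number as a string, so this mirrors that.
    return {"number_literal": {"value": str(value)}}

def compare(lhs, comparator, rhs):
    return {"comparison": {"lhs": lhs, "comparator": comparator, "rhs": rhs}}

def combine(conjunction, *expressions):
    return {"combination": {"conjunction": conjunction,
                            "expressions": list(expressions)}}

# event.event_type = 'product' and event.vdl_action = 'detail'
# and event.product.price >= 100
entry_event_filter = combine(
    "AND",
    compare(path("event.event_type"), "EQUAL", string("product")),
    compare(path("event.vdl_action"), "EQUAL", string("detail")),
    compare(path("event.product.price"), "GREATER_THAN_OR_EQUAL", number(100)),
)

sequence = {
    "entry_event_filter": entry_event_filter,
    "unique_key_sources": [path("event.product_id")],
    "tracked_data": [
        {"key": {"key": "pdp"},
         "source_path": "event",
         "field_sub_paths": ["product_id", "ts"]}
    ],
    "max_age_seconds": 28 * 24 * 60 * 60,
}

print(json.dumps(sequence, indent=2))
```

The printed JSON has the same structure as the sequence above, and the same helpers can be reused for the continuation filters that follow.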

We now have a sequence that will detect interest in a product, but not prolonged interest (yet). Before moving on to the prolonged aspect, there are a couple of "have not" statements in our target audience (have not added those products to cart since showing interest and have not made any purchase since showing interest). You can satisfy these by adding a disqualifier continuation to our root sequence.

These criteria are stated as two separate things, but you can combine them into a single event_filter continuation. In sql-like syntax, we want to disqualify if any event matches the following filter:

(event.event_type = 'product' and event.vdl_action = 'add_to_cart' and event.product_id = pdp.product_id) or
(event.event_type = 'order' and event.vdl_action = 'purchase')

Note the reference to pdp.product_id. This uses the sequence's tracked data with key pdp to home in on the specific product that spawned the current sequence instance.

So the translated disqualifier is:

{
  "event_filter": {
    "combination": {
      "conjunction": "OR",
      "expressions": [
        {
          "combination": {
            "conjunction": "AND",
            "expressions": [
              {
                "comparison": {
                  "lhs": {"path_reference": {"value": "event.event_type"}},
                  "comparator": "EQUAL",
                  "rhs": {"string_literal": {"value": "product"}}
                }
              },
              {
                "comparison": {
                  "lhs": {"path_reference": {"value": "event.vdl_action"}},
                  "comparator": "EQUAL",
                  "rhs": {"string_literal": {"value": "add_to_cart"}}
                }
              },
              {
                "comparison": {
                  "lhs": {"path_reference": {"value": "event.product_id"}},
                  "comparator": "EQUAL",
                  "rhs": {"path_reference": {"value": "pdp.product_id"}}
                }
              }
            ]
          }
        },
        {
          "combination": {
            "conjunction": "AND",
            "expressions": [
              {
                "comparison": {
                  "lhs": {"path_reference": {"value": "event.event_type"}},
                  "comparator": "EQUAL",
                  "rhs": {"string_literal": {"value": "order"}}
                }
              },
              {
                "comparison": {
                  "lhs": {"path_reference": {"value": "event.vdl_action"}},
                  "comparator": "EQUAL",
                  "rhs": {"string_literal": {"value": "purchase"}}
                }
              }
            ]
          }
        }
      ]
    }
  }
}
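To make the scoping of this disqualifier concrete, here is a minimal sketch that applies the same logic to plain dictionaries. It is only an illustration of the filter's intent, not how ODP evaluates expressions. The function name, the sample events, and the product IDs are hypothetical, and the sketch ignores the sql-like null rules.

```python
def is_disqualifying(event, pdp):
    """Return True if `event` matches the disqualifier for the
    sequence instance whose tracked data is `pdp`."""
    added_same_product = (
        event.get("event_type") == "product"
        and event.get("vdl_action") == "add_to_cart"
        and event.get("product_id") == pdp["product_id"]
    )
    purchased_anything = (
        event.get("event_type") == "order"
        and event.get("vdl_action") == "purchase"
    )
    return added_same_product or purchased_anything

# Tracked data for one sequence instance (hypothetical values).
pdp = {"product_id": "sku-1", "ts": 1700000000}

same_product_cart = {"event_type": "product", "vdl_action": "add_to_cart",
                     "product_id": "sku-1"}
other_product_cart = {"event_type": "product", "vdl_action": "add_to_cart",
                      "product_id": "sku-2"}
purchase = {"event_type": "order", "vdl_action": "purchase"}

print(is_disqualifying(same_product_cart, pdp))   # True
print(is_disqualifying(other_product_cart, pdp))  # False
print(is_disqualifying(purchase, pdp))            # True
```

An add-to-cart disqualifies only the instance for that same product, while a purchase matches regardless of product, so it disqualifies every open instance.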

Now we can move on to detecting prolonged interest. We have decided that the threshold is a PDP view on at least 2 different days, matching the "2+ days" definition above. Something to consider here is the definition of a day. In an ideal world, this would be based on the user's perspective. If that is not available, we could default to the shop owner's perspective (and our fictitious shop owner operates out of New York). Given that, we can compute the "day" of an event by truncating the timestamp to the day level in a timezone coalesced from the customer's timezone (which may not be known) and a default (America/New_York).

We can then detect prolonged interest by adding a goal continuation that is an event_filter looking for a PDP event with the same product but on a different day. A continuation implicitly happens after the parent's entry event, so a simple not-equal comparison of the days is sufficient.

{
  "event_filter": {
    "combination": {
      "conjunction": "AND",
      "expressions": [
        {
          "comparison": {
            "lhs": {"path_reference": {"value": "event.event_type"}},
            "comparator": "EQUAL",
            "rhs": {"string_literal": {"value": "product"}}
          }
        },
        {
          "comparison": {
            "lhs": {"path_reference": {"value": "event.vdl_action"}},
            "comparator": "EQUAL",
            "rhs": {"string_literal": {"value": "detail"}}
          }
        },
        {
          "comparison": {
            "lhs": {"path_reference": {"value": "event.product_id"}},
            "comparator": "EQUAL",
            "rhs": {"path_reference": {"value": "pdp.product_id"}}
          }
        },
        {
          "comparison": {
            "lhs": {
              "truncate_time": {
                "source": {"path_reference": {"value": "event.ts"}},
                "unit": "DAY",
                "timezone": {
                  "coalesce": {
                    "sources": [
                      {"path_reference": {"value": "customer.timezone"}},
                      {"string_literal": {"value": "America/New_York"}}
                    ]
                  }
                }
              }
            },
            "comparator": "NOT_EQUAL",
            "rhs": {
              "truncate_time": {
                "source": {"path_reference": {"value": "pdp.ts"}},
                "unit": "DAY",
                "timezone": {
                  "coalesce": {
                    "sources": [
                      {"path_reference": {"value": "customer.timezone"}},
                      {"string_literal": {"value": "America/New_York"}}
                    ]
                  }
                }
              }
            }
          }
        }
      ]
    }
  }
}
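The day comparison in this filter can be sketched in plain Python to show why the timezone matters. This is a rough illustration only: it assumes event.ts is a Unix timestamp in seconds, which this topic does not state, and the function name event_day is hypothetical. It uses the standard-library zoneinfo module, which needs a timezone database available on the system.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

DEFAULT_TZ = "America/New_York"  # the shop owner's timezone from the text

def event_day(ts_seconds, customer_timezone=None):
    """Truncate a timestamp to a calendar day, using the customer's
    timezone when known and the default otherwise (the coalesce)."""
    tz_name = customer_timezone if customer_timezone is not None else DEFAULT_TZ
    utc_time = datetime.fromtimestamp(ts_seconds, tz=timezone.utc)
    return utc_time.astimezone(ZoneInfo(tz_name)).date()

# Two hypothetical PDP views three hours apart.
first = datetime(2024, 1, 15, 3, 30, tzinfo=timezone.utc).timestamp()
second = datetime(2024, 1, 15, 6, 30, tzinfo=timezone.utc).timestamp()

# No customer timezone: falls back to New York, where the views
# straddle midnight and so count as different days.
print(event_day(first), event_day(second))   # 2024-01-14 2024-01-15
print(event_day(first) != event_day(second)) # True

# Customer known to be on UTC: both views fall on the same day.
print(event_day(first, "UTC") != event_day(second, "UTC"))  # False
```

The same pair of events satisfies the NOT_EQUAL comparison in one timezone and not in the other, which is why the filter coalesces to a sensible default instead of comparing raw timestamps.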

Extraction of products

With the audience fully in place, we would now like to know the high-value products in which a given user showed prolonged interest. For this, we need to add the collections list to the root of the segment definition. In our case, we only need one entry in this list. The products have already been captured via tracked_data keyed as pdp, but that tracked data also contains the timestamp of the event. We only care about the unique set of products, not the individual interactions with those products, so we can first narrow down to just the product_id field, and then request unique values.

{
  "key": "high_interest_products",
  "source": {"key": "pdp"},
  "narrowed_sub_paths": ["product_id"],
  "unique": true
}
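The effect of narrowing and then requesting unique values can be sketched as follows. This illustrates the described behavior on hypothetical sample records. It is not ODP's implementation, and the function name collect is invented for the example.

```python
def collect(tracked_records, narrowed_sub_paths, unique):
    """Keep only the narrowed fields of each record, then optionally
    drop duplicates while preserving first-seen order."""
    narrowed = [{key: record[key] for key in narrowed_sub_paths}
                for record in tracked_records]
    if not unique:
        return narrowed
    seen = set()
    result = []
    for item in narrowed:
        marker = tuple(item[key] for key in narrowed_sub_paths)
        if marker not in seen:
            seen.add(marker)
            result.append(item)
    return result

# Hypothetical tracked data keyed as pdp: product_id plus event timestamp.
pdp_records = [
    {"product_id": "sku-1", "ts": 1},
    {"product_id": "sku-2", "ts": 2},
    {"product_id": "sku-1", "ts": 3},
]

high_interest_products = collect(pdp_records, ["product_id"], unique=True)
print(high_interest_products)
# [{'product_id': 'sku-1'}, {'product_id': 'sku-2'}]
```

Without the narrowing step, the differing timestamps would make every record distinct, so requesting unique values alone would not collapse repeated views of the same product.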

A GraphQL consumer can now access the high_interest_products via qualifications in the member audience, which also supports enrichment with additional product fields, including join-through fields from attached dimensions.