
Migrating to Enriched Events

Deprecation Notice

Enriched Events Export replaces Results Export and Raw Events Export as Optimizely's source-of-truth events dataset. Results Export and Raw Events Export will no longer be supported beginning November 15, 2020. Please refer to the Enriched Events Export documentation for how to get started.

Migration Steps

  1. Update how you access your AWS S3 buckets by using our new Authentication API.
  2. Update your existing scripts, or create a new task, to pull data from the Enriched Events Export S3 buckets using the new data schemas.
  3. Update the data transformations and queries that you run on top of your export datasets in your data warehouse.
  4. (optional) Update the historical Raw Events Export data in your systems to match the new Enriched Events schemas.

Raw Events Export ⟶ Enriched Events Export

In the Enriched Events Export, we’ve removed several fields that were unused in either the decision or conversion event type, renamed a few keys to make their values clearer, and added fields like process_timestamp to better help you find the events relevant to you. These new events will be cleaner, contain fewer null values, and be easier to understand.

Enriched Events Export (E3) produces fewer rows than the Raw Events Export because it uses a more efficient schema and partitioning logic that splits decisions and conversions into their own datasets. In the Raw Events Export, a conversion event attributed to X experiments is exported X times (one row per experiment), with each copy written to that experiment's partition (i.e. S3 folder). In E3, the event is exported once, and the attributed experiment info is stored in the experiments array (see E3 schemas here).
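To make the row-count difference concrete, here is a minimal Python sketch that collapses per-experiment copies of a raw conversion into a single row with an experiments list. The uuid key used to recognize copies of the same event, and the exact field names on the raw rows, are assumptions made for illustration; check them against the published schemas before relying on them.

```python
def collapse_raw_conversions(raw_rows):
    """Group per-experiment copies of a conversion into one row each.

    Assumes every raw row carries a `uuid` field (hypothetical name) that is
    identical across the copies of one event.
    """
    merged = {}  # dicts keep insertion order, so output order follows input
    for row in raw_rows:
        key = row["uuid"]
        if key not in merged:
            merged[key] = {
                "uuid": key,
                "event_name": row["event_name"],
                "experiments": [],
            }
        # Each raw copy contributes one entry to the experiments list.
        merged[key]["experiments"].append({
            "experiment_id": row["experiment_id"],
            "variation_id": row["variation_id"],
        })
    return list(merged.values())
```

With this grouping, three raw rows that describe two distinct events (one of them attributed to two experiments) become two rows, which is why row counts between the two exports are not directly comparable.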

Raw Event to Enriched Event Decision:
We’ve added process_timestamp to the Enriched decision event, renamed end_user_id to visitor_id, and removed event_type, event_name, event_features, and event_metrics, all of which were always null in Raw decision events.
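If you choose to reshape historical Raw decision rows (the optional migration step), the field changes above could be applied roughly as follows. This is a sketch, not an official converter: it assumes rows are plain dictionaries, and because process_timestamp does not exist in the raw data it is left as None unless you supply a value.

```python
# Fields that were always null on Raw decision events and are dropped in E3.
RAW_DECISION_DROPPED_FIELDS = (
    "event_type",
    "event_name",
    "event_features",
    "event_metrics",
)


def raw_decision_to_enriched(raw, process_timestamp=None):
    """Return a copy of a raw decision row using the Enriched field names."""
    enriched = {
        key: value
        for key, value in raw.items()
        if key not in RAW_DECISION_DROPPED_FIELDS and key != "end_user_id"
    }
    enriched["visitor_id"] = raw["end_user_id"]  # renamed field
    enriched["process_timestamp"] = process_timestamp  # new field, unknown for old data
    return enriched
```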

Raw Event to Enriched Event Conversion:
We’ve made several field names more explicit: event_features is now tags, user_features is now attributes, and event_metrics is split into revenue, value, and quantity. We’ve also made entity_id available to you, in case you haven’t defined a human-readable event name for your conversion. Experiment ID, variation ID, campaign ID, and holdback have been collapsed into the experiments column.
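The conversion field changes can be sketched in Python as a single-row mapping. Several details here are assumptions for illustration: that event_metrics is a dictionary keyed by revenue, value, and quantity; that the raw columns are named campaign_id, experiment_id, variation_id, and holdback; and that a missing experiment_id means the event is not attributed to any experiment.

```python
def raw_conversion_to_enriched(raw):
    """Return a copy of a raw conversion row using the Enriched field names."""
    enriched = dict(raw)

    # Renamed fields.
    enriched["tags"] = enriched.pop("event_features", None)
    enriched["attributes"] = enriched.pop("user_features", None)

    # event_metrics is split into three top-level fields.
    metrics = enriched.pop("event_metrics", None) or {}
    for name in ("revenue", "value", "quantity"):
        enriched[name] = metrics.get(name)

    # Experiment-level columns are folded into one entry of the experiments list.
    experiment = {
        "campaign_id": enriched.pop("campaign_id", None),
        "experiment_id": enriched.pop("experiment_id", None),
        "variation_id": enriched.pop("variation_id", None),
        "is_holdback": enriched.pop("holdback", None),
    }
    attributed = experiment["experiment_id"] is not None
    enriched["experiments"] = [experiment] if attributed else []
    return enriched
```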

Data Comparisons
Here are a few examples of paired queries you can run against the Enriched Events Export and the Raw Events Export to verify that the data in the two datasets remains consistent. Small differences are expected; tolerance thresholds range from 1% to 5%.

Unique Visitor Counts

-- E3
SELECT COUNT(DISTINCT visitor_id) AS unique_visitors
FROM e3_decisions
WHERE experiment_id = '10728121502'
  AND timestamp BETWEEN '2020-06-01 01:00:00.000'
    AND '2020-06-01 23:00:00.000'
  AND is_holdback = false;

-- RAW
-- Timestamps are epoch milliseconds; decisions have a null event_type and event_name.
SELECT COUNT(DISTINCT end_user_id) AS unique_visitors
FROM raw_export
WHERE experiment_id = '10728121502'
  AND timestamp BETWEEN 1590973200000 AND 1591052400000
  AND event_type IS NULL
  AND event_name IS NULL;
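The two queries express the same time window in different formats: the raw export uses epoch milliseconds, while the E3 query uses timestamp strings. A small Python helper, assuming the E3 timestamps are in UTC (which is consistent with the values in the queries above), can convert between them when you build paired queries:

```python
from datetime import datetime, timezone


def epoch_ms_to_utc_string(epoch_ms):
    """Format epoch milliseconds as 'YYYY-MM-DD HH:MM:SS.mmm' in UTC."""
    # Split with integer arithmetic to avoid floating-point rounding.
    seconds, millis = divmod(epoch_ms, 1000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return f"{moment:%Y-%m-%d %H:%M:%S}.{millis:03d}"


print(epoch_ms_to_utc_string(1590973200000))  # 2020-06-01 01:00:00.000
print(epoch_ms_to_utc_string(1591052400000))  # 2020-06-01 23:00:00.000
```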

Conversion Counts

-- E3
SELECT COUNT(*) AS event_count
FROM e3_conversions
LATERAL VIEW explode(experiments) t AS exp
WHERE exp.experiment_id = '10554104820'
  AND timestamp BETWEEN '2020-06-01 01:00:00.000'
    AND '2020-06-01 23:00:00.000'
  AND entity_id = '10597072554'
  AND event_name = 'feature-click';

-- RAW
-- Use the same experiment ID as in the E3 query so the counts are comparable.
SELECT COUNT(*) AS event_count
FROM raw_export
WHERE experiment_id = '10554104820'
  AND timestamp BETWEEN 1590973200000 AND 1591052400000
  AND event_name = 'feature-click';

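Note: You can find non-attributed events in the Enriched Events dataset by adding the clause WHERE cardinality(experiments) = 0 to your query. Apply this filter without the LATERAL VIEW explode(experiments) clause, because explode drops rows whose experiments array is empty.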
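Once you have a count from each side of a paired query, the tolerance check described under Data Comparisons can be sketched as below. Measuring the difference relative to the raw count, and defaulting to the upper 5% threshold, are assumptions; adjust both to your own validation policy.

```python
def within_tolerance(e3_count, raw_count, tolerance=0.05):
    """Return True if the E3 count is within `tolerance` of the raw count."""
    if raw_count == 0:
        # No baseline to compare against: only an exact match passes.
        return e3_count == 0
    relative_difference = abs(e3_count - raw_count) / raw_count
    return relative_difference <= tolerance


print(within_tolerance(1020, 1000))  # True: 2% difference
print(within_tolerance(1100, 1000))  # False: 10% difference
```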