Enriched Events Export
Note
Check if this feature is available for your plan by reviewing Optimizely Plans.
Enriched Events Export gives you secure access to your Optimizely event data so you can analyze your experiment results with greater flexibility. The export includes a useful combination of event attributes:
- Raw metadata (event name, user IDs, etc.) that you pass to Optimizely, without additional processing
- Enriched metadata that Optimizely adds such as experiment ID, variation ID, and session ID
For more information, see Data Specification.
Use Cases
Optimizely's Results page helps you understand the performance of your hypothesis with your experiment metrics. So why would you want to export event data to use outside of the Results page? Here are several use cases:
QA or troubleshoot Optimizely’s results:
- Verify the accuracy of event counts or troubleshoot discrepancies
- Audit the events (including impressions) you sent to Optimizely
Calculate your own metrics from Optimizely's experiment data:
- Analyze experiment data via SQL or other data querying methods
- Create experiment data dashboards using third-party visualization tools
- Compute advanced metrics such as funnel metrics (this event THEN that event) or compound metrics (this event AND that event) to extend your measurement strategies.
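As a rough illustration of the funnel idea (this event THEN that event), the sketch below counts visitors who fired one event and later fired another. It assumes conversion rows have already been loaded into Python dicts carrying the `visitor_id`, `event_name`, and `timestamp` fields from the Conversions schema; the function name `funnel_visitors` and the sample event names are hypothetical.

```python
from typing import Iterable, Mapping, Set


def funnel_visitors(rows: Iterable[Mapping], first_event: str, then_event: str) -> Set[str]:
    """Return visitors who fired `first_event` and later fired `then_event`."""
    earliest_first = {}  # visitor_id -> earliest timestamp of first_event
    latest_then = {}     # visitor_id -> latest timestamp of then_event
    for row in rows:
        vid, name, ts = row["visitor_id"], row["event_name"], row["timestamp"]
        if name == first_event:
            earliest_first[vid] = min(ts, earliest_first.get(vid, ts))
        elif name == then_event:
            latest_then[vid] = max(ts, latest_then.get(vid, ts))
    # A visitor completes the funnel if some then_event follows some first_event.
    return {
        vid for vid, ts in earliest_first.items()
        if vid in latest_then and latest_then[vid] > ts
    }


# Hypothetical sample rows (timestamps are epoch milliseconds).
sample_rows = [
    {"visitor_id": "v1", "event_name": "add_to_cart", "timestamp": 1000},
    {"visitor_id": "v1", "event_name": "purchase", "timestamp": 2000},
    {"visitor_id": "v2", "event_name": "purchase", "timestamp": 1500},
    {"visitor_id": "v3", "event_name": "purchase", "timestamp": 500},
    {"visitor_id": "v3", "event_name": "add_to_cart", "timestamp": 900},
]
completed = funnel_visitors(sample_rows, "add_to_cart", "purchase")
print(sorted(completed))
```

Only `v1` completes the funnel here: `v2` never adds to cart, and `v3` purchases before adding to cart.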
Combine Optimizely experiment data with outside data:
- Centralize data management by copying source-of-truth experiment data to a third-party data warehouse
- Join experiment data with other data sources to measure impact on a wider range of external metrics
Enable the development of applications on top of experiment data (e.g., experiment notifications)
Service Overview
We continuously export event data received by your organization to Amazon S3. You can access the event data within one day of receipt, via Amazon’s API/CLI/SDK using credentials provisioned by Optimizely.
Before exporting the events, we deduplicate them and enrich them with information useful for experiment analysis, such as which variation in an experiment the event is attributed to (attribution) and which user session the event occurred in (sessionization).
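To make the deduplication idea concrete, here is a minimal sketch that keeps one row per `uuid`, the field the schema describes as used to de-duplicate events. It only illustrates the concept and is not Optimizely's actual pipeline; keeping the first occurrence and the helper name `dedupe_by_uuid` are assumptions.

```python
from typing import Dict, Iterable, List


def dedupe_by_uuid(rows: Iterable[Dict]) -> List[Dict]:
    """Keep the first row seen for each event uuid, preserving input order."""
    seen = set()
    unique = []
    for row in rows:
        if row["uuid"] in seen:
            continue  # repeated delivery of an event we already kept
        seen.add(row["uuid"])
        unique.append(row)
    return unique


# Hypothetical rows: the second one repeats the first event's uuid.
raw_events = [
    {"uuid": "a-1", "visitor_id": "v1"},
    {"uuid": "a-1", "visitor_id": "v1"},
    {"uuid": "b-2", "visitor_id": "v2"},
]
deduped = dedupe_by_uuid(raw_events)
print(len(deduped))
```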
Get Started
Acquire Credentials
Follow Access Optimizely export data via Amazon S3 to request that Optimizely Support grant you credentials to access your data.
Note
If you already have credentials for the older data export services, you can skip this step. You can reuse the same credentials to access your Enriched Events Export data.
Validate your Credentials
Verify you can access export data using the credentials you got from Optimizely Support:
- Install AWS command line tools.
- Run aws configure to input your access key and secret access key.
- Copy your Optimizely account ID (found under account settings).
- List the files in your S3 directory, using your account ID:
- aws s3 ls s3://optimizely-events-data/v1/account_id=<account_id>/
Note: The final forward-slash is necessary because your credentials will only provide access to one folder inside the Optimizely Amazon S3 bucket.
- To inspect and access the data itself, you'll need third-party tools. For more information, see Access and inspect data.
You should see a list of all your projects that have running or completed experiments.
Troubleshooting
If you see the following message:
An error occurred (AccessDenied) when calling the ListObjects operation: Access Denied
- Try running aws configure again and double-check that you copied and pasted your credentials correctly.
- Confirm that you’re including the final forward-slash after your account ID.
Data Specification
Event Types
Enriched Events Export gives you access to:
- Decision events (also known as impressions): special events that are fired when Optimizely “decides” that a visitor is bucketed into a certain experiment/variation pair.
- Conversion events (known simply as events): all other events that are fired when a visitor “converts” to a desirable action, such as a click, page view or purchase.
Your export contains the same decision and conversion event data that Optimizely uses to display your experiment results, or metrics, on the Results page. This means you can easily correlate your analysis with our Results page. For more information, see How Optimizely counts conversions.
Event Content
Your event data contains:
Raw data as sent by the Optimizely client or Event API, including (but not limited to):
- event timestamp
- visitor ID
- event (entity) ID and event name
- user attributes
- event tags
- misc visitor and event metadata (user-agent etc.)
Enriched data generated by Optimizely’s server:
- (server) process timestamp
- session ID
- experiment ID
- variation ID
See Schema for a full reference of the event data.
AWS S3 Partitioning
Your event data is exported in the following S3 partitions:
Decisions:
s3://optimizely-events-data/v1/account_id=<account_id>/type=decisions/date={YYYY-MM-DD}/experiment=<experiment_id>
Conversions:
s3://optimizely-events-data/v1/account_id=<account_id>/type=events/date={YYYY-MM-DD}/event=<event_name>
Legend
- optimizely-events-data: S3 bucket name
- account_id: Your unique account identifier
- date: The creation date of the data
- experiment_id: Unique experiment identifier (used for the decisions partition)
- event_name: Event (entity) identifier (used for the events partition)
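The partition layout above can be assembled programmatically. The following sketch builds both kinds of prefix from their parts; the helper names and the sample account, experiment, and event values are hypothetical, while the bucket name and path structure come from the layout above.

```python
from datetime import date

BUCKET = "optimizely-events-data"


def decisions_prefix(account_id: str, day: date, experiment_id: str) -> str:
    """Build the S3 prefix for one experiment's decisions on one day."""
    return (
        f"s3://{BUCKET}/v1/account_id={account_id}/type=decisions/"
        f"date={day.isoformat()}/experiment={experiment_id}"
    )


def events_prefix(account_id: str, day: date, event_name: str) -> str:
    """Build the S3 prefix for one conversion event's rows on one day."""
    return (
        f"s3://{BUCKET}/v1/account_id={account_id}/type=events/"
        f"date={day.isoformat()}/event={event_name}"
    )


# Hypothetical identifiers, used only to show the resulting shape.
decisions_path = decisions_prefix("12345", date(2020, 1, 15), "17287133690")
events_path = events_prefix("12345", date(2020, 1, 15), "purchase")
print(decisions_path)
print(events_path)
```

`date.isoformat()` yields the `YYYY-MM-DD` form that the `date=` partition expects.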
Access and inspect data
The data is stored in Parquet-formatted file parts. The number of file parts varies with the experiment record volume. parquet-tools is a handy tool for inspecting the file content and schema.
The data is encrypted using AWS-KMS. To access the data, you need one of the following clients:
- Amazon S3 Console
- AWS CLI version 1.11.108 or later
- AWS SDK released after May 2016
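If you use the AWS CLI, you can check the minimum version listed above in a script. This is a sketch that parses text of the form printed by `aws --version` (which begins with `aws-cli/<version>`); the function names and the sample strings are illustrative assumptions.

```python
import re
from typing import Tuple

MIN_CLI_VERSION = (1, 11, 108)  # minimum AWS CLI version listed above


def parse_cli_version(version_output: str) -> Tuple[int, ...]:
    """Extract (major, minor, patch) from text such as 'aws-cli/1.16.3 Python/3.7.4'."""
    match = re.search(r"aws-cli/(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_output!r}")
    return tuple(int(part) for part in match.groups())


def cli_is_supported(version_output: str) -> bool:
    """Compare numerically, so 1.11.90 sorts below 1.11.108."""
    return parse_cli_version(version_output) >= MIN_CLI_VERSION


# Illustrative version strings, not captured from a real machine.
old_cli = cli_is_supported("aws-cli/1.11.90 Python/2.7.12")
new_cli = cli_is_supported("aws-cli/2.15.0 Python/3.11.6")
print(old_cli, new_cli)
```

Comparing tuples of integers avoids the pitfall of string comparison, where "1.11.90" would sort after "1.11.108".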
Retention Policy
Optimizely retains the files in your Enriched Events Export buckets for 1 year. Optimizely automatically deletes older data. We've implemented a GDPR workflow that complies with data deletion requirements laid out here.
Status File
The daily partition files are ready for import when the _SUCCESS file is available at:
Decisions:
s3://optimizely-events-data/v1/account_id={account_id}/type=decisions/date={YYYY-MM-DD}/_SUCCESS
Events:
s3://optimizely-events-data/v1/account_id={account_id}/type=events/date={YYYY-MM-DD}/_SUCCESS
We recommend that you check for the presence of this file at regular intervals. If it is present, it is safe to start importing the daily partition data.
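One way to structure that periodic check is sketched below. The existence test is passed in as a function so the polling logic stays independent of any particular S3 client; `success_marker`, `wait_until_ready`, and the stand-in `fake_exists` are hypothetical names, and the retry count and delay are arbitrary.

```python
import time
from datetime import date
from typing import Callable


def success_marker(account_id: str, event_type: str, day: date) -> str:
    """Build the _SUCCESS URI for one daily partition ('decisions' or 'events')."""
    if event_type not in ("decisions", "events"):
        raise ValueError("event_type must be 'decisions' or 'events'")
    return (
        f"s3://optimizely-events-data/v1/account_id={account_id}/"
        f"type={event_type}/date={day.isoformat()}/_SUCCESS"
    )


def wait_until_ready(uri: str, exists: Callable[[str], bool],
                     attempts: int = 5, delay_seconds: float = 0.0) -> bool:
    """Poll `exists(uri)` up to `attempts` times; return True once the marker appears."""
    for attempt in range(attempts):
        if exists(uri):
            return True
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False


# Stand-in for a real S3 lookup: reports the marker on the third check.
calls = {"count": 0}


def fake_exists(uri: str) -> bool:
    calls["count"] += 1
    return calls["count"] >= 3


marker = success_marker("12345", "decisions", date(2020, 1, 15))
ready = wait_until_ready(marker, fake_exists)
print(marker, ready)
```

In practice you would replace `fake_exists` with a call to your S3 client and choose a delay measured in minutes rather than zero.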
Schema
Decisions
Each row represents a single decision event.
field | type | description |
---|---|---|
uuid | string (guid) | Event UUID generated by the client. Used to de-duplicate events |
timestamp | long (time millis) | Event timestamp in milliseconds |
process_timestamp | long (time millis) | Timestamp in milliseconds set by the server. Indicates when the event was processed by the server. |
visitor_id | string | User/visitor identifier set by the client |
session_id | string | Unique session id generated by the server |
account_id | string | Account identifier |
campaign_id | string | Unique campaign id |
experiment_id | string | Unique experiment id |
variation_id | string | Unique variation id |
attributes | array<id:string, name:string, type:string, value:string> | Array of user attributes (also known as segments) |
user_ip | string | User IP address |
user_agent | string | User agent |
referer | string | Client referer (page from which event was sent) |
is_holdback | boolean | Boolean indicating if the visitor is bucketed to the holdback variation |
revision | string | Client snippet revision |
client_engine | string | Client engine string (e.g., ‘js’, ‘node-sdk’) |
client_version | string | Client version |
Conversions
Each row represents a single conversion event.
field | type | description |
---|---|---|
uuid | string (guid) | Event UUID generated by the client. Used to de-duplicate events |
timestamp | long (time millis) | Event timestamp in milliseconds |
process_timestamp | long (time millis) | Timestamp in milliseconds set by the server. Indicates when the event was processed by the server |
visitor_id | string | User/visitor identifier set by the client |
session_id | string | Unique session id generated by the server |
account_id | string | Unique account identifier |
experiments | array<campaign_id:string, experiment_id:string, variation_id:string, is_holdback:boolean> | Array of the campaign(s), experiment(s) and variation(s) the event is attributed to |
entity_id | string | Event entity identifier |
attributes | array<id:string, name:string, type:string, value:string> | Array of user attributes (also known as segments) |
user_ip | string | User IP address |
user_agent | string | User agent |
referer | string | Client referer (page from which event was sent) |
event_type | string | Event type (click, pageview or custom) |
event_name | string | Friendly event name (from client) |
revenue | long | Revenue (in cents) |
value | double | Value used to compute value/numeric metrics. |
quantity | long | Quantity metric value |
tags | map<key:string,value:string> | Key/value pair of event tags |
revision | string | Client snippet revision |
client_engine | string | Client engine string (e.g., ‘node-sdk’) |
client_version | string | Client version |
SQL Examples
Here are some frequently-used queries to compute simple experiment measures using Enriched Events Export.
Unique Visitors
Count of Unique Visitors from Decision Events (Full-Stack Methodology)
-- Count of Unique Visitors (Full-Stack Methodology)
SELECT COUNT(distinct visitor_id)
FROM decisions
WHERE experiment_id = '17287133690'
AND timestamp between 1579039200000 AND 1579644000000
AND is_holdback = false
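The bounds in the query above, 1579039200000 and 1579644000000, are epoch milliseconds; read as UTC they correspond to 2020-01-14 22:00 and 2020-01-21 22:00. A small helper such as the following sketch (the name `to_epoch_millis` is hypothetical) can produce bounds for your own date range.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment: datetime) -> int:
    """Convert a timezone-aware datetime to whole epoch milliseconds."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid local-time surprises")
    # Integer division by a 1 ms timedelta avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


start_ms = to_epoch_millis(datetime(2020, 1, 14, 22, 0, tzinfo=timezone.utc))
end_ms = to_epoch_millis(datetime(2020, 1, 21, 22, 0, tzinfo=timezone.utc))
print(start_ms, end_ms)
```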
Count of Unique Visitors from all Events (Web A/B Methodology)
-- Count of Unique Visitors (Web A/B Methodology)
SELECT COUNT(DISTINCT visitor_id)
FROM
(
    SELECT visitor_id
    FROM events
    CROSS JOIN UNNEST(experiments) AS t(exp)
    WHERE exp.experiment_id = '17287133690'
    AND timestamp BETWEEN 1579039200000 AND 1579644000000
    UNION
    SELECT visitor_id
    FROM decisions
    WHERE experiment_id = '17287133690'
    AND timestamp BETWEEN 1579039200000 AND 1579644000000
    AND is_holdback = false
)
Unique Conversions
--Unique Conversions on event_id
SELECT COUNT(distinct visitor_id)
FROM events
CROSS JOIN UNNEST(experiments) as t(exp)
WHERE exp.experiment_id='10757480886'
AND timestamp between 1579039200000 and 1579644000000
AND entity_id = '11795515443'
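If you process exported rows in code rather than SQL, the `CROSS JOIN UNNEST(experiments)` filter amounts to checking each entry of the row's `experiments` array. The sketch below mirrors the query above under the assumption that rows are Python dicts shaped like the Conversions schema; `unique_converters` and the sample rows are hypothetical.

```python
from typing import Iterable, Mapping, Set


def unique_converters(rows: Iterable[Mapping], experiment_id: str, entity_id: str,
                      start_ms: int, end_ms: int) -> Set[str]:
    """Visitors with a matching event attributed to the experiment in the time window."""
    visitors = set()
    for row in rows:
        if row["entity_id"] != entity_id:
            continue
        if not (start_ms <= row["timestamp"] <= end_ms):  # BETWEEN is inclusive
            continue
        # Equivalent of the UNNEST: inspect every attribution on the event.
        if any(exp["experiment_id"] == experiment_id for exp in row["experiments"]):
            visitors.add(row["visitor_id"])
    return visitors


# Hypothetical rows: v1 converts twice, v2 is attributed to a different experiment.
conversion_rows = [
    {"visitor_id": "v1", "entity_id": "11795515443", "timestamp": 1579040000000,
     "experiments": [{"experiment_id": "10757480886"}]},
    {"visitor_id": "v1", "entity_id": "11795515443", "timestamp": 1579050000000,
     "experiments": [{"experiment_id": "10757480886"}]},
    {"visitor_id": "v2", "entity_id": "11795515443", "timestamp": 1579040000000,
     "experiments": [{"experiment_id": "999"}]},
]
converters = unique_converters(conversion_rows, "10757480886", "11795515443",
                               1579039200000, 1579644000000)
print(len(converters))
```

Collecting visitor IDs in a set plays the role of `COUNT(distinct visitor_id)`: repeat conversions by the same visitor count once.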
Total Conversions
--Total Conversions on event_id
SELECT COUNT(*)
FROM events
CROSS JOIN UNNEST(experiments) as t(exp)
WHERE
exp.experiment_id='17287133690'
AND timestamp between 1579039200000 and 1579644000000
AND entity_id = '11099710611'
Total Revenue
--Total Revenue for event_id
SELECT
SUM(revenue)
FROM events
CROSS JOIN UNNEST(experiments) as t(exp)
WHERE
exp.experiment_id='17287133690'
AND timestamp between 1579039200000 and 1579644000000
AND entity_id = '17281432226'
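Because the schema stores `revenue` in cents, the `SUM(revenue)` above is also in cents. A minimal sketch of converting such a total into whole currency units follows; treating missing revenue as `None` and the helper names are assumptions.

```python
from decimal import Decimal
from typing import Iterable, Optional


def total_revenue_cents(values: Iterable[Optional[int]]) -> int:
    """Sum revenue values in cents, skipping rows with no revenue (like SQL SUM skips NULL)."""
    return sum(value for value in values if value is not None)


def cents_to_units(total_cents: int) -> Decimal:
    """Convert integer cents to currency units with two decimal places."""
    return (Decimal(total_cents) / Decimal(100)).quantize(Decimal("0.01"))


# Hypothetical per-event revenue values in cents.
revenue_values = [1999, None, 500, 2501]
total_cents = total_revenue_cents(revenue_values)
total_units = cents_to_units(total_cents)
print(total_cents, total_units)
```

Using `Decimal` rather than floating-point division keeps the converted amount exact.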
Total Value
--Total Value for event_id
SELECT
SUM(value)
FROM events
CROSS JOIN UNNEST(experiments) as t(exp)
WHERE
exp.experiment_id='17287133690'
AND timestamp between 1579039200000 and 1579644000000
AND entity_id = '17281432226'