
Enriched Events Export

Note

Check if this feature is available for your plan by reviewing Optimizely Plans.

Enriched Events Export gives you secure access to your Optimizely event data so you can analyze your experiment results with greater flexibility. The export combines two kinds of event attributes:

  • Raw metadata (event name, user IDs, etc.) that you pass to Optimizely, without additional processing
  • Enriched metadata that Optimizely adds such as experiment ID, variation ID, and session ID

For more information, see Data Specification.

Use Cases

Optimizely's Results page helps you understand how your hypothesis performed against your experiment metrics. So why would you want to export event data for use outside of the Results page? Here are several use cases:

QA or troubleshoot Optimizely’s results:

  • Verify the accuracy of event counts or troubleshoot discrepancies
  • Audit the events (including impressions) you sent to Optimizely

Calculate your own metrics from Optimizely's experiment data:

  • Analyze experiment data via SQL or other data querying methods
  • Create experiment data dashboards using third-party visualization tools
  • Compute advanced metrics such as funnel metrics (this event THEN that event) or compound metrics (this event AND that event) to extend your measurement strategies.
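
As a rough illustration of the funnel and compound metrics mentioned above, here is a minimal sketch that works on in-memory rows. The field names visitor_id, event_name, and timestamp follow the conversions schema; the sample event names and the helper functions are hypothetical and not part of any Optimizely API.

```python
from collections import defaultdict


def events_by_visitor(rows):
    """Group (timestamp, event_name) pairs by visitor_id."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["visitor_id"]].append((row["timestamp"], row["event_name"]))
    return grouped


def funnel_visitors(rows, first, then):
    """Visitors who fired `first` and later fired `then` (this event THEN that event)."""
    matched = set()
    for visitor, events in events_by_visitor(rows).items():
        first_times = [ts for ts, name in events if name == first]
        then_times = [ts for ts, name in events if name == then]
        # Some `first` precedes some `then` iff the earliest `first`
        # comes before the latest `then`.
        if first_times and then_times and min(first_times) < max(then_times):
            matched.add(visitor)
    return matched


def compound_visitors(rows, a, b):
    """Visitors who fired both events in any order (this event AND that event)."""
    matched = set()
    for visitor, events in events_by_visitor(rows).items():
        names = {name for _, name in events}
        if a in names and b in names:
            matched.add(visitor)
    return matched


# Hypothetical sample rows, for illustration only.
rows = [
    {"visitor_id": "v1", "event_name": "add_to_cart", "timestamp": 1000},
    {"visitor_id": "v1", "event_name": "purchase", "timestamp": 2000},
    {"visitor_id": "v2", "event_name": "purchase", "timestamp": 1000},
    {"visitor_id": "v2", "event_name": "add_to_cart", "timestamp": 3000},
    {"visitor_id": "v3", "event_name": "add_to_cart", "timestamp": 1500},
]
```

With these rows, only v1 completes the funnel (add_to_cart, then purchase), while both v1 and v2 satisfy the compound metric.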

Combine Optimizely experiment data with outside data:

  • Centralize data management by copying source-of-truth experiment data to a third-party data warehouse
  • Join experiment data with other data sources to measure impact on a wider range of external metrics

Enable the development of applications on top of experiment data (e.g., experiment notifications).

Service Overview

We continuously export the event data received for your organization to Amazon S3. You can access the event data within one day of receipt through the AWS API, CLI, or SDKs, using credentials provisioned by Optimizely.

Before exporting the events, we de-duplicate them and enrich them with information useful for experiment analysis, such as which experiment variation the event is attributed to (attribution) and which user session the event occurred in (sessionization).
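
To make the de-duplication step concrete, here is a minimal sketch that keeps the first event seen for each uuid (the schema describes uuid as the field used to de-duplicate events). This only illustrates the idea. It is not Optimizely's actual pipeline, and the deduplicate helper is hypothetical.

```python
def deduplicate(events):
    """Keep the first occurrence of each uuid, preserving input order."""
    seen = set()
    unique = []
    for event in events:
        if event["uuid"] in seen:
            continue  # a later copy of an event we already kept
        seen.add(event["uuid"])
        unique.append(event)
    return unique


# Hypothetical sample events: the second row is a resend of the first.
sample = [
    {"uuid": "a-1", "visitor_id": "v1"},
    {"uuid": "a-1", "visitor_id": "v1"},
    {"uuid": "b-2", "visitor_id": "v2"},
]
unique_events = deduplicate(sample)
```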

Get Started

Acquire Credentials

Follow the instructions in "Access Optimizely export data via Amazon S3" to request that Optimizely Support grant you credentials to access your data.

Note

If you already have credentials for the older data export services, you can skip this step. You can reuse the same credentials to access your Enriched Events Export data.

Validate your Credentials

Verify that you can access your export data using the credentials you received from Optimizely Support:

  1. Install the AWS command line tools.
  2. Run aws configure and enter your access key and secret access key.
  3. Copy your Optimizely account ID (found under account settings).
  4. List the files in your S3 directory, using your account ID:
  • aws s3 ls s3://optimizely-events-data/v1/account_id=<account_id>/
    Note: The final forward slash is necessary because your credentials provide access to only one folder inside the Optimizely Amazon S3 bucket.
    You should see a list of all your projects that have running or completed experiments.
  5. To inspect and access the data itself, you need third-party tools. For more information, see Access and inspect data.
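
If you script the listing step, it helps to build the folder URI in one place so the trailing slash is never dropped. The sketch below is one possible way to do that. The helper names and the sample account ID 12345 are hypothetical, and the validation rules are assumptions rather than documented constraints.

```python
BUCKET = "optimizely-events-data"  # bucket name used throughout this guide


def account_prefix(account_id):
    """Return the S3 URI of an account folder, always ending in a slash."""
    account_id = str(account_id).strip()
    if not account_id or "/" in account_id:
        # Assumption: an account ID is a single non-empty path segment.
        raise ValueError("account_id must be a non-empty value without slashes")
    return f"s3://{BUCKET}/v1/account_id={account_id}/"


def ls_command(account_id):
    """Build the argument list for `aws s3 ls` on the account folder."""
    return ["aws", "s3", "ls", account_prefix(account_id)]


# Hypothetical account ID, for illustration only.
command = ls_command(12345)
```

The resulting list can be passed to subprocess.run(command, check=True) on a machine where the AWS CLI is installed and configured.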

Troubleshooting

If you see the following message:
An error occurred (AccessDenied) when calling the ListObjects operation: Access Denied

  • Try running aws configure again and double-check that you copied and pasted your credentials correctly.
  • Confirm that you’re including the final forward-slash after your account ID.

Data Specification

Event Types

Enriched Events Export gives you access to:

  • Decision events (also known as impressions): special events that are fired when Optimizely “decides” that a visitor is bucketed into a certain experiment/variation pair.
  • Conversion events (known simply as events): all other events that are fired when a visitor “converts” to a desirable action, such as a click, page view or purchase.

Your export contains the same decision and conversion event data that Optimizely uses to display your experiment results, or metrics, on the Results page. This means you can easily correlate your analysis with our Results page. For more information, see How Optimizely counts conversions.

Event Content

Your event data contains:

Raw data as sent by the Optimizely client or Event API, including (but not limited to):

  • event timestamp
  • visitor ID
  • event (entity) ID and event name
  • user attributes
  • event tags
  • miscellaneous visitor and event metadata (user agent, etc.)

Enriched data generated by Optimizely’s server:

  • (server) process timestamp
  • session ID
  • experiment ID
  • variation ID

See Schema for a full reference of the event data.

AWS S3 Partitioning

Your event data is exported in the following S3 partitions:

Decisions:

s3://optimizely-events-data/v1/account_id=<account_id>/type=decisions/date=<YYYY-MM-DD>/experiment=<experiment_id>

Conversions:

s3://optimizely-events-data/v1/account_id=<account_id>/type=events/date=<YYYY-MM-DD>/event=<event_name>

Legend

  • optimizely-events-data: S3 bucket name
  • account_id: Your unique account identifier
  • date: The creation date of the data
  • experiment_id: Unique experiment identifier (used for the decisions partition)
  • event_name: Event (entity) identifier (used for the events partition)
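
The partition layout above can be expressed as small path builders. This is a minimal sketch: the function names are hypothetical, and the sample account ID and event name are made up for illustration.

```python
from datetime import date

BASE = "s3://optimizely-events-data/v1"


def decisions_partition(account_id, day, experiment_id):
    """URI of the decisions partition for one day and one experiment."""
    return (f"{BASE}/account_id={account_id}/type=decisions/"
            f"date={day.isoformat()}/experiment={experiment_id}")


def events_partition(account_id, day, event_name):
    """URI of the conversions (events) partition for one day and one event."""
    return (f"{BASE}/account_id={account_id}/type=events/"
            f"date={day.isoformat()}/event={event_name}")


def parse_partition(uri):
    """Split the key=value path segments of a partition URI into a dict."""
    parts = {}
    for segment in uri.split("/"):
        key, sep, value = segment.partition("=")
        if sep:  # only segments that contain '=' are partition keys
            parts[key] = value
    return parts


# Hypothetical identifiers, for illustration only.
uri = decisions_partition("12345", date(2020, 1, 14), "17287133690")
```

parse_partition is handy when you iterate over listed keys and need to recover the date or experiment from the path.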

Access and inspect data

The data is stored in Parquet-formatted file parts. The number of file parts varies with the experiment's record volume. parquet-tools is a handy utility for inspecting file content and schema.

The data is encrypted using AWS KMS. To access the data, you need one of the following clients:

  • Amazon S3 Console
  • AWS CLI version 1.11.108 or later
  • AWS SDK released after May 2016
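
To check the AWS CLI requirement programmatically, you can compare the version reported by aws --version against the minimum listed above. The sketch below assumes the output begins with a token of the form aws-cli/X.Y.Z. The helper names are hypothetical.

```python
import re

MIN_CLI_VERSION = (1, 11, 108)  # minimum AWS CLI version stated in this guide


def parse_cli_version(version_output):
    """Extract (major, minor, patch) from the output of `aws --version`."""
    match = re.search(r"aws-cli/(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_output!r}")
    return tuple(int(part) for part in match.groups())


def cli_is_supported(version_output):
    """True if the reported CLI version is at least MIN_CLI_VERSION."""
    # Compare as integer tuples so that 1.11.9 sorts below 1.11.108.
    return parse_cli_version(version_output) >= MIN_CLI_VERSION
```

Comparing integer tuples rather than strings matters here, because a plain string comparison would rank 1.11.9 above 1.11.108.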

Retention Policy

Optimizely retains the files in your Enriched Events Export buckets for one year and automatically deletes older data. We've implemented a GDPR workflow that complies with GDPR data deletion requirements.

Status File

The daily partition files are ready for import when the _SUCCESS file is available at:

Decisions:

s3://optimizely-events-data/v1/account_id=<account_id>/type=decisions/date=<YYYY-MM-DD>/_SUCCESS

Events:

s3://optimizely-events-data/v1/account_id=<account_id>/type=events/date=<YYYY-MM-DD>/_SUCCESS

We recommend that you check for the presence of this file at regular intervals. If it is present, it is safe to start importing the daily partition data.
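
The periodic check can be sketched as a small polling loop. The exists argument is a caller-supplied function (hypothetical here) that reports whether an S3 object is present, for example a thin wrapper around your AWS SDK of choice. The interval and retry defaults are arbitrary assumptions, not recommendations from Optimizely.

```python
import time

BASE = "s3://optimizely-events-data/v1"


def success_marker(account_id, event_type, day):
    """URI of the _SUCCESS file for one daily partition.

    event_type is "decisions" or "events"; day is a YYYY-MM-DD string.
    """
    if event_type not in ("decisions", "events"):
        raise ValueError("event_type must be 'decisions' or 'events'")
    return f"{BASE}/account_id={account_id}/type={event_type}/date={day}/_SUCCESS"


def wait_for_success(exists, uri, interval_seconds=600, max_checks=12,
                     sleep=time.sleep):
    """Poll until exists(uri) is true; return False if max_checks run out."""
    for attempt in range(max_checks):
        if exists(uri):
            return True
        if attempt < max_checks - 1:
            sleep(interval_seconds)  # wait before the next check
    return False
```

Passing exists and sleep as parameters keeps the loop independent of any particular SDK and makes it easy to exercise without network access.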

Schema

Decisions

Each row represents a single decision event.

| field | type | description |
| --- | --- | --- |
| uuid | string (guid) | Event UUID generated by the client. Used to de-duplicate events |
| timestamp | long (time millis) | Event timestamp in milliseconds |
| process_timestamp | long (time millis) | Timestamp in milliseconds set by the server. Indicates when the event was processed by the server |
| visitor_id | string | User/visitor identifier set by the client |
| session_id | string | Unique session id generated by the server |
| account_id | string | Account identifier |
| campaign_id | string | Unique campaign id |
| experiment_id | string | Unique experiment id |
| variation_id | string | Unique variation id |
| attributes | array<id:string, name:string, type:string, value:string> | Array of user attributes (also known as segments) |
| user_ip | string | User IP address |
| user_agent | string | User agent |
| referer | string | Client referer (page from which event was sent) |
| is_holdback | boolean | Boolean indicating if the visitor is bucketed to the holdback variation |
| revision | string | Client snippet revision |
| client_engine | string | Client engine string (e.g., ‘js’, ‘node-sdk’) |
| client_version | string | Client version |

Conversions

Each row represents a single conversion event.

| field | type | description |
| --- | --- | --- |
| uuid | string (guid) | Event UUID generated by the client. Used to de-duplicate events |
| timestamp | long (time millis) | Event timestamp in milliseconds |
| process_timestamp | long (time millis) | Timestamp in milliseconds set by the server. Indicates when the event was processed by the server |
| visitor_id | string | User/visitor identifier set by the client |
| session_id | string | Unique session id generated by the server |
| account_id | string | Unique account identifier |
| experiments | array<campaign_id:string, experiment_id:string, variation_id:string, is_holdback:boolean> | Array of the campaign(s), experiment(s) and variation(s) the event is attributed to |
| entity_id | string | Event entity identifier |
| attributes | array<id:string, name:string, type:string, value:string> | Array of user attributes (also known as segments) |
| user_ip | string | User IP address |
| user_agent | string | User agent |
| referer | string | Client referer (page from which event was sent) |
| event_type | string | Event type (click, pageview or custom) |
| event_name | string | Friendly event name (from client) |
| revenue | long | Revenue (in cents) |
| value | double | Value used to compute value/numeric metrics |
| quantity | long | Quantity metric value |
| tags | map<key:string,value:string> | Key/value pair of event tags |
| revision | string | Client snippet revision |
| client_engine | string | Client engine string (e.g., ‘node-sdk’) |
| client_version | string | Client version |

SQL Examples

Here are some frequently-used queries to compute simple experiment measures using Enriched Events Export.

Unique Visitors

Count of Unique Visitors from Decision Events (Full-Stack Methodology)

-- Count of Unique Visitors (Full-Stack Methodology)
SELECT COUNT(DISTINCT visitor_id)
FROM decisions
WHERE experiment_id = '17287133690'
  AND timestamp BETWEEN 1579039200000 AND 1579644000000
  AND is_holdback = false
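
The timestamp bounds in these queries are epoch milliseconds. The following sketch shows one way to derive them from calendar dates. The helper names are hypothetical, and the choice of UTC is an assumption about how you want to define your reporting window.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_millis(moment):
    """Convert a timezone-aware datetime to epoch milliseconds."""
    if moment.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime to avoid local-time ambiguity")
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


def from_millis(millis):
    """Convert epoch milliseconds to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=millis)


# The bounds used in the example queries correspond to a seven-day window.
start = to_millis(datetime(2020, 1, 14, 22, 0, tzinfo=timezone.utc))  # 1579039200000
end = to_millis(datetime(2020, 1, 21, 22, 0, tzinfo=timezone.utc))    # 1579644000000
```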

Count of Unique Visitors from all Events (Web A/B Methodology)

-- Count of Unique Visitors (Web A/B Methodology)
SELECT COUNT(DISTINCT visitor_id)
FROM
(
     SELECT visitor_id
     FROM events
     CROSS JOIN UNNEST(experiments) AS t(exp)
     WHERE exp.experiment_id = '17287133690'
     AND timestamp BETWEEN 1579039200000 AND 1579644000000
     UNION
     SELECT visitor_id
     FROM decisions
     WHERE experiment_id = '17287133690'
     AND timestamp BETWEEN 1579039200000 AND 1579644000000
     AND is_holdback = false
)

Unique Conversions

-- Unique conversions for one event (filtered by entity_id)

SELECT COUNT(DISTINCT visitor_id)
FROM events
CROSS JOIN UNNEST(experiments) AS t(exp)
WHERE exp.experiment_id = '10757480886'
   AND timestamp BETWEEN 1579039200000 AND 1579644000000
   AND entity_id = '11795515443'
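
For readers less familiar with CROSS JOIN UNNEST, the same logic can be sketched over in-memory rows: an event counts when any element of its experiments array matches the experiment. The function and the sample rows below are hypothetical and only mirror the query above.

```python
def unique_conversions(events, experiment_id, entity_id, start_ms, end_ms):
    """Count distinct visitors with a matching event attributed to the experiment."""
    visitors = set()
    for event in events:
        if event["entity_id"] != entity_id:
            continue
        if not (start_ms <= event["timestamp"] <= end_ms):  # BETWEEN is inclusive
            continue
        # Equivalent of CROSS JOIN UNNEST(experiments): inspect each array element.
        if any(exp["experiment_id"] == experiment_id for exp in event["experiments"]):
            visitors.add(event["visitor_id"])
    return len(visitors)


# Hypothetical sample rows, for illustration only.
sample_events = [
    {"visitor_id": "v1", "entity_id": "11795515443", "timestamp": 1579100000000,
     "experiments": [{"experiment_id": "10757480886"}]},
    {"visitor_id": "v1", "entity_id": "11795515443", "timestamp": 1579200000000,
     "experiments": [{"experiment_id": "10757480886"}]},
    {"visitor_id": "v2", "entity_id": "11795515443", "timestamp": 1579100000000,
     "experiments": [{"experiment_id": "999"}]},
    {"visitor_id": "v3", "entity_id": "11795515443", "timestamp": 1500000000000,
     "experiments": [{"experiment_id": "10757480886"}]},
]
count = unique_conversions(sample_events, "10757480886", "11795515443",
                           1579039200000, 1579644000000)
```

Here v1 converts twice but is counted once, v2 belongs to a different experiment, and v3 falls outside the time window.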

Total Conversions

-- Total conversions for one event (filtered by entity_id)

SELECT COUNT(*)
FROM events
CROSS JOIN UNNEST(experiments) AS t(exp)
WHERE exp.experiment_id = '17287133690'
  AND timestamp BETWEEN 1579039200000 AND 1579644000000
  AND entity_id = '11099710611'

Total Revenue

-- Total revenue (in cents) for one event (filtered by entity_id)

SELECT SUM(revenue)
FROM events
CROSS JOIN UNNEST(experiments) AS t(exp)
WHERE exp.experiment_id = '17287133690'
  AND timestamp BETWEEN 1579039200000 AND 1579644000000
  AND entity_id = '17281432226'
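
Because the schema stores revenue in cents, the sum usually needs converting before it is reported. The sketch below uses Decimal to avoid floating-point drift. The helper names and sample rows are hypothetical, and dividing by 100 assumes a currency with two decimal places.

```python
from decimal import Decimal


def total_revenue_cents(events):
    """Sum revenue in cents, skipping rows with no revenue (as SQL SUM skips NULL)."""
    return sum(event["revenue"] for event in events
               if event.get("revenue") is not None)


def cents_to_units(total_cents):
    """Convert an integer amount in cents to currency units with two decimals."""
    return (Decimal(total_cents) / Decimal(100)).quantize(Decimal("0.01"))


# Hypothetical sample rows, for illustration only.
sample_events = [
    {"visitor_id": "v1", "revenue": 1999},
    {"visitor_id": "v2", "revenue": None},
    {"visitor_id": "v3", "revenue": 500},
]
total = cents_to_units(total_revenue_cents(sample_events))
```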

Total Value

-- Total value for one event (filtered by entity_id)

SELECT SUM(value)
FROM events
CROSS JOIN UNNEST(experiments) AS t(exp)
WHERE exp.experiment_id = '17287133690'
  AND timestamp BETWEEN 1579039200000 AND 1579644000000
  AND entity_id = '17281432226'