Data specification
Data specification for Optimizely's Experimentation Events Export.
Experimentation Events Export gives you access to the following data:
- Decision events (also known as impressions) – Special events fired when Optimizely Experimentation "decides" that a visitor is bucketed into a certain experiment and variation pair.
- Decision events in Optimizely Feature Experimentation.
- Decision events in Optimizely Web Experimentation.
- Conversion events (also known as events) – Other events that are fired when a visitor "converts" to a desirable action, such as a click, page view, or purchase.
- Conversion events in Optimizely Feature Experimentation.
- Conversion events in Optimizely Web Experimentation.
Your export contains the same decision and conversion event data that Optimizely uses to display your experiment results, or metrics, on the Experimentation Results page. This means you can correlate your own analysis with the Optimizely Results page. For more information, see How Optimizely Experimentation counts conversions.
Event content
The event data sent by the Optimizely client or Event API includes, but is not limited to, the following:
- Event timestamp (UTC, in UNIX epoch milliseconds).
- Visitor ID.
- Event (entity) ID and event name.
- User attributes.
- Event tags.
- Miscellaneous visitor and event metadata (for example, user-agent).
- Server process timestamp (UTC, in UNIX epoch milliseconds).
Important
As of October 9, 2023, Optimizely no longer automatically generates the session_id field for decision events, or the experiments and session_id fields for conversion events. For more information, refer to the Experimentation Analytics October 2023 release notes.
For the full reference of the event data, see the schema section.
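Both timestamps listed above are UNIX epoch milliseconds in UTC. As a minimal sketch of how they might be converted for analysis, the following Python helper turns such a value into a timezone-aware datetime. The function name and the sample value are hypothetical and not part of the export.

```python
from datetime import datetime, timezone


def epoch_millis_to_utc(ms: int) -> datetime:
    """Convert UNIX epoch milliseconds to a timezone-aware UTC datetime."""
    # Split into whole seconds and leftover milliseconds to avoid float rounding.
    seconds, millis = divmod(ms, 1000)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base.replace(microsecond=millis * 1000)


# Hypothetical sample value, not taken from a real export.
sample_timestamp = 1696809600123
print(epoch_millis_to_utc(sample_timestamp).isoformat())
```

Keeping the result timezone-aware avoids accidentally mixing UTC values with local times later in a pipeline.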
Event data transformation
Optimizely pre-processes the data in the Experimentation Events Export to ensure accuracy, consistency, and usability for analysis. As part of this process, certain transformations may be applied to standardize event attributes, remove inconsistencies, and prevent data issues. The following transformations may be applied to the data:
- Remove duplicate events – Deduplicates events based on uuid.
- Populate missing event names – Sets event_name to entity_id if event_name is blank.
- Filter out incomplete conversions – Removes conversions unless at least one of the following conditions is met: entity_id is populated, revenue is populated, value is populated, or event_type is client_activation.
- Replace oversized values – Substitutes event_name or experiment_id if the original value exceeds size limits.
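To make the first three rules concrete, here is a rough Python sketch that applies them to conversion events held as dictionaries. It illustrates the rules as described, not Optimizely's actual pipeline. Which duplicate is kept (here, the first one seen) and what counts as "blank" or "populated" are assumptions. The oversized-value rule is left out because the size limits and substitute values are not specified here.

```python
from typing import Any, Dict, Iterable, List

Event = Dict[str, Any]


def is_blank(value: Any) -> bool:
    """Assumption: None and empty or whitespace-only strings count as blank."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def dedupe_by_uuid(events: Iterable[Event]) -> List[Event]:
    """Keep the first event seen for each uuid (an assumption)."""
    seen = set()
    unique = []
    for event in events:
        if event["uuid"] not in seen:
            seen.add(event["uuid"])
            unique.append(event)
    return unique


def fill_event_name(event: Event) -> Event:
    """Set event_name to entity_id when event_name is blank."""
    if is_blank(event.get("event_name")):
        return {**event, "event_name": event.get("entity_id")}
    return event


def is_complete_conversion(event: Event) -> bool:
    """A conversion is kept if at least one of the listed conditions holds."""
    return (
        not is_blank(event.get("entity_id"))
        or event.get("revenue") is not None
        or event.get("value") is not None
        or event.get("event_type") == "client_activation"
    )


def transform_conversions(events: Iterable[Event]) -> List[Event]:
    deduped = dedupe_by_uuid(events)
    named = [fill_event_name(e) for e in deduped]
    return [e for e in named if is_complete_conversion(e)]


# Hypothetical sample rows for illustration only.
raw = [
    {"uuid": "a", "event_name": "", "entity_id": "123", "event_type": "custom"},
    {"uuid": "a", "event_name": "", "entity_id": "123", "event_type": "custom"},
    {"uuid": "b", "event_name": "orphan", "entity_id": None, "event_type": "custom"},
    {"uuid": "c", "event_name": "activate", "entity_id": None,
     "event_type": "client_activation"},
]
cleaned = transform_conversions(raw)
print([(e["uuid"], e["event_name"]) for e in cleaned])
```

In this sample, the duplicate of event a is dropped, its blank name is filled from entity_id, and event b is removed because none of the four conditions holds.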
Access and inspect Experimentation Events Export data
The data is stored in Parquet-formatted file parts. The number of file parts fluctuates and depends on the overall experiment record volume. parquet-tool is a handy tool for inspecting the file content and schema.
The data is encrypted using AWS KMS. To access the data, you need one of the following clients:
- AWS CLI version 1.11.108 or later.
- An AWS SDK released after May 2016.
AWS S3 partitioning
Your event data is exported in the following S3 partitions:
Decisions – s3://optimizely-events-data/v1/account_id=<account_id>/type=decisions/date={YYYY-MM-DD}/experiment=<experiment_id>
Conversions – s3://optimizely-events-data/v1/account_id=<account_id>/type=events/date={YYYY-MM-DD}/event=<event_name>
Legend
- optimizely-events-data – S3 bucket name.
- account_id – Your unique account identifier.
- date – The data creation date.
- experiment_id – Unique experiment identifier (used for the decisions partition).
- event_name – Event (entity) identifier (used for the conversions partition).
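The partition layout above can be expressed as two small helpers. This is only a sketch: the function names and the sample identifiers are hypothetical, while the bucket name and path segments are copied from the partition templates above.

```python
from datetime import date

BUCKET = "optimizely-events-data"


def decisions_partition(account_id: str, day: date, experiment_id: str) -> str:
    """Build the S3 URI of one decisions partition."""
    return (
        f"s3://{BUCKET}/v1/account_id={account_id}"
        f"/type=decisions/date={day.isoformat()}/experiment={experiment_id}"
    )


def conversions_partition(account_id: str, day: date, event_name: str) -> str:
    """Build the S3 URI of one conversions partition."""
    return (
        f"s3://{BUCKET}/v1/account_id={account_id}"
        f"/type=events/date={day.isoformat()}/event={event_name}"
    )


# Hypothetical identifiers for illustration only.
print(decisions_partition("12345", date(2023, 10, 9), "67890"))
print(conversions_partition("12345", date(2023, 10, 9), "purchase"))
```

`date.isoformat()` produces the YYYY-MM-DD form used in the date segment.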
Status file
The daily partition files are ready for import when the _SUCCESS file is available at:
Decisions – s3://optimizely-events-data/v1/account_id={account_id}/type=decisions/date={YYYY-MM-DD}/_SUCCESS
Conversions (Events) – s3://optimizely-events-data/v1/account_id={account_id}/type=events/date={YYYY-MM-DD}/_SUCCESS
Check for this file at regular intervals. Once it is present, it is safe to start importing the daily partition data.
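One way to check for the status file at regular intervals is a simple polling loop. The sketch below is illustrative only: `success_marker` and `wait_for_success` are hypothetical names, and the actual existence check is passed in as a callable so that you can back it with whichever S3 client you use. The interval and attempt count are arbitrary defaults.

```python
import time
from datetime import date
from typing import Callable

BUCKET = "optimizely-events-data"


def success_marker(account_id: str, event_type: str, day: date) -> str:
    """Build the URI of the _SUCCESS file for one daily partition."""
    if event_type not in ("decisions", "events"):
        raise ValueError("event_type must be 'decisions' or 'events'")
    return (
        f"s3://{BUCKET}/v1/account_id={account_id}"
        f"/type={event_type}/date={day.isoformat()}/_SUCCESS"
    )


def wait_for_success(
    marker_uri: str,
    exists: Callable[[str], bool],
    interval_seconds: float = 300,
    max_attempts: int = 12,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until exists(marker_uri) is true or the attempts run out."""
    for attempt in range(max_attempts):
        if exists(marker_uri):
            return True
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # wait before the next check
    return False


# Demonstration with a stand-in check that reports the file as present.
marker = success_marker("12345", "decisions", date(2023, 10, 9))
print(marker, wait_for_success(marker, exists=lambda uri: True))
```

Injecting `exists` and `sleep` keeps the loop independent of any particular AWS client and makes it easy to exercise without waiting.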
File structure
The Experimentation Events Export has the following file structure:
- Decisions – All visitor bucketing information for experiments.
- Experiment
- Day
- Experiment
- Events – (Conversions) All events tracked, such as click, hover, or anything else that might be configured.
- Event
- Day
- Event
Schema
Decisions
Each row represents a single decision event.
Important
Optimizely no longer automatically generates the session_id field as of October 9, 2023. For more information, refer to the Experimentation Analytics October 2023 release notes.
Field | Type | Description |
---|---|---|
uuid | string (GUID) | Event UUID generated by the client. Used to de-duplicate events. |
timestamp | long (time millis) | Event timestamp in milliseconds (in UTC). |
process_timestamp | long (time millis) | Timestamp in milliseconds set by the server. Indicates when the event was processed by the server (in UTC). |
visitor_id | string | The user or visitor identifier set by the client. |
session_id ** | string | Unique session ID. Optimizely no longer automatically generates the session_id as of October 9, 2023. |
account_id | string | Account identifier. |
campaign_id | string | Campaign ID. |
experiment_id | string | Experiment ID. Set to NULL if the user is evaluated for the experiment and fails the audience condition. |
variation_id | string | Unique variation ID. Set to NULL if the user is evaluated for the experiment and fails the audience condition. |
attributes | array<id:string, name:string, type:string, value:string> | An array of user attributes (also known as segments). |
user_ip | string | User IP address. |
user_agent | string | User-agent. |
referrer | string | Client referrer (the page from which the event was sent). |
is_holdback | boolean | A boolean indicating if the visitor is bucketed into the experiment based on the traffic allocation settings. If set to true, the user was excluded from the experiment. |
revision | string | Client snippet revision. |
client_engine | string | Client engine string (for example, 'js' or 'node-sdk'). |
client_version | string | Client version. |
activation_id | string | The activationTimeStamp is only available for Web Experimentation products (including Performance Edge). Used to calculate bounce and exit metrics. |
anonymize_ip | boolean | Optimizely typically stores the client IP address for each request. If this flag is true, the last octet of the IP will be truncated before it is stored. If false, the entire IP address will be stored. This is most relevant for consumers of this API that are implemented in a web browser or mobile client context who are subject to policies or regulation restricting the storage of end-user identifying information. This flag is independent of the IP anonymization setting in the Account and Project settings, which only controls how Optimizely clients set this flag. If this flag is set, care must be taken when using the IP filtering feature, as fully-qualified explicit IP addresses will not function as filters (anonymization occurs before events are filtered by IP). |
** Optimizely no longer automatically generates the session_id field as of October 9, 2023. See the Experimentation Analytics October 2023 release notes.
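As an illustration of how the decisions fields might be used, the following sketch counts unique visitors per experiment and variation from rows already loaded as dictionaries. It skips rows where experiment_id or variation_id is NULL (the visitor failed the audience condition) and rows where is_holdback is true. This is a simplified assumption for illustration and is not necessarily how Optimizely counts visitors. The function name and sample rows are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Tuple


def visitors_per_variation(
    decisions: Iterable[Dict[str, Any]],
) -> Dict[Tuple[str, str], int]:
    """Count distinct visitor_id values per (experiment_id, variation_id)."""
    visitors = defaultdict(set)
    for row in decisions:
        # NULL IDs mean the visitor failed the audience condition.
        if row.get("experiment_id") is None or row.get("variation_id") is None:
            continue
        # Held-back visitors were excluded from the experiment.
        if row.get("is_holdback"):
            continue
        key = (row["experiment_id"], row["variation_id"])
        visitors[key].add(row["visitor_id"])
    return {key: len(ids) for key, ids in visitors.items()}


# Hypothetical sample rows for illustration only.
rows = [
    {"visitor_id": "v1", "experiment_id": "e1", "variation_id": "a", "is_holdback": False},
    {"visitor_id": "v1", "experiment_id": "e1", "variation_id": "a", "is_holdback": False},
    {"visitor_id": "v2", "experiment_id": "e1", "variation_id": "b", "is_holdback": False},
    {"visitor_id": "v3", "experiment_id": "e1", "variation_id": "b", "is_holdback": True},
    {"visitor_id": "v4", "experiment_id": None, "variation_id": None, "is_holdback": False},
]
print(visitors_per_variation(rows))
```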
Conversions
Each row represents a single conversion event.
Important
Optimizely no longer automatically generates the session_id and experiments fields as of October 9, 2023. See the Experimentation Analytics October 2023 release notes.
Field | Type | Description |
---|---|---|
uuid | string (GUID) | Event UUID generated by the client. Used to de-duplicate events. |
timestamp | long (time millis) | Event timestamp in milliseconds (in UTC). |
process_timestamp | long (time millis) | Timestamp in milliseconds set by the server. Indicates when the event was processed by the server (in UTC). |
visitor_id | string | User identifier set by the client. |
session_id ** | string | Unique session ID. Optimizely no longer automatically generates the session ID as of October 9, 2023. |
account_id | string | Unique account identifier. |
experiments ** | array<campaign_id:string, experiment_id:string, variation_id:string, is_holdback:boolean> | An array of the campaigns, experiments, and variations the event is attributed to. Optimizely no longer automatically generates the experiments field as of October 9, 2023. |
properties | map<key:string, value:string> | Key-value pairs defining properties or characteristics of the event. The value must be a string. |
entity_id | string | Event entity identifier. |
attributes | array<id:string, name:string, type:string, value:string> | An array of user attributes (also known as segments). |
user_ip | string | User IP address. |
user_agent | string | User-agent. |
referrer | string | Client referrer (the page from which the event was sent). |
event_type | string | Event type (click, pageview, custom, or client_activation). |
event_name | string | Friendly event name (from a client or 'multi-event' in case of multiple events). |
revenue | long | Revenue (in cents). |
project_id | string | (Optional) The project_id needs only to be passed if you are using the Optimizely Recommendations product. |
value | double | The value used to compute value or numeric metrics. |
quantity | long | Quantity metric value. |
tags | map<key:string, value:string> | Key-value pairs of event tags. |
revision | string | Client snippet revision. |
client_engine | string | Client engine string (for example, 'node-sdk'). |
client_version | string | Client version. |
activation_id | string | The activationTimeStamp is only available for Web Experimentation products (including Performance Edge). Used to calculate bounce and exit metrics. |
anonymize_ip | boolean | Optimizely typically stores the client IP address for each request. If this flag is true, the last octet of the IP will be truncated before it is stored. If false, the entire IP address will be stored. This is most relevant for consumers of this API that are implemented in a web browser or mobile client context who are subject to policies or regulation restricting the storage of end-user identifying information. This flag is independent of the IP anonymization setting in the Account and Project settings, which only controls how Optimizely clients set this flag. If this flag is set, care must be taken when using the IP filtering feature, as fully-qualified explicit IP addresses will not function as filters (anonymization occurs before events are filtered by IP). |
** Optimizely no longer automatically generates the session_id and experiments fields as of October 9, 2023. See the Experimentation Analytics October 2023 release notes.
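Because the revenue field is stored as a whole number of cents, it usually needs converting before reporting. The sketch below sums revenue per event_name and converts the totals to major currency units. The function name and sample rows are hypothetical, and rows without revenue are simply skipped.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Any, Dict, Iterable


def revenue_by_event(conversions: Iterable[Dict[str, Any]]) -> Dict[str, Decimal]:
    """Sum revenue (stored in cents) per event_name, returned in major units."""
    totals_cents = defaultdict(int)
    for row in conversions:
        if row.get("revenue") is not None:
            totals_cents[row["event_name"]] += row["revenue"]
    # Divide once at the end so the sums stay exact integers until conversion.
    return {name: Decimal(cents) / 100 for name, cents in totals_cents.items()}


# Hypothetical sample rows for illustration only.
rows = [
    {"event_name": "purchase", "revenue": 1999},
    {"event_name": "purchase", "revenue": 501},
    {"event_name": "upgrade", "revenue": 999},
    {"event_name": "click", "revenue": None},
]
print(revenue_by_event(rows))
```

Using Decimal rather than float avoids rounding drift when totals are later compared or aggregated.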