Data specification
Data specification for Optimizely's Experimentation Events Export.
Use the Experimentation Events Export when you need to analyze experiment data outside of Optimizely. For example, you might analyze the data in a data warehouse, business intelligence tool, or custom statistical pipeline. The Experimentation Events Export gives you access to the following data:
- Decision events (also known as impressions) – Events Optimizely Experimentation fires when it assigns a visitor to an experiment-and-variation pair.
- Decision events in Optimizely Feature Experimentation.
- Decision events in Optimizely Web Experimentation.
- Conversion events (also known as events) – Events Optimizely Experimentation fires when a visitor converts, for example through a click, page view, or purchase.
- Conversion events in Optimizely Feature Experimentation.
- Conversion events in Optimizely Web Experimentation.
Your export contains the same decision and conversion event data Optimizely uses to render the Experimentation Results page. Your analysis therefore aligns with what the Results page displays. For more, see How Optimizely Experimentation counts conversions.
Event content
Your event data includes the following fields, among others:
- Event timestamp.
- Visitor ID.
- Event (entity) ID and event name.
- User attributes.
- Event tags.
- Additional visitor and event metadata.
- Server process timestamp.
Important: As of October 9, 2023, Optimizely no longer generates the session_id field on decision events, or the session_id and experiments fields on conversion events. For more, see the Experimentation Analytics October 2023 release notes.
For the complete field reference, see the Schema section.
Event data transformation
Optimizely preprocesses the data in the Experimentation Events Export to ensure accuracy, consistency, and usability. These transformations run on every export, so consumers downstream see the cleaned data, not the raw client events. The export applies the following transformations:
- Remove duplicate events – Deduplicates events based on uuid.
- Populate missing event names – Sets event_name to entity_id if event_name is blank.
- Filter out incomplete conversions – Removes a conversion unless entity_id, revenue, or value is populated, or event_type is client_activation.
- Replace oversized values – Substitutes event_name or experiment_id if the original value exceeds size limits.
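The first three transformations can be illustrated with a minimal Python sketch. This is not Optimizely's implementation: the function name transform_conversions, the choice to keep the first event seen for each uuid, the order of the steps, and the reading of "populated" as "neither None nor an empty string" are all assumptions made for illustration. The oversized-value replacement is omitted because the size limits and substitute values are not stated here.

```python
def _populated(value):
    # Assumption: "populated" means neither None nor an empty string.
    return value is not None and value != ""


def transform_conversions(events):
    """Apply dedup, name backfill, and completeness filtering to event dicts."""
    seen_uuids = set()
    cleaned = []
    for original in events:
        # Remove duplicate events: keep the first event seen for each uuid.
        if original["uuid"] in seen_uuids:
            continue
        seen_uuids.add(original["uuid"])

        event = dict(original)  # copy so the caller's data is not mutated

        # Populate missing event names: fall back to entity_id.
        if not _populated(event.get("event_name")):
            event["event_name"] = event.get("entity_id")

        # Filter out incomplete conversions: keep the event only if
        # entity_id, revenue, or value is populated, or it is a client_activation.
        has_payload = any(
            _populated(event.get(field)) for field in ("entity_id", "revenue", "value")
        )
        if has_payload or event.get("event_type") == "client_activation":
            cleaned.append(event)
    return cleaned
```

The export already applies these steps, so a sketch like this is mainly useful for understanding why raw client events and exported rows can differ in count.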
Access the export data
The data is stored in Parquet-formatted file parts. The number of file parts depends on your total event volume. Use parquet-tool to inspect file contents and schemas.
Your data is available in Amazon S3 and Google Cloud Storage (GCS). The format and schema are identical. Only the storage location and credentials differ. See the Authentication API for how to obtain credentials for each.
The S3 data is encrypted using AWS-KMS. To access it, you need one of the following clients:
- AWS CLI version 1.11.108 or later
- AWS SDKs released after May 2016
The GCS data is encrypted at rest by Google Cloud Storage. To access it, use the Google Cloud CLI (gcloud storage) or a Google Cloud Storage client library.
AWS S3 partitions
Your event data is exported in the following S3 partitions:
Decisions – s3://optimizely-events-data/v1/account_id=ACCOUNT_ID/type=decisions/date=YYYY-MM-DD/experiment=EXPERIMENT_ID
Conversions – s3://optimizely-events-data/v1/account_id=ACCOUNT_ID/type=events/date=YYYY-MM-DD/event=EVENT_NAME
Legend
- optimizely-events-data – S3 bucket name.
- account_id – Your unique account identifier.
- date – The data creation date.
- experiment_id – Unique experiment identifier (used for the decisions partition).
- event_name – Event (entity) identifier (used for the conversions partition).
GCS partitions
Your event data is exported to a per-account GCS bucket in the following partitions:
Decisions – gs://optimizely-e3-ACCOUNT_ID/v1/account_id=ACCOUNT_ID/type=decisions/date=YYYY-MM-DD/experiment=EXPERIMENT_ID
Conversions – gs://optimizely-e3-ACCOUNT_ID/v1/account_id=ACCOUNT_ID/type=events/date=YYYY-MM-DD/event=EVENT_NAME
Legend
- optimizely-e3-ACCOUNT_ID – GCS bucket name (one bucket per account).
- account_id – Your unique account identifier.
- date – The data creation date.
- experiment_id – Unique experiment identifier (used for the decisions partition).
- event_name – Event (entity) identifier (used for the conversions partition).
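The partition layouts above follow one pattern across both stores, so the paths can be built programmatically. The following is a minimal sketch; the function name partition_prefix and its parameters are hypothetical, and the account, experiment, and date values in the usage example are placeholders.

```python
from datetime import date


def partition_prefix(store, account_id, kind, day, name=None):
    """Build a partition path for store "s3" or "gcs" and kind "decisions" or "events"."""
    if store == "s3":
        root = "s3://optimizely-events-data"
    elif store == "gcs":
        root = f"gs://optimizely-e3-{account_id}"  # one bucket per account
    else:
        raise ValueError(f"unknown store: {store!r}")

    # Decisions are partitioned by experiment, conversions by event.
    leaf_keys = {"decisions": "experiment", "events": "event"}
    if kind not in leaf_keys:
        raise ValueError(f"unknown kind: {kind!r}")

    path = f"{root}/v1/account_id={account_id}/type={kind}/date={day.isoformat()}"
    if name is not None:
        path += f"/{leaf_keys[kind]}={name}"
    return path


# Placeholder identifiers, for illustration only.
print(partition_prefix("s3", "12345", "decisions", date(2023, 10, 9), "678"))
```

Omitting name yields the daily partition prefix, which is the directory that holds the status file.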
Status file
The daily partition files are ready for import when the _SUCCESS file is available.
Amazon S3
Decisions – s3://optimizely-events-data/v1/account_id=ACCOUNT_ID/type=decisions/date=YYYY-MM-DD/_SUCCESS
Conversions (Events) – s3://optimizely-events-data/v1/account_id=ACCOUNT_ID/type=events/date=YYYY-MM-DD/_SUCCESS
Google Cloud Storage
Decisions – gs://optimizely-e3-ACCOUNT_ID/v1/account_id=ACCOUNT_ID/type=decisions/date=YYYY-MM-DD/_SUCCESS
Conversions (Events) – gs://optimizely-e3-ACCOUNT_ID/v1/account_id=ACCOUNT_ID/type=events/date=YYYY-MM-DD/_SUCCESS
Poll for this file. When it is present, import the daily partition data.
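A polling loop for the status file might look like the sketch below. It is deliberately storage-agnostic: exists is a hypothetical callable that you supply (for example, a wrapper around an object-metadata lookup in your S3 or GCS client), and the interval and attempt count are arbitrary illustrative defaults, not recommended values.

```python
import time


def wait_for_success(exists, success_path, interval_seconds=300, max_attempts=12,
                     sleep=time.sleep):
    """Return True once exists(success_path) is truthy, or False after max_attempts checks."""
    for attempt in range(max_attempts):
        if exists(success_path):
            return True
        # Wait between checks, but not after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    return False
```

Start the import of the daily partition only after the function returns True; a False result means the partition was still incomplete when polling stopped.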
File structure
The Experimentation Events Export has the following file structure:
- Decisions – Visitor bucketing information for experiments. Bucketing is the process of assigning a visitor to one of the variations in an experiment, based on their identifier and the experiment's traffic allocation.
- Experiment
- Day
- Experiment
- Events (conversions) – All tracked events, such as a click, a hover, or any other configured event.
- Event
- Day
- Event
Schema
Decisions
Each row represents a single decision event.
| Field | Type | Description |
|---|---|---|
| | string (GUID) | Event UUID generated by the client. Used to deduplicate events. |
| | long (time millis) | Event timestamp – UTC, UNIX epoch milliseconds. |
| | long (time millis) | Server process timestamp – UTC, UNIX epoch milliseconds. Indicates when the server processed the event. |
| | string | The user or visitor identifier set by the client. |
| | string | Unique session ID. As of October 9, 2023, Optimizely no longer generates |
| | string | Account identifier. |
| | string | Campaign ID. |
| | string | Experiment ID. Set to |
| | string | Unique variation ID. Set to |
| | array<id:string, name:string, type:string, value:string> | An array of user attributes (also known as segments). |
| | string | User IP address. |
| | string | User-agent. |
| | string | Client referrer (the page from which the event was sent). |
| | boolean | Indicates whether traffic allocation excluded the visitor from the experiment. When |
| | string | Client snippet revision. |
| | string | Client engine string (for example, |
| | string | Client version. |
| | string | Available only for Web Experimentation products, including Performance Edge. |
| | boolean | When set to This flag operates independently of the IP-anonymization setting in Account and Project settings. When |
** As of October 9, 2023, Optimizely no longer generates session_id. See the Experimentation Analytics October 2023 release notes.
Conversions
Each row represents a single conversion event.
| Field | Type | Description |
|---|---|---|
| | string (GUID) | Event UUID generated by the client. Used to deduplicate events. |
| | long (time millis) | Event timestamp – UTC, UNIX epoch milliseconds. |
| | long (time millis) | Server process timestamp – UTC, UNIX epoch milliseconds. Indicates when the server processed the event. |
| | string | User identifier set by the client. |
| | string | Unique session ID. As of October 9, 2023, Optimizely no longer generates |
| | string | Unique account identifier. |
| | array<campaign_id:string, experiment_id:string, variation_id:string, is_holdback:boolean> | An array of the campaigns, experiments, and variations the event is attributed to. As of October 9, 2023, Optimizely no longer generates the |
| | map<key:string, value:string> | Key-value pairs that define properties of the event. The value must be a string. |
| | string | Event entity identifier. |
| | array<id:string, name:string, type:string, value:string> | An array of user attributes (also known as segments). |
| | string | User IP address. |
| | string | User-agent. |
| | string | Client referrer (the page from which the event was sent). |
| | string | Event type – one of |
| | string | Human-readable event name from the client, or |
| | long | Revenue (in cents). |
| | string | (Optional) Pass |
| | double | The value used to compute value or numeric metrics. |
| | long | Quantity metric value. |
| | map<key:string, value:string> | Key-value pairs of event tags. |
| | string | Client snippet revision. |
| | string | Client engine string (for example, |
| | string | Client version. |
| | string | Available only for Web Experimentation products, including Performance Edge. Used to calculate bounce and exit metrics. |
| | boolean | When set to This flag operates independently of the IP-anonymization setting in Account and Project settings. When |
** As of October 9, 2023, Optimizely no longer generates session_id or experiments. See the Experimentation Analytics October 2023 release notes.
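Two of the schema's encodings usually need conversion before analysis: timestamps are UTC UNIX epoch milliseconds, and revenue is stored in cents. The helpers below are a minimal sketch of those conversions; the function names are hypothetical, and the input values in the usage lines are made up for illustration.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_utc_datetime(epoch_millis):
    """Convert UNIX epoch milliseconds to a timezone-aware UTC datetime."""
    # Integer timedelta arithmetic avoids float rounding on the millisecond part.
    return _EPOCH + timedelta(milliseconds=epoch_millis)


def revenue_from_cents(cents):
    """Convert a revenue value in cents to whole currency units as a Decimal."""
    return Decimal(cents) / Decimal(100)


def processing_delay_ms(event_millis, process_millis):
    """Milliseconds between the event timestamp and the server process timestamp."""
    return process_millis - event_millis


print(to_utc_datetime(1696809600000).isoformat())
print(revenue_from_cents(1999))
```

Decimal is used for revenue so that sums over many rows do not accumulate binary floating-point error.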
