Data Specification

❗️

Deprecation Notice

Enriched Events Export replaces Results Export and Raw Events Export as Optimizely's source-of-truth events dataset. Results Export and Raw Events Export will no longer be supported beginning November 15, 2020. Please refer to the Enriched Events Export documentation for how to get started.

Technical details

The Results Export process generates files containing the results records created in the past 24 hours (00:00 - 23:59 UTC). The files are stored in Apache Parquet format and are partitioned according to the scheme described below.

AWS S3

This section describes the Results Export files that you’ll retrieve from your Optimizely AWS S3 bucket.

S3 Path

s3://optimizely-rex/{accountId}/results/yyyy/mm/dd/{experimentId}/{fileName}

Legend

  • optimizely-rex: S3 bucket name
  • accountId: Your unique account identifier
  • yyyy/mm/dd: Creation date of the export
  • experimentId: Unique experiment identifier
  • fileName: One or more file parts that store the results records in Parquet format. The fileName follows the format filepartnum.snappy.parquet

Notes

  • The daily partition contains results records created on that day based on the received time of the events.
  • The number of file parts is variable and depends on the experiment record volume.
  • For Personalization campaigns, a separate subfolder contains the holdback variation data: s3://optimizely-rex/{accountId}/results-holdback/yyyy/mm/dd/{experimentId}/{fileName}
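
As an illustration of the path layout above, the following is a minimal sketch that builds the prefix for one daily partition. The function names and the account and experiment IDs used with them are hypothetical; only the bucket name and folder layout come from the legend above.

```python
from datetime import date

BUCKET = "optimizely-rex"  # S3 bucket name from the legend above


def results_prefix(account_id: int, export_date: date, holdback: bool = False) -> str:
    """Return the S3 URI prefix for one daily partition (hypothetical helper)."""
    # Personalization holdback data lives in a separate "results-holdback" folder.
    folder = "results-holdback" if holdback else "results"
    # Dates are zero-padded, matching the yyyy/mm/dd layout.
    return f"s3://{BUCKET}/{account_id}/{folder}/{export_date:%Y/%m/%d}/"


def experiment_prefix(account_id: int, export_date: date,
                      experiment_id: int, holdback: bool = False) -> str:
    """Return the prefix under which one experiment's file parts are stored."""
    return f"{results_prefix(account_id, export_date, holdback)}{experiment_id}/"
```

Because the number of file parts varies, an import job would typically list everything under the experiment prefix rather than guess individual file names.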

Status file

The daily partition files are ready for import when the _SUCCESS file is available at either:

  • s3://optimizely-rex/{accountId}/results/yyyy/mm/dd/_SUCCESS
  • s3://optimizely-rex/{accountId}/results-holdback/yyyy/mm/dd/_SUCCESS

Optimizely recommends that you check for the presence of this file at regular intervals. If it's present, it's safe to start importing the daily partition data.
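
The polling recommendation above can be sketched roughly as follows. This is not an official client: the function names, the five-minute interval, and the retry limit are assumptions, and the existence check is passed in as a callable so the sketch stays independent of any particular S3 library.

```python
import time
from typing import Callable


def success_key(account_id: int, yyyy_mm_dd: str, holdback: bool = False) -> str:
    """Return the object key of the _SUCCESS marker for one daily partition."""
    folder = "results-holdback" if holdback else "results"
    return f"{account_id}/{folder}/{yyyy_mm_dd}/_SUCCESS"


def wait_for_success(key_exists: Callable[[str], bool], key: str,
                     interval_s: float = 300.0, max_checks: int = 12,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Check for the marker at regular intervals; return True once it is present.

    interval_s and max_checks are illustrative defaults, not documented values.
    """
    for attempt in range(max_checks):
        if key_exists(key):
            return True  # safe to start importing the daily partition
        if attempt < max_checks - 1:
            sleep(interval_s)  # wait before the next check
    return False  # marker never appeared within the allotted checks
```

With boto3, for example, key_exists could wrap a head_object call on the optimizely-rex bucket and treat a 404 ClientError as "not yet present".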

Schema field descriptions

Each row in the export is a results record that stores session-aggregated conversion data. The table below provides descriptions for the fields in the Results Export schema.

Retrieve identifiers

To retrieve the campaignId, experimentId, and variationId:

  • Web Experimentation and Personalization: The identifiers are in the API Names tab.
  • Full Stack: The identifiers are in the project's data file.
  • Alternatively, you can programmatically retrieve all identifiers via the REST API.

To retrieve the eventId: Find the event ID in the Manage Events dialog in the Optimizely application, or retrieve it programmatically via the REST API.

Field name (type): Description

  • id (string): Unique record identifier.
  • timestamp (long): The record timestamp, represented as the number of milliseconds since the Unix epoch. It roughly corresponds to the client timestamp of the first event in the record.
  • accountId (long): Unique account identifier.
  • projectId (long): Unique project identifier.
  • visitorId (string): The visitor identifier. For Web Experimentation and Personalization, this is the optimizelyEndUserId value stored in the Optimizely cookie. For Full Stack, this is the user ID provided by your app or service.
  • sessionId (int): Unique session identifier. Optimizely generates this identifier during the event sessionization process.
  • campaignId (string): The unique campaign identifier. For Web and Full Stack A/B experimentation, the campaignId is always equal to the experimentId.
  • experimentId (long): Unique experiment identifier.
  • variationId (string): Unique variation identifier.
  • eventId (long): Unique event identifier; this is the event entity ID in some Optimizely X products. By convention, Impression events have eventId equal to experimentId, so filter by experimentId to calculate total impression events for the results set. Overall Revenue events have eventId equal to -1, so filter by -1 to calculate Overall Revenue for the results set.
  • count (int): The count of conversions in the record.
  • revenue (long): The total revenue summed across all events in the record.
  • value (float): The total value summed across all events in the record; used for total value and other numeric metrics.
  • segments (array<id(long):value(string)>): An array of segments in the form segment_id:value. Note: The array contains the default or custom attributes attributed to the visitor in that session, enabling you to segment your results by one or more of these attributes. See "Segment your results in Optimizely X Web" and "Custom Attributes: Capture visitor data through the API in Optimizely X."
  • receivedTimestamp (long): The timestamp of when the record was created, represented as the number of milliseconds since the Unix epoch. It approximates the time Optimizely received the first event in the record.
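
To make the timestamp and eventId conventions above concrete, here is a minimal sketch that operates on records already loaded as Python dictionaries keyed by the schema field names. The helper names are hypothetical, and summing the count field of impression records to get total impressions is an assumption rather than a documented formula.

```python
from datetime import datetime, timezone

OVERALL_REVENUE_EVENT_ID = -1  # convention from the eventId description


def to_utc(ms_since_epoch: int) -> datetime:
    """Convert a timestamp or receivedTimestamp value to an aware UTC datetime."""
    return datetime.fromtimestamp(ms_since_epoch / 1000, tz=timezone.utc)


def total_impressions(records):
    """Sum counts over impression records (eventId equal to experimentId).

    Assumption: the count field of an impression record is its number of
    impression events.
    """
    return sum(r["count"] for r in records if r["eventId"] == r["experimentId"])


def overall_revenue(records):
    """Sum revenue over Overall Revenue records (eventId equal to -1)."""
    return sum(r["revenue"] for r in records
               if r["eventId"] == OVERALL_REVENUE_EVENT_ID)
```

The same filters translate directly to a WHERE clause if the Parquet files are queried with SQL instead.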

Usage notes

🚧

Important

The Results Export exports a daily copy of the results data. Any changes made to the Results page after a daily export has taken place will be reflected on the Results page retroactively but will not be included in the exported data. Examples of this are:

  • If you reset the results, Optimizely resets data on the Results page but not in the exported data.
  • If you add a metric, Optimizely retroactively calculates the metric on the Results page but doesn't include it in the exported data.

When using the Results Export to analyze cross-product results, be aware of these differences:

  • Web and Full Stack A/B testing: Uses visitor-counting, so you must use visitor-aggregated queries in the export to match the results.
  • Personalization: Uses session-counting, so you must use session-aggregated queries to match the results.

See the examples in SQL Examples for Common Metrics Calculations. For more information on how Optimizely counts conversions and to understand the data pipeline, see the KB articles noted in Resources.
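
The difference between visitor-aggregated and session-aggregated counting can be illustrated with a small sketch. It is not Optimizely's actual query: the function name is hypothetical, records are assumed to be dictionaries keyed by the schema field names, and a session is identified by the (visitorId, sessionId) pair in case sessionId values repeat across visitors.

```python
def converting_units(records, event_id, by="visitor"):
    """Count distinct converting visitors or sessions per variation.

    by="visitor" mirrors visitor-counting (Web and Full Stack A/B testing);
    by="session" mirrors session-counting (Personalization).
    """
    seen = {}
    for r in records:
        # Only records for the requested event with at least one conversion.
        if r["eventId"] != event_id or r["count"] <= 0:
            continue
        if by == "visitor":
            unit = r["visitorId"]
        else:
            unit = (r["visitorId"], r["sessionId"])
        seen.setdefault(r["variationId"], set()).add(unit)
    return {variation: len(units) for variation, units in seen.items()}
```

A visitor who converts in two separate sessions contributes one unit under visitor-counting and two under session-counting, which is why the two aggregation styles can disagree on the same export.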

Retention Policy

To comply with GDPR requirements, Optimizely retains the files in your Data Export bucket for both export services for 30 days. Older data is automatically deleted. To retain the data for a longer period of time, ensure that your import process archives the files to your data warehouse at least once every 30 days.

Encryption

Your Data Export data is encrypted. To access the data, you need one of the following clients, at minimum:

  • Amazon S3 console
  • AWS CLI version 1.11.108 and later
  • AWS SDKs released after May 2016