BigQuery

Contents: [ Schema , Table Examples , Nested and Repeated Fields ]
[Using One Table for All Events, Using One Table for Each Event, Note , One Table for All Events, One Table for Each Event, Note ]

Visit the Dev Docs

A better version of this page exists at https://developer.mixpanel.com/docs/mixpanel-bigquery-export-design.

This guide describes how Mixpanel exports your data to a Google BigQuery dataset.

You must provide a Google group email address to use the BigQuery export by using Mixpanel's Data Warehouse Export API.

Mixpanel exports transformed data into BigQuery at a specified interval.

Mixpanel creates a dataset in its own BigQuery instance and gives "View" access to the account(s) provided at the time of creating the pipeline.

Mixpanel applies transformation rules to make the data compatible with data warehouses.

Schema

This section describes the schemas for BigQuery. Event data and user data are loaded into separate BigQuery tables. User profile data is loaded into one table, and events are loaded into either a single table for each event or one table for all events.

Using One Table for All Events

You can export all of your Mixpanel data into a single BigQuery table. Mixpanel recommends a single table schema because it makes querying the data simpler.

If you select the single table schema, Mixpanel creates a mp_master_event table.

The table has one column per unique property name across all events in the history of the dataset.

Suppose you have an event with a purchase_amount property and one with a referrer property, the table will have a purchase_amount column and a referrer column. Events that don’t have a property contain a NULL value in that column.

For examples of one table for all events, see One Table for All Events.

Using One Table for Each Event

Note

One table for each event is not available during the trial period.

Each Mixpanel event can be loaded into its own table.

If you select the table per event option, Mixpanel creates one table per unique event name. Each table is named for the event. For example, if you have an event called signup, the table name is signup.

Each table for each event has one column per unique property name across all events in the history of the dataset.

For an example of one table for each event, see One Table for Each Event.

Table Examples

Here is an example dataset with different schema options in either one table for all events or one table for each event.

{
       “event”: “Signup”, 
       “properties”: {
       “plan”: “free”, 
       “browser”: “chrome”, 
       “distinct_id”: 1, 
       “time”: 123
       }
}
{
       “event”: “Purchase”, 
       “properties”: {
       “product_id”: “123”, 
       “browser”: “chrome”, 
       “distinct_id”: 1, 
       “time”: 124
       }
}
{
       “event”: “Signup”, 
       “properties”: {
       “plan”: “paid”, 
       “browser”: “firefox”, 
       “ab_test”: 1, 
       “distinct_id”: 2, 
       “time”: 125
       }

One Table for All Events

Single table: mixpanel_events

mp_event_name
time
distinct_id
plan
browser
product_id
ab_test

“Signup”

123

1

“free”

“chrome”

NULL

NULL

“Purchase”

124

1

NULL

“chrome”

“123”

NULL

“Signup”

125

2

“paid”

“firefox”

NULL

1

One Table for Each Event

Two tables: signup and purchase

Table: signup

time
distinct_id
plan
browser
ab_test

123

1

“free”

“chrome”

125

2

“paid”

“firefox”

1

Table: purchase

time
distinct_id

124

1

“chrome”

“123”

Nested and Repeated Fields

Note

Nested and repeated fields are not supported during the trial period. The values are flattened currently.

BigQuery supports nested and repeated fields. Mixpanel exports array properties as repeated fields and complex objects as nested fields into BigQuery.

Is this article helpful?

Comments

0 comments

Please sign in to leave a comment.