Introduction
There is no perfect data model, and we can’t always have all the data prepared exactly right to answer every possible question. Moreover, there are times when the relevant data doesn’t make it into Mixpanel or the data actually changes after being sent into Mixpanel, which makes answering specific behavioral questions very difficult, if not impossible.
Enter Lookup tables. Lookup tables provide the ability for Mixpanel customers to enrich their event and profile properties within Mixpanel with additional data with a simple CSV upload or via APIs. This is very similar to VLOOKUP (Excel, Google Spreadsheet) and Dimension tables (Data Warehouses).
Example
Let’s say that a customer tracks purchases within Mixpanel, but they only track ProductID as a property, but not Price, Category or Color. They can easily augment and enrich their data within Mixpanel by uploading a products table with ProductID, Price, Category and color, and they can immediately start analyzing purchases not just by ID, but also by these new attributes of price, category and color, which show up just like any other property, and this analysis can be done retroactively as long as the identifier (in this case, ProductID) is present on events.
With just ProductID, questions can only be asked using ProductID and not with desired attributes like category, price or color.
Users might have the mapping between ProductID to category, color and price in a CSV, and it might look something like this:
Once they map this lookup table to the Mixpanel property (here, ProductID), then the available data to ask questions is not just ProductID, but also category, color and price:
Now each of the questions with the additional attributes (category, color, price) can be answered:
Lookup tables v/s Group properties v/s User properties
Just like User profiles augment details about the User, Lookup tables augment details about any property within Mixpanel, and here’s a quick overview of this and how user profiles, group profiles and lookup tables are different:
Use-cases
-
Enrich property values
- If you are an e-commerce customer and are tracking Purchases by Product, and have only been tracking the ProductID and the Product price, but now want to analyze Purchases by Product Category (e.g. Clothing, Shoes…) and Product Sub-Category (“Mens”, “Womens”, “Children”), you can upload your product catalog as a lookup table within Mixpanel, and can do the analysis for all Purchases by Product Category and Product Sub-Category, even retroactively.
- If you are a marketplace platform, and are tracking a OrderCompleted, and want to do revenue analysis, but don’t have the Commission Rate that needs to be paid to the seller at the time of tracking the OrderCompleted, you can upload the lookup table with OrderID and Commission Rate into Mixpanel, and do a complete revenue analysis.
- If you are a media platform, and are tracking video plays but were only tracking the “video name” but not the “genre”, you can upload a lookup table with the mapping of “video name” and “genre” and then do the video engagement analysis by genre within seconds.
- Before Lookup table mapping
- After lookup table mapping:
- Before Lookup table mapping
-
Change property values
- If you are a media company, and are tracking video categories as a property, but your category names have changed (Thriller -> Suspense & Thriller), you will end up seeing two groups of values for the same videos - all the video watches prior to the change will show up as Thriller, and the rest of the video watches after the change will show up as Suspense & Thriller, and this makes accurate calculations pretty annoying. If you upload your category table as a lookup table with the old values and the new values (Categoryname: Thriller, Categoryname(new): Suspense & Thriller) and then use the new property (Categoryname(new), you can now do all the video analysis with the correctly mapped category names without needing any data cleanup.
- If you are a media platform and are tracking the attribution channels for your signed up users, but changed from tracking Google as “Google Ads” -> “Google Search” and Google is now showing up as two different segments, you can fix this by uploading a lookup table of the attribution channel mapping of “Attribution channel” -> “New attribution channel”:
- Before lookup table mapping:
- After lookup table mapping:
- Before lookup table mapping:
-
Filter by a long list of values
- If you are a media company, and are trying to analyze the performance of a subset of all the videos in your catalog that were part of an experiment, you’d want the ability to filter down the report with a long list of video names that were considered in that experiment. Choosing property values one at a time within the UI can be quite cumbersome, so you can upload a lookup table of videos with an additional property (e.g. Part of Experiment?) of whether or not the video was a part of the experiment, and then filter the report by that new property (Filter -> Video name -> Part of Experiment? -> true).
- Before lookup table mapping:
- After lookup table mapping:
- Before lookup table mapping:
- If you are a media company, and are trying to analyze the performance of a subset of all the videos in your catalog that were part of an experiment, you’d want the ability to filter down the report with a long list of video names that were considered in that experiment. Choosing property values one at a time within the UI can be quite cumbersome, so you can upload a lookup table of videos with an additional property (e.g. Part of Experiment?) of whether or not the video was a part of the experiment, and then filter the report by that new property (Filter -> Video name -> Part of Experiment? -> true).
Who has access?
All users will be able to upload and map a lookup table to an existing property temporarily from within a report, but only users with "Admin" or "Owner" roles will be able to make the mapping persistent in Lexicon for other users in the project to use.
Customers on the Free plan will be able to temporarily map a property to a lookup table, but not have the option of persisting the mapping. (see "Temporarily map property to lookup table from report" below).
Project Owners can delete any table in a project, and Admins can only delete their own.
Consumers and Analysts cannot delete lookup tables even if they own them.
Step-by-step guide
Step 0: Know how lookup tables get mapped to existing Mixpanel properties
- As mentioned earlier, lookup tables are a way to add new properties based on the existing properties that are already being tracked within Mixpanel, so you’ll need to know what is the base property that is already being tracked.
- In our earliest example, ProductID is the event property that was already being tracked within Mixpanel.
- This base property will be used as the join key to connect the contents of the lookup table with the events data within Mixpanel.
- Here’s a more descriptive representation of this:
Events in Mixpanel
Event |
DistinctID |
ProductID |
Purchase |
U1 |
1 |
Purchase |
U2 |
1 |
Purchase |
U1 |
2 |
Products lookup table
ProductID |
Product Category |
Product Color |
Product price |
1 |
jeans |
blue |
39.99 |
2 |
tshirt |
white |
25 |
3 |
tie |
red |
20 |
Events table with mapped product lookup table
Event |
DistinctID |
ProductID |
Product Category |
Product Color |
Product price |
Purchase |
U1 |
1 |
jeans |
blue |
39.99 |
Purchase |
U2 |
1 |
jeans |
blue |
39.99 |
Purchase |
U1 |
2 |
tshirt |
white |
25 |
The ProductID is the join key that connects the Mixpanel events to the lookup table.
Step 1: Configure the CSV that you want to upload into Mixpanel
- Open a new CSV in your choice of spreadsheet software (Excel, Google Spreadsheet..)
- The column that represents the join key within the products lookup table needs to be added as the first column in the CSV.
- The name for that 1st column DOES NOT NEED to be the same as the join key (in this case, the first column in the products lookup table didn’t need to be “ProductID”) - it just needs to be the first column in the CSV.
- Now add a new column for each new attribute that you are trying to get into Mixpanel.
- From our earlier example, Product Category, Product color and Product price will be its own columns.
- Please note: the names of the other columns are important and it should be the string that you want to show users within the product, so if you've added Product Category, Product color and Product price as column names, the lookup profile properties will show up as Product Category, Product color and Product price.
- Now add all the values as separate rows.
- Save this CSV/Download the CSV. XLS files are not supported at this time. The file needs to be comma-separated in order for this import to work.
- Please note: There cannot be rows with duplicate values of the join key, as it is a 1:1 map with the property within Mixpanel.
Right:
ProductID |
Product Category |
Product Color |
Product price |
1 |
jeans |
blue |
39.99 |
2 |
tshirt |
white |
25 |
3 |
tie |
red |
20 |
Wrong:
ProductID |
Product Category |
Product Color |
Product price |
1 |
jeans |
blue |
39.99 |
1 |
jeans |
white |
45 |
2 |
tie |
red |
20 |
In this case, the second row with the duplicate ProductId will be dropped during the upload process.
Step 2: Upload CSV and map to an existing property within Mixpanel
- Go to Lexicon
- Navigate to lookup tables
- Hit “Import" -> "Lookup table”
- Click "Import CSV" and select the lookup table from your local machine.
- Click "Select a property" under "Map to Property" and select which event or user property you'd like to map this lookup table to.
- Click "Save".
Step 3: Analyze!
- Go to the Mixpanel report where you are doing your analysis
- Select the operation that you want to do (Filter/Breakdown/Aggregate) and then select the join key property (in our earlier example, ProductID) and then select which of the new properties you want to use for the analysis.
Before the Lookup table mapping:
After the lookup table mapping:
You can re-upload a new version of the lookup table as often as you want. This replaces the lookup table that you have (not an update new rows or columns, but a complete replace) but keeps all the existing property mappings in place, so the properties that were mapped to an earlier version of the lookup table doesn't need to be re-mapped after the lookup table has been replaced.
Re-map the Mixpanel property to a different lookup table
- Navigate to the property that you want the lookup table to be joined against (aka your join key)
- Select the property
- "Unmap from Lookup table"
- Once it is unmapped, Select “Map to lookup table”
- Select the lookup table from the list of lookup tables
- Select “Apply”
Share lookup table with colleagues
Uploaded lookup tables are private by default (just like reports, dashboards and custom events), but just like the other entities, you can now share lookup tables with specific team members, team(s) or the entire organization.
- Navigate to the Lookup tables tab in Lexicon
- Click on the lookup table that you'd like to share
- Click "Share" and then select who you'd like to share the lookup table with
- Done.
Temporarily map property to lookup table from report
If you just want to map a property to a lookup table to test it out without making that mapping available for all users in the project, you can do so straight from Insights, Funnels, Flows and Retention. Users with "Analyst" and "Consumer" role can avail of this to upload and map a property to a lookup table temporarily.
- Navigate to one of the core reports
- Filter/Breakdown by the property
- Click the overflow menu -> Map to lookup table
- Either select one of the existing lookup tables that have been uploaded and that you have access to OR upload a new lookup table from there.
- Hit "Apply".
Again, this does not change or modify the property for other users in the project, so feel free to keep experimenting with new versions of the lookup table till you can answer your question. If you feel like the mapping will be beneficial for other users in the project, you can ask your project admin to make this mapping persistent so other users can also avail of this in their analyses.
FAQ
Are there any limitations to the size of the lookup tables?
We currently have a limitation of 100 MB per imported lookup table, and up to 5 million rows of data across all lookup tables uploaded. Just to be clear, you can use multiple lookup tables in your projects, but the total count of rows has to be less than 5 million rows across all uploaded CSVs.
What happens if queries with mapped properties to lookup tables start failing?
Your queries will start failing with one of these 3 reasons:
- Lookup table size is too large
- What is it? We're failing the query because one of the mapped properties in the query is referencing a lookup table that exceeds the 100MB limit. This roughly maps to about 1 million rows of 2-3 columns worth of data, but your mileage may vary.
- What can you do to fix this? Upload a smaller version of the CSV (or programmatically delete lookup profiles from the lookup table to get it to around the 1 million rows mark and try again)
- You're querying too many lookup tables in one query
- What is it? We're failing the query because you are referencing multiple lookup tables and the combined size of the lookup tables exceeds the 100MB limit.
- What can you do to fix this? Try to exclude referencing some of the mapped properties in that query. For e.g. if you have a mapped property for ProductID (to Product category, color) and CountryID (to Country name, Region, GDP per capita..), try to exclude one of the two mapped properties from this query and try again.
- Too many concurrent updates to the lookup table (limited to when you are using an API)
- What is it? Your queries are failing because the rate of programmatic updates to the lookup tables exceeds our limits.
- What can you do to fix this? Try to space out programmatic updates. Wait a few minutes for the updates to get processed and then try again.
How do I specify which column in the CSV should be used to join/match against the Mixpanel property?
By default, the first column in the CSV is treated as the join key. Which means if i'm adding a lookup profile for "ProductID" (Mixpanel event property), then I'll need to have the ProductID in the first column of the uploaded CSV. The name of that column doesn't have to be the same as the Mixpanel property (it doesn’t have to be “ProductID”); it just has to be the first column.
Can the property that I'm using to join with the lookup table be a list property?
YES! You can just map a list property to a lookup table the same way you'd map a string property. Here is a detailed article that talks about how filters and breakdowns work with list properties in Mixpanel (with and without lookup tables).
I see that numbers and dates in the lookup table are all being imported as strings. Is there a way to specify the data type of these properties when I'm importing the CSV?
At this moment, all the data in the lookup table is treated as strings, and users can typecast the values within each of the reports. In the future, we will provide the option to change data types of the lookup profile properties.
Can the value of a Lookup table property (lookup profile property) be a list?
From a CSV upload, no. Using APIs, yes. You can send a property of type list using the APIs and the lookup profile property will be treated as a list.
Can Custom properties be created from Lookup profile properties?
Yes
Can I map a custom property to a lookup table?
Yes
Can each Mixpanel property (join key) only be mapped to one lookup table?
Yes. One Mixpanel property can only map to ONE lookup table.
Can multiple properties map to the same lookup table?
Yes. For example, first_trip_city_id and last_trip_city_id can both map to the City lookup table, but one Mixpanel property cannot map to multiple lookup tables (if first_trip_city_id is mapped to "City" lookup table, it can't also map to "Region" lookup table, before unmapping from the "City" lookup table.)
Can I use APIs to import Lookup tables?
YES! Here is a pointer to the HTTP docs on how to use the HTTP API to update lookup tables.
Can I import lookup tables using import pipelines?
Not at the moment.
Can I export lookup profile properties with Export APIs / Pipelines?
Not at the moment.
If I had mapped a property to a lookup table, and then I delete the lookup table, does the property mapping get unmapped?
Yes.
If I had mapped a property to a lookup table, and then I deleted the lookup table, what happens to saved reports that used these lookup profile properties (e.g. ProductID → Product sub category)?
We throw an error saying that "Product sub category" is no longer available for use, and that they should use another property to filter/aggregate/breakdown on.
Do I need to have paid for User profiles to avail of Lookup tables and Lookup profile properties?
No. There is no such requirement. You can upload lookup tables and map those to your event properties without requiring user profiles.
Concepts
- Lookup table is a collection of lookup profiles. An existing event or profile property can be mapped to a lookup table (for e.g.
Song name
can be the event property in Mixpanel, which is mapped to the Songs lookup table). When using APIs to programmatically update lookup tables, group_key uniquely identifies the lookup table as a whole. - Lookup profile is a single profile that is mapped to a single mixpanel event/profile property value. A lookup profile consists of a join key and one or more lookup profile properties. So for example, the mixpanel property value of Song name = "Hey Jude" can be mapped to a single lookup profile in the Songs lookup table which could have the unique identifier of "Hey Jude", and properties of "Genre: pop rock" and "hashtags: [Beatles]".
- Join key is the primary identifier of the lookup profile, and this is what is used to join against the mapped Mixpanel event property. In the above example, "Hey Jude" is a join key for that lookup profile. When using APIs to programmatically update lookup tables, group_id refers to the join key.
- Lookup profile property is a property on a single lookup profile. For example, "Genre" is a lookup profile property key, and "pop rock" is a lookup profile property value for the lookup profile of "Hey Jude".
Implementation
You can now programmatically update lookup tables - visit Mixpanel's Developer Documentation to learn more. Here is a link to the Python SDK support to update lookup tables.
Comments
Please sign in to leave a comment.