Query-time sampling allows you to query a subset of users and shorten the time it takes for a report to load results. Insights is the only report that currently supports sampling at query time.
This feature is available to enterprise customers with over 5 million monthly tracked users (MTUs).
Enable or Disable Query Time Sampling
Navigate to the report where you would like to enable or disable query-time sampling.
In that report, click the lightning bolt in the upper right corner of the query builder.
This enables sampling on the report, and the lightning bolt symbol turns blue to indicate it. The percentage of users included in the query calculations is shown in the top right corner of the query builder.
To turn off sampling, click the lightning bolt symbol in the upper right corner of the query builder again.
The lightning bolt symbol will turn grey to indicate that sampling is disabled.
Query Time Sampling Calculation and Presentation
Mixpanel will not sample, or drop, events at ingestion. Instead, Mixpanel will ingest all event data and sample at query time. This prevents the loss of important data, and therefore allows you to toggle sampling on and off depending on need.
For example, if you need to query iteratively while building a report, sampling greatly speeds up the process. Once you have built the query you want, you can turn off sampling and run it against the entire dataset.
The following occurs when sampling is enabled:
- Mixpanel selects a uniformly random sample of users on which to run the analysis.
- The sample size is 10% of the total population.
- The report is generated using that subset of users.
- Mixpanel up-samples the data by multiplying by the inverse of the sampling factor. This is done for functions such as totals and uniques. Functions that do not scale with users (average, min, max) will not be up-sampled.
- As a result, the numbers should closely approximate the results seen without sampling. The approximation improves as the number of users increases, particularly for customers with more than 5 million users.
- Mixpanel adds an annotation such as “Sampled on 10% of users” to reports.
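The steps above can be illustrated with a minimal Python sketch. It is not Mixpanel's implementation: the function names (sample_users, sampled_report), the (user_id, value) event shape, and the seeded random generator are all assumptions made for illustration. It only shows the general idea of sampling users, computing on the subset, and up-sampling totals and uniques by the inverse of the sampling factor while leaving average, min, and max unscaled.

```python
import random

SAMPLING_FACTOR = 0.10  # 10% of users, as described in the list above


def sample_users(user_ids, factor, seed=0):
    """Pick a uniformly random subset of users (hypothetical helper)."""
    rng = random.Random(seed)           # seeded only to make the sketch repeatable
    unique = sorted(set(user_ids))      # sample users, not individual events
    k = max(1, round(len(unique) * factor))
    return set(rng.sample(unique, k))


def sampled_report(events, factor=SAMPLING_FACTOR, seed=0):
    """Compute illustrative metrics on a user sample.

    events: list of (user_id, value) tuples (assumed shape, not a Mixpanel format).
    """
    kept_users = sample_users([u for u, _ in events], factor, seed)
    kept = [(u, v) for u, v in events if u in kept_users]
    values = [v for _, v in kept]
    scale = 1 / factor                  # inverse of the sampling factor

    return {
        # Metrics that scale with the number of users are up-sampled.
        "total": len(kept) * scale,
        "uniques": len(kept_users) * scale,
        # Metrics that do not scale with users are reported as computed.
        "average": sum(values) / len(values),
        "min": min(values),
        "max": max(values),
    }


if __name__ == "__main__":
    # 1,000 hypothetical users with 3 events each.
    events = [(f"user{i}", float(i % 5)) for i in range(1000) for _ in range(3)]
    print(sampled_report(events, seed=42))
```

In this toy dataset every user has the same number of events, so the up-sampled total and uniques match the full-data values. With real data, where activity varies from user to user, the up-sampled figures are estimates that get closer to the unsampled results as the user base grows.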
Saved Reports with Query Time Sampling
If you save a report that uses query-time sampling, a version of the report without sampling is saved. This ensures that dashboards and saved reports are computed on the entire dataset for full fidelity.