Retention FAQ

How is Retention calculated?

As of 5/25/2021, we've improved segmented retention calculations so that they are now intervalized averages. Segmented retention queries are therefore calculated with the same method Mixpanel uses for unsegmented retention queries.

When you create an unsegmented retention query (i.e. a retention query that is not segmented by a property or cohort), Mixpanel will automatically intervalize the retention calculation.

In other words, Mixpanel calculates retention separately for each cohort, where a cohort is defined by the selected birth interval unit (the day, week, or month in which the user performed the A action), and then summarizes the cohorts into one line by taking the average of all complete buckets. You can see the retention of each individual interval by expanding the Average Retention column:

intervalized_retention.png
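As a rough illustration of what "intervalizing" means, the sketch below groups users into daily birth cohorts and counts how many returned a given number of days later. The event lists, the function name `daily_cohorts`, and the rule that a user counts as retained on day N only if they performed B exactly N days after birth are all assumptions made for this example; Mixpanel's actual retention criteria are configurable and may differ.

```python
from collections import defaultdict
from datetime import date

# Hypothetical (user_id, day) pairs for the birth action A and the return action B.
a_events = [("u1", date(2021, 3, 1)), ("u2", date(2021, 3, 1)), ("u3", date(2021, 3, 2))]
b_events = [("u1", date(2021, 3, 2)), ("u3", date(2021, 3, 3)), ("u3", date(2021, 3, 4))]

def daily_cohorts(a_events, b_events):
    """Group users by birth day and count distinct returners per day offset."""
    # Birth day = the first day each user performed A.
    birth = {}
    for user, day in a_events:
        if user not in birth or day < birth[user]:
            birth[user] = day

    # Cohort size = number of users born on each day.
    sizes = defaultdict(int)
    for day in birth.values():
        sizes[day] += 1

    # retained[birth_day][offset] = users who did B `offset` days after birth.
    retained = defaultdict(lambda: defaultdict(set))
    for user, day in b_events:
        if user in birth:
            offset = (day - birth[user]).days
            if offset >= 1:
                retained[birth[user]][offset].add(user)

    return {
        d: {"size": sizes[d], "retained": {o: len(u) for o, u in retained[d].items()}}
        for d in sizes
    }

cohorts = daily_cohorts(a_events, b_events)
for day, info in sorted(cohorts.items()):
    print(day, info)
```

Each entry in `cohorts` corresponds to one row you would see when expanding the Average Retention column: one birth interval, its size, and its retained counts per bucket.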

Segmented queries are now calculated with that same intervalized average method, but for each particular property value or cohort segment. You can now expand out any particular segment's name in a retention query to see the retention of each day, week, or month interval within that segment:

segemented_intervalized_retention.png

How is this different than before?

Prior to 5/25/2021, we calculated segmented retention in a non-intervalized manner. Instead of calculating each individual day/week/month cohort separately and averaging them, we treated the entire time period as a single cohort, grouping all users who performed the events with a given property value into a single row.

segmented_nonintervalized.png

Why make the change?

We wanted to add intervalized averages to segmented retention queries for three key advantages:

  1. Users in incomplete periods are no longer included: The non-intervalized method did not give all users an equal chance to qualify for the later retention buckets. Users who entered the query towards the end of your date range had not yet had enough time to be retained in the later buckets, which artificially lowered the retention shown for those buckets. With intervalized retention, Mixpanel counts only completed time periods in the Average Retention calculation, so all users have the same opportunity to be retained.
  2. Clarity through consistency: Segmented and unsegmented queries used to be calculated differently: unsegmented queries were intervalized, while segmented queries were not. This inconsistency made Retention reports harder to read. Users had to switch context, and it took an experienced user to understand what they were looking at once they added a breakdown property. Using one method for both sets consistent expectations.
  3. More granular analysis: Users can now see how the individual birth interval cohorts are performing within a particular segment. This makes it possible to dive deeper into particular cohorts of interest.
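To make the first advantage concrete, the sketch below compares a pooled (non-intervalized) rate with a rate computed only from completed buckets. The numbers are illustrative, and the function names `pooled_rate` and `intervalized_rate` are hypothetical; this is a simplified model of the two approaches described above, not Mixpanel's implementation.

```python
# Illustrative cohorts: (birth day, cohort size, day 1 retained, bucket complete?)
cohorts = [
    ("March 1", 1000, 500, True),
    ("March 2", 500, 400, True),
    ("March 3", 700, 10, False),  # users have not yet had a full day to return
]

def pooled_rate(cohorts):
    """Non-intervalized: treat the whole date range as one single cohort."""
    return sum(c[2] for c in cohorts) / sum(c[1] for c in cohorts)

def intervalized_rate(cohorts):
    """Intervalized: only cohorts whose bucket is complete contribute.

    Dividing total retained by total size over the complete cohorts is the
    same as averaging their rates weighted by cohort size.
    """
    done = [c for c in cohorts if c[3]]
    return sum(c[2] for c in done) / sum(c[1] for c in done)

print(f"pooled:       {pooled_rate(cohorts):.1%}")        # 41.4%
print(f"intervalized: {intervalized_rate(cohorts):.1%}")  # 60.0%
```

In this example the incomplete March 3 cohort drags the pooled rate down to about 41%, even though the two cohorts that had a full day to return were retained at 60% overall.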

How is the "Average" row for Retention reports calculated?

Mixpanel calculates the "Average" row values by taking the average of all the completed buckets, with each bucket weighted by the number of users who entered its cohort.

Let's walk through an example using this sample data:

Date       Total Profile(s)   Day 1 Retained Users
March 1    1000               500
March 2    500                400
March 3    700                10 (incomplete)

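From our sample data, the average value for day 1 retained users is calculated as follows. The March 3 bucket is incomplete, so it is excluded, and the weights use a total of 1000 + 500 = 1500 users: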

When examining a Retention report in % / Retention Rate view:

     C_March 1 = 500 / 1000 (conversion rate for March 1)

     C_March 2 = 400 / 500 (conversion rate for March 2)

     W_March 1 = 1000 / 1500 (weight for March 1)

     W_March 2 = 500 / 1500 (weight for March 2)

Weighted Average = C_March 1 * W_March 1 + C_March 2 * W_March 2

In our sample data, this computes to: ((0.5 * (1000 / 1500)) + (0.8 * (500 / 1500))) * 100 = 60%

When examining a Retention report in # / Absolute view:

     C_March 1 = 500 (count retained for March 1)

     C_March 2 = 400 (count retained for March 2)

     W_March 1 = 1000 / 1500 (weight for March 1)

     W_March 2 = 500 / 1500 (weight for March 2)

Weighted Average = C_March 1 * W_March 1 + C_March 2 * W_March 2

In our sample data, this computes to: (500 * (1000 / 1500)) + (400 * (500 / 1500)) = 466.67 (rounded)
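The two calculations can be checked with a short script. This is a minimal sketch using the sample data from this section; the function name `weighted_average` and the `as_rate` flag are hypothetical names chosen for the example.

```python
# Sample rows: (date, cohort size, day 1 retained users, bucket complete?)
rows = [
    ("March 1", 1000, 500, True),
    ("March 2", 500, 400, True),
    ("March 3", 700, 10, False),
]

def weighted_average(rows, as_rate):
    """Average the completed buckets, weighting each by its cohort size."""
    complete = [r for r in rows if r[3]]
    total = sum(size for _, size, _, _ in complete)
    result = 0.0
    for _, size, retained, _ in complete:
        # Rate view averages retained / size; absolute view averages the raw count.
        value = retained / size if as_rate else retained
        result += value * (size / total)
    return result

rate = weighted_average(rows, as_rate=True)    # % / Retention Rate view
count = weighted_average(rows, as_rate=False)  # # / Absolute view
print(f"{rate:.1%}")   # 60.0%
print(f"{count:.2f}")  # 466.67
```

Because March 3 is marked incomplete, it contributes neither to the values nor to the weights.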

What does the Metric view in the Retention Trends report show?

The Metric view in the Retention Trends report shows the last complete bucket for that retention trend. Metric charts are often used to look at the most up-to-date data value, which in this case is the most recently completed cell. For days where there isn't yet a completed cell, Mixpanel uses the closest completed cell for that day.
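One way to picture this selection is as a scan backwards through the trend for the most recent completed point. The sketch below is only an interpretation of the behavior described above; the `trend` data and the function name `metric_value` are hypothetical.

```python
# Hypothetical trend points: (birth date, retention rate, bucket complete?)
trend = [
    ("2021-03-01", 0.50, True),
    ("2021-03-02", 0.80, True),
    ("2021-03-03", 0.014, False),  # still in progress, so not shown as the metric
]

def metric_value(trend):
    """Return the most recent (date, rate) whose bucket is complete, or None."""
    for label, rate, complete in reversed(trend):
        if complete:
            return label, rate
    return None

print(metric_value(trend))  # ('2021-03-02', 0.8)
```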
