Data Analysis - How to Calculate and Analyze Retention Rate

Before analyzing retention rate, a product manager should first consider the product's business characteristics and growth pattern to determine whether its business objectives actually depend on retention.

User growth can be divided into the following three modes: sticky growth, viral growth, and paid growth.

Sticky growth: cultivating user loyalty so that users stay. Examples include content products such as Zhihu and Luo Zhusi, and To B products such as Baidu's open platform, where maintaining customer relationships and retaining users are central to the business model. The business goals of this growth model center on user retention and renewal rates.

Viral growth: spreading spontaneously through users' social networks in a short period of time. This growth model is common in WeChat social products and in To C transaction products such as Pinduoduo. The initial business goals of these products are the propagation cycle and the user growth rate.

Paid growth: converting free users to paying users and keeping them paying to increase lifetime value (LTV), or charging for the product or service directly. Examples include tools such as impression notes and ProcessOn, and 3C products such as cell phones and cameras. The business goals of such products are the payment conversion rate and the repurchase rate.

Retention statistics are defined by combining the following three dimensions:

(1) New, Active

New retention mainly analyzes the loyalty of new users to the product. If features are added or optimized during the period analyzed, the change in new retention is a good measure of their value.

Active retention, the retention of active users, is an important way to monitor product quality and stickiness and to assess the quality of a channel.

Generally, active retention > new retention, because active users are more loyal than new users.
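The difference between new and active retention can be illustrated with a minimal sketch. The activity log, the device IDs, and the `retention` helper below are all hypothetical, and "new" is taken here to mean a device first seen on the base day.

```python
def retention(base_users, later_users):
    """Share of base_users that also appear in later_users."""
    if not base_users:
        return 0.0
    return len(base_users & later_users) / len(base_users)


# Hypothetical activity log: day number -> set of device IDs active that day.
active_by_day = {
    1: {"a", "b", "c", "d"},
    2: {"a", "b", "e", "f"},
    3: {"a", "b", "e"},
}

# Devices active on day 2 but not seen on day 1 are treated as day 2's new devices.
new_on_day2 = active_by_day[2] - active_by_day[1]
active_on_day2 = active_by_day[2]

new_retention = retention(new_on_day2, active_by_day[3])
active_retention = retention(active_on_day2, active_by_day[3])

print(f"new retention:    {new_retention:.0%}")
print(f"active retention: {active_retention:.0%}")
```

In this made-up data the active base includes returning devices "a" and "b", which is why its retention comes out higher than that of the new devices alone.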

(2) Device, Account

The number of devices and the number of accounts are generally the same, but there are two special cases. In the first, the same account logs in on different devices, so the number of devices > the number of accounts; for example, the same Taobao account used on different cell phones. In the second, different accounts log in on the same device, so the number of accounts > the number of devices; for example, two WeChat accounts used on the same cell phone.
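The two special cases can be checked with a short sketch. It assumes login records are available as (device, account) pairs; the record format and the `count_entities` helper are illustrative, not part of any particular analytics tool.

```python
def count_entities(logins):
    """Return (number of distinct devices, number of distinct accounts)."""
    devices = {device for device, _ in logins}
    accounts = {account for _, account in logins}
    return len(devices), len(accounts)


# Case 1: one account used on two phones -> devices > accounts.
same_account = [("phone-1", "alice"), ("phone-2", "alice")]

# Case 2: two accounts used on one phone -> accounts > devices.
same_device = [("phone-3", "bob"), ("phone-3", "carol")]

print(count_entities(same_account))
print(count_entities(same_device))
```

Whichever unit is chosen as the base of the retention rate should also be the unit counted in the numerator.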

(3) Nth day, N days

The choice between Nth-day and N-day retention depends largely on the purpose of the analysis and the nature of the product. Products that are used infrequently over a long cycle, such as hotel and travel products, can use N-day retention (be sure to deduplicate!), but N-day retention tends to come out very high, so its reference value is limited. N-day retention is also mainly a way to look at churn, that is, how many devices/accounts have been active at least once within a period of time. As N increases, however, N-day retention keeps growing: the retained count gets larger as time passes, which runs against common sense. In practice, therefore, N-day retention is rarely used.

As for the time period, the usual points of interest are next-day, 3-day, 7-day, and 30-day retention; the specific choice depends on how users habitually use the product, the nature of the product, and the purpose of the analysis.

Combining the three dimensions gives eight ways to compute retention. The examples here use login as the qualifying behavior; each product should define for itself exactly what a user must do to count as retained.
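The eight combinations follow from taking one option from each of the three dimensions (2 x 2 x 2). A small sketch that enumerates them, with labels chosen here only for illustration:

```python
from itertools import product

user_types = ["new", "active"]
id_types = ["device", "account"]
windows = ["Nth-day", "N-day"]

# One retention definition per combination of the three dimensions.
retention_definitions = [
    f"{window} {user} {ident} retention"
    for user, ident, window in product(user_types, id_types, windows)
]

for name in retention_definitions:
    print(name)
```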

Note: when computing N-day retention, be sure to deduplicate; otherwise the retention rate may exceed 100%.
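A minimal sketch of why deduplication matters, assuming a hypothetical event log of (device, day) pairs in which one device can return on several days:

```python
# Hypothetical event log: (device_id, day) pairs.
events = [
    ("a", 1), ("b", 1),              # base day: two active devices
    ("a", 2), ("a", 3), ("a", 5),    # device "a" returns on three separate days
    ("b", 4),                        # device "b" returns once
]

base = {dev for dev, day in events if day == 1}

# Every return event from a base device between day 2 and day 7.
returns = [dev for dev, day in events if 2 <= day <= 7 and dev in base]

naive_rate = len(returns) / len(base)        # counts events, not devices
dedup_rate = len(set(returns)) / len(base)   # counts each device once

print(f"without deduplication: {naive_rate:.0%}")
print(f"with deduplication:    {dedup_rate:.0%}")
```

Counting return events instead of distinct devices is what pushes the naive figure above 100%.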

Taking the 7-day active device retention rate as an example, retention can be calculated in three ways:

(1) 7th-day retention rate = (number of devices active on the 4th that are still active on the 11th) / (number of devices active on the 4th) × 100%. This is the calculation marked by the black line.

(2) 7th-day retention rate = (number of devices active on the 4th that are still active on the 10th) / (number of devices active on the 4th) × 100%, with the day of addition counted as Day 0 and the next day as Day 1. This is the calculation marked by the red line. The reason for this method is that usage of some products follows a time cycle; office products, for example, see low usage on weekends, and this method avoids computing a retention rate that is lower than normal.

(3) 7-day retention rate = (number of devices active on day 1 that are active again at least once from day 2 to day 7, each counted once) / (number of devices active on day 1) × 100%.
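The three calculations can be sketched as follows. The activity data is invented, the function names are hypothetical, and the first two methods are expressed through a single `offset` parameter (the number of calendar days between the base day and the day checked), so that the 4th-to-11th and 4th-to-10th variants differ only in that value.

```python
def nth_day_retention(active_by_day, base_day, offset):
    """Share of devices active on base_day that are active on base_day + offset."""
    base = active_by_day.get(base_day, set())
    if not base:
        return 0.0
    later = active_by_day.get(base_day + offset, set())
    return len(base & later) / len(base)


def within_n_days_retention(active_by_day, base_day, n):
    """Share of base_day devices active at least once on days 2..n,
    counting base_day as day 1 and each device only once."""
    base = active_by_day.get(base_day, set())
    if not base:
        return 0.0
    returned = set()
    for day in range(base_day + 1, base_day + n):
        returned |= active_by_day.get(day, set()) & base
    return len(returned) / len(base)


# Hypothetical data: day of month -> set of active device IDs.
active_by_day = {
    4: {"a", "b", "c", "d"},
    5: {"a"},
    8: {"b"},
    10: {"a", "c", "d"},
    11: {"a", "b"},
}

method_1 = nth_day_retention(active_by_day, base_day=4, offset=7)   # 4th -> 11th
method_2 = nth_day_retention(active_by_day, base_day=4, offset=6)   # 4th -> 10th
method_3 = within_n_days_retention(active_by_day, base_day=4, n=7)  # 5th..10th

print(method_1, method_2, method_3)
```

With this sample data the third method gives the highest figure, because any single return inside the window counts, which matches the earlier remark that N-day retention tends to come out very high.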

More date options are also available

If overall retention shows an anomaly, the analysis can be broken down further by product channel, user source, user group, individual feature, and so on. Product, operations, technology, and marketing all affect retention, so it needs to be analyzed from multiple angles; that is not expanded on here.

Such comparisons can be made from many perspectives, including but not limited to user groups, user sources, user behavior, product channels, and product features. A more refined analysis can also filter on a specific user behavior and then compare retention.

For example, if users of a feature retain at a higher rate than users overall, the feature is of greater value to the product, and the product can be adjusted so that more users use it, improving retention. Or, if comparing male and female user groups shows that male users retain less well than female users, you can analyze the reasons and optimize the product strategy, or adjust the promotion channels to attract more female users.
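A segment comparison of this kind can be sketched as below. The user records, the segment labels, and the `retention_by_segment` helper are hypothetical; in practice the `retained` flag would come from whichever retention definition the product has chosen.

```python
from collections import defaultdict


def retention_by_segment(users):
    """Return {segment: share of users in that segment flagged as retained}."""
    totals = defaultdict(int)
    kept = defaultdict(int)
    for user in users:
        totals[user["segment"]] += 1
        kept[user["segment"]] += 1 if user["retained"] else 0
    return {segment: kept[segment] / totals[segment] for segment in totals}


# Hypothetical users, split by whether they used a given feature.
users = [
    {"segment": "used_feature", "retained": True},
    {"segment": "used_feature", "retained": True},
    {"segment": "used_feature", "retained": True},
    {"segment": "used_feature", "retained": False},
    {"segment": "did_not_use", "retained": True},
    {"segment": "did_not_use", "retained": False},
    {"segment": "did_not_use", "retained": False},
    {"segment": "did_not_use", "retained": False},
]

rates = retention_by_segment(users)
for segment, rate in rates.items():
    print(f"{segment}: {rate:.0%}")
```

A gap like this shows correlation only; users who find a feature may already be the more engaged ones, so the reasons still need to be analyzed before changing the product.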

The retention graph (also known as the pistol graph) shows the overall retention of newly activated users. From its data you can see whether retention is rising as the product is optimized, or whether the product's stickiness is declining over time.

The retention curve of new users has two characteristics: a rapid decline in the early period, followed after some time by a plateau.
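One way to locate the plateau on such a curve is to look for the first day on which the day-over-day drop becomes small. The sketch below assumes a made-up curve and an arbitrary tolerance of 2 percentage points; both are illustrative choices, not standard values.

```python
def plateau_start(curve, tolerance=0.02):
    """Return the 1-based day on which the drop from the previous day first
    falls to or below tolerance, or None if the curve never flattens."""
    for i in range(1, len(curve)):
        if curve[i - 1] - curve[i] <= tolerance:
            return i + 1
    return None


# Hypothetical new-user retention curve: index 0 is day 1, index 1 is day 2, ...
curve = [0.50, 0.35, 0.28, 0.27, 0.27]

start = plateau_start(curve)
print("plateau starts on day:", start)
print("plateau level:", curve[start - 1] if start else None)
```

The day returned corresponds to how quickly users settle, and the level at that point corresponds to how many of them settle, which are the two properties the curve is described by.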

To improve the retention curve, start from these two characteristics: shorten the time it takes users to reach the plateau (activate them sooner) and get more users to reach it (activate more of them), or identify product problems, optimize the experience, and win back lost users. Ultimately, the product experience needs to delight users so that they gradually form a habit and come to see the product as indispensable.