0

I am running this query where I calculate :

  • the number of active users in the previous period
  • the number of returning users (users active in the previous period and in the current period)
  • retention rate ( returning_users/previous_period_active_user)

The problem is that 'previous_period_active_user' and 'returning_user' return the correct results, but the 'retention_rate' always returns zero. I don't understand what I'm doing wrong. I am aware of the division by zero error in PostgreSQL and that's why I am using COALESCE and NULLIF functions.

Query:

SELECT
    au.time_gran,
    au.time,
    au.country,
    count(DISTINCT au.user_id) FILTER (WHERE au.active_1_period_before) AS previous_period_active_user,
    count(DISTINCT au.user_id) FILTER (WHERE au.active AND au.active_1_period_before) AS returning_users,
    COALESCE(
        count(DISTINCT CASE WHEN au.active AND au.active_1_period_before THEN au.user_id END) /
        NULLIF(count(DISTINCT au.user_id) FILTER (WHERE au.active_1_period_before), 0),
        0
    ) AS retention_rate
FROM active_users au
Steven
  • 19
  • 3

1 Answers1

1

When you use only integers in a division, you will get integer division.

count() return integer, so you will need to cast them to floats as part of the calculation (multiply by 1.0) :

SELECT
    au.time_gran,
    au.time,
    au.country,
    count(DISTINCT au.user_id) FILTER (WHERE au.active_1_period_before) AS previous_period_active_user,
    count(DISTINCT au.user_id) FILTER (WHERE au.active AND au.active_1_period_before) AS returning_users,
    COALESCE(
        count(DISTINCT CASE WHEN au.active AND au.active_1_period_before THEN au.user_id END) * 1.0 /
        NULLIF(count(DISTINCT au.user_id) FILTER (WHERE au.active_1_period_before) * 1.0, 0),
        0
    ) AS retention_rate
FROM active_users au
group by au.time_gran, au.time, au.country
SelVazi
  • 10,028
  • 2
  • 13
  • 29