I am running a query that returns the count distinct of active users per different products grouped by day and country. I also added two columns 'previous_day_active_users' and 'returning_users' which return the number of users active in the previous day and the number of users active both in the previous and the current day. I want to do this in order to calculate the retention rate (returning_users / previous_day_active_users). The problem is the numbers don't match. In addition, 'previous_day_active_users' and 'returning_users' always match, which doesn't make sense. I don't understand what I am doing wrong.
Query:
WITH active_users AS (
SELECT
date_trunc('day'::text, customer_kpis_hourly.time_eet) as time_eet,
customer_kpis_hourly.market_code,
customer_kpis_hourly.user_id,
customer_kpis_hourly.active,
LAG(customer_kpis_hourly.active) OVER (PARTITION BY user_id ORDER BY time_eet) AS prev_active,
customer_kpis_hourly.active_casino,
customer_kpis_hourly.active_virtual_sports,
customer_kpis_hourly.active_poker,
customer_kpis_hourly.active_sportsbook
FROM delivery.customer_kpis_hourly
WHERE customer_kpis_hourly.time_eet >= '2021-01-01'
AND customer_kpis_hourly.time_eet < '2021-01-05'
AND customer_kpis_hourly.active
),
previous_day_active_users AS (
-- Calculate previous day's active users
SELECT
user_id,
time_eet - INTERVAL '1 day' AS previous_day
FROM active_users
WHERE prev_active
)
SELECT
'D'::text AS time_gran,
date_trunc('day'::text, active_users.time_eet) AS time_eet,
active_users.market_code,
count(DISTINCT active_users.user_id) FILTER (WHERE active_users.active) AS active_user,
count(DISTINCT active_users.user_id) FILTER (WHERE active_users.active_casino) AS active_casino,
count(DISTINCT active_users.user_id) FILTER (WHERE active_users.active_virtual_sports) AS active_vs,
count(DISTINCT active_users.user_id) FILTER (WHERE active_users.active_poker) AS active_poker,
count(DISTINCT active_users.user_id) FILTER (WHERE active_users.active_sportsbook) AS active_sb,
count(DISTINCT previous_day_active_users.user_id) as previous_day_active_user,
count(DISTINCT CASE WHEN previous_day_active_users.user_id IS NOT NULL THEN active_users.user_id END) AS returning_users
FROM
active_users
LEFT JOIN
previous_day_active_users
ON active_users.user_id = previous_day_active_users.user_id
-- AND date_trunc('day'::text, active_users.time_eet) = previous_day_active_users.previous_day
GROUP BY
'D'::text, date_trunc('day'::text, active_users.time_eet), active_users.market_code