I can’t find a solution to the problem anywhere, even the great ZhPT can’t cope yet!) Maybe someone here has encountered a similar problem, I would be very grateful!
Data from Google Analytics 4 (GA4) is fed into bigquery. There, an SQL query is created for the retention table, which looks like below. The problem is that the number of users on day 0 is normal, but on day > 0 it is inadequately large. This is known, because GA4 itself creates a retention table with which you can compare, and also when compared with the table in appmetrica. And also because on 2 other applications this sign shows everything correctly and retention coincides with GA4 and appmetrica.
SQL-query:
WITH
table1 AS (
SELECT
user_pseudo_id AS user_pseudo_id,
MIN(event_date) AS birthday,
FROM
table
WHERE
event_name = 'first_open'
GROUP BY user_pseudo_id
)
, table2 AS (
SELECT
r.birthday AS birthday,
l.event_date AS event_date,
DATE_DIFF(l.event_date, r.birthday, DAY) AS diff,
l.user_pseudo_id AS user_pseudo_id
FROM
table AS l
JOIN
table1 AS r
ON
l.user_pseudo_id = r.user_pseudo_id
)
, table3 AS (
SELECT
birthday,
diff,
COUNT(DISTINCT user_pseudo_id) as count_users,
FIRST_VALUE(CASE WHEN diff = 0 THEN COUNT(DISTINCT user_pseudo_id) END) OVER (PARTITION BY birthday ORDER BY diff ASC) AS users_day0
FROM table2
GROUP BY
birthday,
diff
)
SELECT
birthday,
diff,
count_users,
users_day0,
count_users/users_day0 AS retention
FROM table3
I have used this SQL-query on other apps and it works great and shows correct results which match other dashboards. It seems to me that the problem arises precisely at the moment when users are considered with a difference in dates, because day 0 shows good values.