Problem description
I'm trying to get the information of how many active users I have in my app separated by the 2 or 3 latest versions of the app.
I've read some documentations and other stack questions but none of them was solving my problem (and some others had outdated solutions).
Examples of solutions I tried:
- https://support.google.com/firebase/answer/9037342?hl=en#zippy=%2Cin-this-article (N-day active users - This solution is probably the best, but even changing the dataset name correctly and removing the
_TABLE_SUFFIX
conditions it kept returning me a single column n_day_active_users_count = 0 ) - https://gist.github.com/sbrissenden/cab9bd3a043f1879ded605cba5005457 (this is not returning any values for me, didn't understand why)
- How can I get count of active Users from google analytics (this is not a good fit because the other part of my job is already done and generating charts on Data Studio, so using REST API would be harder to join my two solutions - one from BigQuery and other from REST API)
- Discrepancies on "active users metric" between Firebase Analytics dashboard and BigQuery export (this one uses outdated variables)
So, I started to write the solution out of my head, and this is what I get so far:
SELECT
user_pseudo_id,
app_info.version,
ROUND(COUNT(DISTINCT user_pseudo_id) OVER (PARTITION BY app_info.version) / SUM(COUNT(DISTINCT user_pseudo_id)) OVER (), 3) AS adoption
FROM `projet-table.events_*`
WHERE platform = 'ANDROID'
GROUP BY app_info.version, user_pseudo_id
ORDER BY app_info.version
Conclusions
- I'm not sure if my logic is correct, but I think I can use
user_pseudo_id
to calculate it, right? The general idea is: user_of_X_version/users_of_all_versions. (And the results are kinda close to the ones showing at Google Analytics web platform - I believe the difference is due to the date that I turned on the BigQuery integration. But.... I'd like some confirmation on that: if my logic is correct). - The biggest problem in my code now is that I cannot write it without grouping by
user_pseudo_id
(Because when I don't BigQuery says: "SELECT list expression references column user_pseudo_id which is neither grouped nor aggregated at [2:3]") and that's why I have duplicated rows in the query result - Also, about the first link of examples... Is there any possibility of a record with engagement_time_msec param with value < 0? If not, why is that condition in the where clause?