-3

DB is BigQuery, but it's very similar to most DB's, and I would imagine this answer will be universal.

The Table

user_id | date | category_id 
----------------------------
1       | xx   | 10
2       | xx   | 10
2       | xx   | 10
3       | xx   | 10
3       | xx   | 10
3       | xx   | 10
1       | xx   | 11
2       | xx   | 12

I would like to get a count of distinct user_id's per category_id

So the answer would produce :

category_id | distinct_user_count
---------------------------------
10          | 3
11          | 1
12          | 1

My apologies ahead of time if this has been asked before.

I should add that both of these SQL queries return the same result :

SELECT 
  category_id,
  count(distinct user_id)
FROM t
group by category_id
SELECT 
  category_id,
  count(user_id) over(partition by category_id)
FROM t
group by category_id, user_id

Produces unexpected results :

category_id | distinct_user_count
---------------------------------
10          | 1
11          | 1
12          | 1
Trip
  • 26,756
  • 46
  • 158
  • 277

3 Answers3

2

Here's a simple solution.

select category_id
      ,count(distinct user_id) as distinct_user_count
from t
group by category_id
category_id distinct_user_count
10 3
11 1
12 1

Fiddle

DannySlor
  • 4,574
  • 1
  • 3
  • 11
1

You can use this query

SELECT 
    category_id,
    COUNT(DISTINCT user_id) AS distinct_user_count
FROM t
GROUP BY category_id
Alexey
  • 2,439
  • 1
  • 11
  • 15
  • Maybe its a BigQuery issue as this was my first answer as well. I updated my answer above to show what results this would bring. – Trip Aug 16 '22 at 14:33
  • Then you can try `SELECT category_id, EXACT_COUNT_DISTINCT(user_id) AS distinct_user_count FROM t GROUP BY category_id` – Alexey Aug 16 '22 at 14:58
  • Unfortunately `EXACT_COUNT_DISTINCT` is not a BigQuery method – Trip Aug 16 '22 at 14:58
0

Try this:

select  distinct count(user_id) over(partition by category_id) as "users_per_id", 
  category_id
from the_table
group by category_id, user_id
lsanchezo
  • 134
  • 8
  • Was my first guess as well, but I get the results I added to my question above.. perhaps it's a BigQuery thiing? – Trip Aug 16 '22 at 14:35
  • maybe, look at this [dbfiddle](https://dbfiddle.uk/?rdbms=postgres_14&fiddle=513f37ffa42aef72d0c658c1286458e0). it works well! – lsanchezo Aug 16 '22 at 14:36
  • Unfortunately, BigQuery for some torturous reason doesn't perform the same as Postgresql. But I agree with you, this should definitely have worked. – Trip Aug 16 '22 at 14:41