0

Not sure if this question is duplicated yet or not.

I have a simplified table below

User Interest
Jason Art
Jason Sport
Sam Sport
Sam Art
Steve Sport
Desmond Sport
Tania Art

Here's the result that I want to achieve

Interest Count
Art 2
Sport 2
Both 2

I Managed to make a subquery to achieve the value for the Both data by this query

SELECT COUNT(USER) FROM (
SELECT User, COUNT(DISTINCT Interest) as interest_type FROM table WHERE interest_type = 2)

But for the user that are exclusively have Interest in Art and in Sport it's not separated.

  • what database is being used? "sql" is NOT enough to identify which dialeect of that language to use – Paul Maxwell Apr 05 '22 at 06:05
  • Well, I'm actually using Big Query and this is just a simplified sample of the data that I have. I'll add some tags for that, thanks for reminding – Rama Ramadhan Apr 05 '22 at 06:08

2 Answers2

1

You could use conditional aggregation here:

WITH cte AS (
    SELECT User,
           CASE WHEN COUNT(CASE WHEN Interest = 'Art' THEN 1 END) > 0 AND
                     COUNT(CASE WHEN Interest = 'Sport' THEN 1 END) > 0
                THEN 'Both'
                WHEN COUNT(CASE WHEN Interest = 'Art' THEN 1 END) > 0
                THEN 'Art'
                ELSE 'Sport' END AS Interest
    FROM yourTable
    GROUP BY User
)

SELECT Interest, COUNT(*) AS Count
FROM cte
GROUP BY Interest;

On MySQL or BigQuery, we can shorten the above to:

WITH cte AS (
    SELECT User,
           CASE WHEN SUM(Interest = 'Art') > 0 AND SUM (Interest = 'Sport') > 0
                THEN 'Both'
                WHEN SUM(Interest = 'Art') > 0
                THEN 'Art'
                ELSE 'Sport' END AS Interest
    FROM yourTable
    GROUP BY User
)

SELECT Interest, COUNT(*) AS Count
FROM cte
GROUP BY Interest;
Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
  • Thanks! this works. Picking this as the right answer just because it's faster by a few milliseconds than @PaulMaxwell answer! (both have the same billable bytes on bigquery) – Rama Ramadhan Apr 05 '22 at 06:25
1

Assuming your database supports the over() clause:

select
      case when num_interests = 1 then interest else 'both' end as interest
    , count(distinct user) as "Count"
from (
    select
      interest
    , user
    , count(*) over(partition by user) as num_interests
    from yourTable
    ) d
group by
      case when num_interests = 1 then interest else 'both' end
Paul Maxwell
  • 33,002
  • 3
  • 32
  • 51
  • 1
    BigQuery does support `over()` clause :D and this works! But, picking @TimBiegeleisen for the right one just because it's faster by a few milliseconds. (both have the same billable bytes on BQ) – Rama Ramadhan Apr 05 '22 at 06:27