2

I have a table in SQL (using postgres) with two columns: videoid , classification

videoid | classification
1       |20
1       |24
1       |24
1       |24
1       |24
2       |20
2       |20
2       |20
2       |20
2       |24
2       |24
3       |24
3       |24

I am trying to retrieve all videoid's where the most common classification is 24. (The answer should only be videoid 1 and 3) When i use the query: (found from How to select most frequent value in a column per each id group?)

    SELECT DISTINCT ON (videoid) videoid, most_frequent_species FROM (
  SELECT videoid, classification AS most_frequent_species, count(*) as _count
FROM userviewed
GROUP BY videoid, classification) a
ORDER BY videoid, _count DESC
;

I retrieve the result set:

videoid     | most_frequent_species
    1       |24
    2       |20
    3       |24

But when i try to add a WHERE CLAUSE:

WHERE classification = 24

i get:

videoid| most_frequent_species
1      |24
2      |24
3      |24

How do I create a query that would only retrieve

videoid | most_frequent_species
1       |24
3       |24
cornsnack
  • 23
  • 4

2 Answers2

2

You can make use of the having clause (essentially a post-group where clause), and the mode function:

select
    videoid
from
    userviewed
group by
    videoid
having
    mode() within group (order by classification) = 24
Scoots
  • 3,048
  • 2
  • 21
  • 33
1

One method uses distinct on and then filtering:

select *
from (select distinct on (videoid) videoid, category, count(*)
      from userviewed
      group by videoid, category
      order by videoid, count(*) desc
     ) vc
where category = 24;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786