MySQL : getting top 100 records for each category

Question

I have a table called: consultant_project_score which has the following fields id, project_id, final_score. I want to generate a query which displays top 100 final_score for each project_id. I am using MySQL workbench 6.3 and I am not able to use any create/insert statements. Is there any way to achieve this?

Possible duplicate of [Using LIMIT within GROUP BY to get N results per group?](http://stackoverflow.com/questions/2129693/using-limit-within-group-by-to-get-n-results-per-group) — JimmyB, Apr 26 '16 at 09:51

Mihai · Answer 1 · 2016-04-26T10:08:28.887

1

Not tested,but it should work.The trick is to generate a row number for each project

select id, project_id, final_score
from 
(select id, project_id, final_score,
      @rn:= if(@project_id= project_id, @rn+ 1, 1) as rn,
      @project_id:= project_id as pi
  from t,(SELECT @rn:=0) b
  order by project_id, final_score desc) x 
where x.rn<= 100;

edited Apr 26 '16 at 10:08

answered Apr 26 '16 at 09:42

Mihai

26,325
7
66
81

I tried your query, I assume that 't' stands for consultant_project_score. Anyway it generated around 9milion records while I only have around 2000 projects in my DB, so it should have been around 200.000 records – Marc Zaharescu Apr 26 '16 at 09:53

Utsav · Answer 2 · 2016-04-26T10:00:23.847

1

This will work for you. The inner query will generate rank group by project_id and then outer query will filter the top 100 results.

Note: It will not work if you have similar combination of project_id and score. Created this fiddle to get 2 ranks.

http://sqlfiddle.com/#!9/9e94d5/1

SELECT id1,
       project_id,
       final_score
FROM
  (SELECT max(t1.id) AS id1,
          t1.project_id,
          t1.final_score,
          count(*) AS rnk
   FROM consultant_project_score t1
   INNER JOIN consultant_project_score t2 ON t1.project_id=t2.project_id
   WHERE t1.final_score<=t2.final_score
   GROUP BY t1.project_id,
            t1.final_score) t
WHERE rnk<=100

edited Apr 26 '16 at 10:00

answered Apr 26 '16 at 09:53

Utsav

7,914
2
17
38

I'm trying out your query. It's been running for about 10 min now. – Marc Zaharescu Apr 26 '16 at 10:03
The query will be slow as we are doing self join with group by. In MYSql, there are no `rank over partition by` else the query would be faster. You can try removing `max(t1.id)` from select clause. Also is possible, index `project_id` and `final_score`. – Utsav Apr 26 '16 at 12:20

score 0 · Answer 3 · answered Apr 26 '16 at 10:13

This one is without Group BY and uses the concept of self join . This should give you correct output

SQL

   SELECT cps2.id, cps2.project_id,  cps2.final_score FROM 
                    (  SELECT DISTINCT project_id FROM  consultant_project_score ) cps1 
                     JOIN  consultant_project_score cps2
                                            ON cps2.id  IN ( SELECT cps3.id FROM consultant_project_score cps3 
                            WHERE cps3.project_id = cps1.projectid  
                             ORDER BY cps3.final_score  DESC LIMIT 100) 
                    ORDER BY cps2.project_id ,cps2.final_score DESC

Error Code: 1235. This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery' 0.000 sec — Marc Zaharescu, Apr 26 '16 at 10:23

MySQL : getting top 100 records for each category

3 Answers3