4

This is raw data, and want to rank them according to score (count(tbl_1.id)).

[tbl_1]
===========
id | name  
===========
1  | peter
2  | jane
1  | peter
2  | jane
3  | harry
3  | harry
3  | harry
3  | harry
4  | ron

So make temporary table (tbl_2) to count score for each id.

SELECT id, name, COUNT( id ) AS score
FROM tbl_1
GROUP BY id
ORDER BY score DESC;
LIMIT 0, 30;

Then result is;

[tbl_2]
===================
id | name  | score
===================
3  | harry | 4
1  | peter | 2
2  | jane  | 2
4  | ron   | 1

Then query this;

SELECT v1.id, v1.name, v1.score, COUNT( v2.score ) AS rank
FROM votes v1
JOIN votes v2 ON v1.score < v2.score
OR (
v1.score = v2.score
AND v1.id = v2.id
)
GROUP BY v1.id, v1.score
ORDER BY v1.rank ASC, v1.id ASC
LIMIT 0, 30;

Then result is;

==========================
id | name  | score | rank
==========================
3  | harry | 4     |  1
1  | peter | 2     |  2
2  | jane  | 2     |  2
4  | ron   | 1     |  4

Is it possible to do this in one transaction (query) nicely?

chloe
  • 119
  • 11

1 Answers1

2

Yes, it's possible to do this in a single query. But it's a total hairball in MySQL, because MySQL doesn't have a simple ROWNUM operation, and you need one for the rank computation.

Here's your vote query with the rank shown. The @ranka variable is used to number the rows.

SELECT @ranka:=@ranka+1 AS rank, id, name, score
  FROM
  (
     SELECT id, 
            name, 
            COUNT( id ) AS score
       FROM tbl_1
       GROUP BY id
       ORDER BY score DESC, id
   ) votes,
  (SELECT @ranka:=0) r

As you have discovered already, you need to self-join this thing to get a proper ranking (which handles ties correctly). So, if you take your query and replace the two references to your votes table each with their own version of this subquery, you get what you need.

SELECT v1.id,
       v1.name,
       v1.score,
       COUNT( v2.score ) AS rank
  FROM (
         SELECT @ranka:=@ranka+1 AS rank,
                id,
                name,
                score
           FROM
              (
                SELECT id,
                       name,
                       COUNT( id ) AS score
                  FROM tbl_1
                 GROUP BY id
                 ORDER BY score DESC, name
               ) votes,
         (SELECT @ranka:=0) r) v1
  JOIN (
         SELECT @rankb:=@rankb+1 AS rank, 
                id, 
                name, 
                score
           FROM
              (
                SELECT id,
                       name,
                       COUNT( id ) AS score
                  FROM tbl_1
                  GROUP BY id
                  ORDER BY score DESC, name
              ) votes,
        (SELECT @rankb:=0) r) v2 
    ON (v1.score < v2.score) OR 
       (v1.score = v2.score  AND v1.id = v2.id)
 GROUP BY v1.id, v1.score
 ORDER BY v1.rank ASC, v1.id ASC
 LIMIT 0, 30;

Told you it's a hairball. Notice that you need different @ranka and @rankb variables in the two versions of the subquery that you're self-joining, to make the row numbering work correctly: these variables have connection scope, not subquery scope, in MySQL.

http://sqlfiddle.com/#!2/c5350/1/0 shows this working.

Edit: It's far easier to do this using PostgreSQL's RANK() function.

SELECT name, votes, rank() over (ORDER BY votes)
  FROM (
        SELECT name, count(id) votes
          FROM tab
         GROUP BY name
       )x

http://sqlfiddle.com/#!1/94cca/18/0

O. Jones
  • 103,626
  • 17
  • 118
  • 172
  • Thank you for the very clear explanation and the wonderful code! I tried the former code (SELECT @a, @a := @a+1 ...) but it didn't return correct ranking on same score. Thank you very much :) – chloe Aug 06 '12 at 19:27
  • The first chunk of code is a subset of what you need to make this work. It's the subquery that is repeated twice, taking the place of your `votes` table, in your otherwise correct query. It's OK to ACCEPT an answer on Stack Overflow if it is helpful. Click on the green checkbox. – O. Jones Aug 06 '12 at 21:36