3

This seems like such a simple question and I terrified that I might be bashed with the duplicate question hammer, but here's what I have:

ID  Date
1   1/11/01
1   3/3/03
1   2/22/02
2   1/11/01
2   2/22/02

All I need to do is enumerate the records, based on the date, and grouped by ID! As such:

ID  Date    Num
1   1/11/01 1
1   3/3/03  3
1   2/22/02 2
2   1/11/01 1
2   2/22/02 2

This is very similar to this question, but it's not working for me. This would be great but it's not MySQL.

I've tried to use group by but it doesn't work, as in

SELECT ta.*, count(*) as Num
FROM temp_a ta
GROUP BY `ID` ORDER BY `ID`;  

which clearly doesn't run since the GROUP BY always results to one value.

Any advice greatly appreciated.

Community
  • 1
  • 1
Monica Heddneck
  • 2,973
  • 10
  • 55
  • 89

1 Answers1

5

Let's assume the table to be as follows:

CREATE TABLE q43381823(id INT, dt DATE);
INSERT INTO q43381823 VALUES
(1, '2001-01-11'),
(1, '2003-03-03'),
(1, '2002-02-22'),
(2, '2001-01-11'),
(2, '2002-02-22');

Then, one of the ways in which the query to get the desired output could be written is:

SELECT q.*,  
    CASE WHEN (
            IF(@id != q.id, @rank := 0, @rank := @rank + 1)
         ) >=1 THEN @rank
         ELSE @rank := 1 
    END as rank,
    @id := q.id AS buffer_id
FROM q43381823 q
CROSS JOIN (
    SELECT  @rank:= 0, 
        @id  := (SELECT q2.id FROM q43381823 AS q2 ORDER BY q2.id LIMIT 1)
    ) x
ORDER BY q.id, q.dt

Output:

  id  |    dt        |   rank  |    buffer_id 
-------------------------------------------------    
  1   |  2001-01-11  |    1    |       1    
  1   |  2002-02-22  |    2    |       1
  1   |  2003-03-03  |    3    |       1
  2   |  2001-01-11  |    1    |       2
  2   |  2002-02-22  |    2    |       2

You may please ignore the buffer_id column from the output - it's irrelevant to the result, but required for the resetting of rank.

SQL Fiddle Demo



Explanation:

  • @id variable keeps track of every id in the row, based on the sorted order of the output. In the initial iteration, we set it to id of the first record that may be obtained in the final result. See sub-query SELECT q2.id FROM q43381823 AS q2 ORDER BY q2.id LIMIT 1

  • @rank is set to 0 initially and is by default incremented for every subsequent row in the result set. However, when the id changes, we reset it back to 1. Please see the CASE - WHEN - ELSE construct in the query for this.

  • The final output is sorted first by id and then by dt. This ensures that @rank is set incrementally for every subsequent dt field within the same id, but gets reset to 1 whenever a new id group begins to show up in the result set.

Dhruv Saxena
  • 1,336
  • 2
  • 12
  • 29
  • 1
    I love you! Thank you so much. I never would have figured this out. And especially thanks for the excellent explanation. – Monica Heddneck Apr 13 '17 at 02:54
  • this is crazy complex, can you explain how the whole CASE WHEN works? (especially the if, it makes no sense to me and when i remove one of the "argument" from the if, it returns an error) – Barbz_YHOOL May 27 '20 at 01:38
  • 1
    @Barbz_YHOOL MySQL doesn't support **row rank**, so we use the above `CASE - WHEN` construct to mimic it. You can read it as follows: `WHEN (x >= 1) use @rank, ELSE (i.e. DEFAULT) intialise @rank to 1`. You'll notice that `x` is actually `IF(condition, true, false)`. That is, `if id of the current row (qid) is NOT the same as the id of the previous row (@id), reset "x" to 0, otherwise increment "x" by 1`. This resets `@rank` to `1` as soon as there's a change in `id`, else it keeps getting incremented. Pls note that when `IF` returns `x` as `0`, `CASE` instantly uses `DEFAULT` to set it to `1` – Dhruv Saxena May 27 '20 at 17:43
  • So the `ELSE @rank := 1` is because it must start at 1 and not at 0? I also didn't know about this `IF(condition, true, false)`, where can I read about it for mysql? I made it simpler for my use case ( https://hastebin.com/deguzavowe.coffeescript ) and it works perfectly but I had to make temporary tables and tons of subqueries to make it all work but nonetheless, your answer saved me. – Barbz_YHOOL Jun 04 '20 at 01:46
  • 1
    @Barbz_YHOOL Yes, you're right. In this particular case, it was required to enumerate the rows as 1, 2, 3.... Please see this [link](https://www.w3schools.com/sql/func_mysql_if.asp) to know more about about MySQL `IF()` function. If this post helps, then that's certainly good to hear. Thanks. However, if you think you might want to get an opinion about whether the code is actually efficient or not, you could consider posting a new question for your specific problem. Who knows, there may be a simpler way to get what you're after? – Dhruv Saxena Jun 04 '20 at 19:44
  • rarely got any success posting on stackoverflow, I usually get downvoted and ignored, anyway I now learned about IF and CASE WHEN and it opens many new possibilities to me. Thanks again! – Barbz_YHOOL Jun 06 '20 at 22:39
  • I made an error in one of the statements above. MySQL now **supports** row level ranking, v8.0 onwards. However, that wasn't the case at the time when this Q&A post was written. Please have a look at [MySQL Documentation](https://dev.mysql.com/doc/refman/8.0/en/window-function-descriptions.html#function_rank) and this [article](https://towardsdatascience.com/mysql-how-to-write-a-query-that-returns-the-top-records-in-a-group-12865695f436) – Dhruv Saxena Jun 09 '20 at 07:41