1

I have a table for which I would like to select the most recent time stamp in a category defined by the value of a particular column in the table.

Specifically

SELECT * 
FROM   takelist 
WHERE  producer_name = 'sean' 
       AND bucket_id = '2CCEx15_1o' 

results in

+-------------+---------------+------------+---------------------+
| takelist_id | producer_name | bucket_id  | ts                  |
+-------------+---------------+------------+---------------------+
|           1 | sean          | 2CCEx15_1o | 2013-10-07 18:29:00 |
|           4 | sean          | 2CCEx15_1o | 2013-10-07 18:33:09 |
|           5 | sean          | 2CCEx15_1o | 2013-10-07 18:33:38 |
|          27 | sean          | 2CCEx15_1o | 2013-10-07 18:37:38 |
|         212 | sean          | 2CCEx15_1o | 2013-10-14 18:36:05 |
|         236 | sean          | 2CCEx15_1o | 2013-10-21 17:59:56 |
|         237 | sean          | 2CCEx15_1o | 2013-10-21 18:00:55 |
|         281 | sean          | 2CCEx15_1o | 2013-10-29 15:58:40 |
|         287 | sean          | 2CCEx15_1o | 2013-10-29 19:24:15 |
|         330 | sean          | 2CCEx15_1o | 2013-10-31 14:39:33 |
|         615 | sean          | 2CCEx15_1o | 2013-12-16 22:46:59 |
|         616 | sean          | 2CCEx15_1o | 2013-12-16 22:54:46 |
+-------------+---------------+------------+---------------------+

I would like to select one row for each unique value of the column named bucket_id where the selected row has the most recent timestamp.

I have tried the below based upon previous answers to similar question but must have got something wrong

SELECT takes.* FROM takelist as takes 
INNER JOIN (         
            SELECT takelist_id, max(ts) max_ts, bucket_id
            FROM takelist 
            WHERE producer_name='sean' 
            GROUP BY bucket_id
             ) latest_take
ON takes.takelist_id=latest_take.takelist_id
AND takes.ts=latest_take.max_ts 
Hawk
  • 5,060
  • 12
  • 49
  • 74
user3145007
  • 11
  • 1
  • 2
  • Your subquery is relying on MySQL-specific behavior, and your statement will cause most/all other RDBMSs to throw a syntax error: you list a column that is neither in the `GROUP BY` nor an aggregate function (like `MAX()`). In MySQL, the contents of `takelist_id` in this case are essentially "undetermined" - it depends heavily on what indices you have defined, what queries are executing, etc, and is completely up to the whim of the optimizer. There's not a good way to guarantee specific behavior in this case, so try to avoid using this "feature". – Clockwork-Muse Dec 30 '13 at 09:07

2 Answers2

5

Your query is close. But you are using the id instead of the timestamp:

SELECT takes.*
FROM takelist takes INNER JOIN
     (SELECT max(ts) as max_ts, bucket_id
      FROM takelist 
      WHERE producer_name = 'sean' 
      GROUP BY bucket_id
     ) latest_take
     ON takes.ts = latest_take.max_ts and takes.bucket_id = latest_take.bucket_id;

An arbitrary takelist_id is chosen in the original formulation. And it might not be the one you want.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • 1
    @user3145007 What will be the case if `bucket_id` changes? As per me for all bucket_ids this query will check only TIMESTAMP with takelist table so it is possible that query returns more than one record per bucket_ids. – Saharsh Shah Dec 30 '13 at 09:04
  • Gordon - I think you meant to add `bucket_id` to the join? – Clockwork-Muse Dec 30 '13 at 09:08
0

Try this:

SELECT t.* 
FROM takelist AS t 
INNER JOIN (SELECT MAX(ts) max_ts, bucket_id
            FROM takelist WHERE producer_name='sean' 
            GROUP BY bucket_id
           ) lt ON t.bucket_id=lt.bucket_id AND t.ts=lt.max_ts;

OR

SELECT * 
FROM (SELECT * FROM takelist WHERE producer_name='sean' ORDER BY bucket_id, ts DESC) A 
GROUP BY bucket_id
Saharsh Shah
  • 28,687
  • 8
  • 48
  • 83