How do I get multiple maximum column values for multiple rows with the same id?

Question

I need to select the maximum amounts in one column for a common id in another column. There could be several id's in the report_id column that have the same, maximum last_update amounts.

Data structure:

+------+-------+--------------------------------------------+
| id   | report_id   | last_update                          |
+------+-------------+--------------------------------------+
| 1    | 1           | 2019-01-24                           |
| 2    | 1           | 2019-01-24                           |
| 3    | 1           | 2019-01-24                           |
| 4    | 2           | 2019-01-24                           |
| 5    | 3           | 2019-01-23                           |
+------+-------+--------------------------------------------+

The problem I am having so far is I can't seem to isolate my results simply by the report_id. For example, with the following query:

"SELECT report_id, last_update
 FROM reports
 WHERE last_update=(
               SELECT MAX(last_update) FROM reports 
               WHERE report_id='1'
               );  
            ";

This returns:

+------+-------+--------------------------------------------+
| id   | report_id   | last_update                          |
+------+-------------+--------------------------------------+
| 1    | 1           | 2019-01-24                           |
| 2    | 1           | 2019-01-24                           |
| 3    | 1           | 2019-01-24                           |
| 4    | 2           | 2019-01-24                           |
+------+-------+--------------------------------------------+

So it is nearly correct, but it also is including report_id 2 because it also has the MAX value of 2019-01-24 in last_update.

What I really need to do is select all columns with report_id as 1, and then select only the rows from that result set with MAX(last_update) but I have been looking at every greatest-nth-per-group and associated question on SO and I just can't get this one.

Anytime I bring MAX into the query it seems to negate the fact I am trying to isolate by report_id as well.

score 1 · Accepted Answer · answered Jan 24 '19 at 22:10

Here are a few solutions:

Tuple comparison:

SELECT report_id, last_update
 FROM reports
 WHERE (report_id, last_update) = (
               SELECT report_id, MAX(last_update) FROM reports 
               WHERE report_id='1'
               GROUP BY report_id
               );

Tuple comparison with a derived table instead of a dependent subquery:

SELECT report_id, last_update
 FROM reports
 INNER JOIN (
   SELECT report_id, MAX(last_update) AS last_update
   FROM reports WHERE report_id='1' GROUP BY report_id
 ) USING (report_id, last_update);

No-subquery solution, using exclusion join to find the reports for which no other report has the same report_id and a greater update date:

SELECT r1.*
FROM reports AS r1
LEFT OUTER JOIN reports AS r2
  ON r1.report_id=r2.report_id AND r1.last_update<r2.last_update
WHERE r2.report_id IS NULL;

MySQL 8.0 solution with windowing functions:

WITH ranked_reports AS (
  SELECT r.*, DENSE_RANK() OVER (PARTITION BY report_id ORDER BY last_update DESC) AS dr
  FROM reports WHERE report_id='1'
)
SELECT * FROM ranked_reports WHERE dr=1;

Extremely thorough and all working, thanks. Would the tuple comparison with a derived table be the best option in terms of speed rather than the sub query options? — absentx, Jan 24 '19 at 22:34
Depends on your version of MySQL (index optimization for tuple comparisons was implemented MySQL 5.7: https://dev.mysql.com/doc/refman/5.7/en/row-constructor-optimization.html). And probably depends on your data. I suggest you test the different solutions on your system. — Bill Karwin, Jan 24 '19 at 23:04

How do I get multiple maximum column values for multiple rows with the same id?

1 Answers1