Select distinct values from a column based on highest value in another column

Question

I have a table in which I need to return a distinct value for each id based on the highest reference value. As an example, this table:

my table
|--------|----------|-------------|
| ID     | Amount   | Ref_value   |
|---------------------------------|
| 1      | 200      | 5           |
| 1      | 120      | 8           |
| 2      | 150      | 3           |
| 3      | 200      | 4           |
|--------|----------|-------------|

I need to get one return per ID, but since ID=1 appears more than once, I need to select the one with the highest "Ref_value". So my result would be:

result of the query over my table
|--------|----------|-------------|
| ID     | Amount   | Ref_value   |
|---------------------------------|
| 1      | 120      | 8           |
| 2      | 150      | 3           |
| 3      | 200      | 4           |
|--------|----------|-------------|

What if you had another row with ID = 1, Ref_value = 8 and a different amount? Could that happen? If so, would you want 1 row or 2 for ID = 1? — GermanC, Jan 15 '18 at 19:46
very good question. I believe in my specific case the ref_values will never be the same, since in the example I have put an int, but in my real table it is a timestamp. But in such case they could be repeated, I'd say whichever would work. — JD Gamboa, Jan 15 '18 at 19:49
Then the answers already work for you! You have one for each of the two approaches. — GermanC, Jan 15 '18 at 19:56

score 2 · Accepted Answer · answered Jan 15 '18 at 19:46

2

you could use a inner join on the max value grouped by id

select * from my_table 
inner join (
   select id, max(ref_value) max_value
   from my_table 
   group by id
)  t on t.id = my_table_id, t.max_vale = my_table_value

answered Jan 15 '18 at 19:46

ScaisEdge

131,976
10
91
107

score 2 · Answer 2 · answered Jan 15 '18 at 19:54

If your database supports window functions, this solution will access the table only once:

SELECT id, amount, ref_value
FROM (
  SELECT t.*, row_number() OVER (PARTITION BY id ORDER BY ref_value DESC) rn
  FROM t
) t
WHERE rn = 1

If you're using Oracle, you might use this solution that I've found to be a bit faster in most cases:

SELECT
  max(id)        KEEP (DENSE_RANK FIRST ORDER BY ref_value DESC) id,
  max(amount)    KEEP (DENSE_RANK FIRST ORDER BY ref_value DESC) amount,
  max(ref_value) KEEP (DENSE_RANK FIRST ORDER BY ref_value DESC) ref_value,
FROM t
GROUP BY id

Both of the above solutions will return an arbitrary row if two ref_value values are tied.

If none of the above apply, scaisEdge's solution works on all databases.

score 1 · Answer 3 · answered Jan 15 '18 at 19:57

Select * might throw an ambiguous column names error. So you might want to select the only required columns there.

SELECT A.ID, A.AMOUNT, A.REF_VALUE
FROM
MYTABLE A
INNER JOIN
(SELECT ID, MAX(REF_VALUE) AS MAX_REF FROM MYTABLE GROUP BY ID) B
ON A.ID = B.ID AND A.REF_VALUE = B.MAX_REF;

Select distinct values from a column based on highest value in another column

3 Answers3