1

NOTE: I WANT TO AVOID DISTINCT ON FOR PERFORMANCE REASONS.

NOTE 2: I WAS WRONG. USING PROPER INDEXES THE QUERY WORKED AWESOME THANKS TO @Gordon Linoff!

Having the following structure:

| id | image_url     | sort | t1_id |
|----|---------------|------|-------|
| 1  | https://.../1 | 10   | 1     |
| 2  | https://.../2 | 20   | 1     |
| 3  | https://.../3 | 30   | 1     |
| 4  | https://.../4 | 30   | 2     |
| 5  | https://.../5 | 20   | 2     |
| 6  | https://.../6 | 10   | 2     |

I want to fetch the lowest sort row's image_url column by t1_id, similar to the following:

SELECT * FROM t2 WHERE MIN(sort) GROUP BY (t1_id);

Getting the following result:

| id | image_url     | sort | t1_id |
|----|---------------|------|-------|
| 1  | https://.../1 | 10   | 1     |
| 6  | https://.../6 | 10   | 2     |

Thanks in advance!

Zeswen
  • 261
  • 4
  • 12
  • 1
    I am pretty sure that `DISTINCT ON` is not slower than any other technique to achieve the same (e.g. with `ROW_NUMBER` or a sbquery). So, your problem is more general and you may just want to provide your DBMS with an appropriate index. – Thorsten Kettner Mar 08 '21 at 12:55

1 Answers1

2

Postgres has a handy extension called distinct on:

select distinct on (t1_id) t2.*
from t2
order by t1_id, sort asc;

This is usually the fastest way to approach such a problem. In particular, this can take advantage of an index on (t1_id, sort [desc]).

However, you can try another approach such as:

select t2.*
from t2
where t2.sort = (select min(tt2.sort)
                 from t2 tt2
                 where tt2.t1_id = t2.t1_id
                );

This would use the same index. If this is faster, please post a comment with the relevant performance.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • As I noted on the first line, the ms from my query goes from 200ms to 700ms with distinct on. – Zeswen Mar 08 '21 at 12:55
  • 1
    @Zeswen . . . The query has to do more work, so it is not surprising that it takes more time. `distinct on` is often the fastest method -- particularly with the right index. – Gordon Linoff Mar 08 '21 at 12:56
  • I did not notice the index you mentioned with `(t1_id, sort [asc])`. It really did the trick with performance, thank you very much! Learned something new with the `asc` in the index :) – Zeswen Mar 08 '21 at 13:02