1

Suppose I have the following table:

+----+-------------+-------------+
| id | step_number | employee_id |
+----+-------------+-------------+
|  1 |           1 |           3 |
|  1 |           2 |           3 |
|  1 |           3 |           4 |
|  2 |           2 |           3 |
|  2 |           3 |           4 |
|  2 |           4 |           5 |
+----+-------------+-------------+

My desired results are:

+----+-------------+-------------+
| id | step_number | employee_id |
+----+-------------+-------------+
|  1 |           1 |           3 |
|  2 |           2 |           3 |
+----+-------------+-------------+

My current solution is:

SELECT
    *
FROM
(SELECT
    id,
    step_number,
    MIN(step_number) OVER (PARTITION BY id) AS min_step_number,
    employee_id
FROM
    table_name) AS t
WHERE
    t.step_number = t.min_step_number

Is there a more efficient way I could be doing this?

I'm currently using postgresql, version 12.

GMB
  • 216,147
  • 25
  • 84
  • 135
Jordan
  • 868
  • 1
  • 8
  • 25

1 Answers1

4

In Postgres, I would recommend using distinct on to adress this greatest-n-per-group problem:

select distinct on (id) t.*
from mytbale t
order by id, step_number

This Postgres extension to the SQL standard has usually better performance than the standard approach using window functions (and, as a bonus, the syntax is neater).

Note that this assumes unicity of (id, step_number) tuples: otherwise, the results might be different than those of your query (which allows ties, while distinct on does not).

GMB
  • 216,147
  • 25
  • 84
  • 135