3

I have been using the DISTINCT ON predicate and have decided to replace it with GROUP BY, mainly because it "is not part of the SQL standard and is sometimes considered bad style because of the potentially indeterminate nature of its results".

I am using DISTINCT ON in conjunction with ORDER BY in order to select the latest records in a history table, but it's not clear to me how to do the same with the GROUP BY. What could be a general approach in order to move from one construct to the other one?

An example could be

SELECT
  DISTINCT ON (f1, f2 ) *
  FROM table
  ORDER BY f1, f2, datefield DESC;

where I get the "latest" pairs of (f1,f2).

EnzoR
  • 3,107
  • 2
  • 22
  • 25
  • . . Please show an example of your query. – Gordon Linoff Jan 29 '15 at 13:43
  • Chapter 7.3.3 discourages the use of DISTINCT ON. As soon as I'll found that email I'll quote it here about the planner and the "distinct on" optimizations. – EnzoR Jan 29 '15 at 13:48
  • reworded my question – EnzoR Jan 30 '15 at 10:14
  • "*the potentially indeterminate nature of its results*" - only if you forget the `order by` or specify a sort order that is not stable (e.g. where duplicate values can occur in the sort column) - but that problem applies to the solution using a window function as well. –  Jan 30 '15 at 10:46

1 Answers1

3

If you have a query like this:

select distinct on (col1) t.*
from table t
order by col1, col2

Then you would replace this with window functions, not a group by:

select t.*
from (select t.*,
             row_number() over (partition by col1 order by col2) as seqnum
      from table t
     ) t
where seqnum = 1;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786