1

I have a table from which I would like to retrieve only the most recent record within each group of records, as marked by the value in a particular field.

The table content looks something like this:

state   date
512     2021-03-09 11:31:38.300
512     2021-03-09 11:31:38.300
512     2021-03-09 11:31:31.693
512     2021-03-09 11:31:31.693
512     2021-03-08 12:49:10.753
512     2021-03-08 12:35:47.357
514     2021-03-08 12:35:01.030
512     2021-03-08 12:33:48.050
514     2021-03-08 12:14:29.537
514     2021-03-08 12:14:29.537
514     2021-03-08 12:14:18.760
512     2021-03-08 12:14:05.597

I would like to use OVER and PARTITION to SELECT output like this:

row state   date
1   512     2021-03-09 11:31:38.300
2   512     2021-03-09 11:31:38.300
3   512     2021-03-09 11:31:31.693
4   512     2021-03-09 11:31:31.693
5   512     2021-03-08 12:49:10.753
6   512     2021-03-08 12:35:47.357
1   514     2021-03-08 12:35:01.030
1   512     2021-03-08 12:33:48.050
1   514     2021-03-08 12:14:29.537
2   514     2021-03-08 12:14:29.537
3   514     2021-03-08 12:14:18.760
1   512     2021-03-08 12:14:05.597

As you can see, the rows are ordered by date DESC and the state field is grouped by virtue of the row field starting at 1 for each change in the state field.

Currently, my code looks like this:

with query as
(
    select state, date, row = row_number() over (partition by state order by date desc)
    from table
)
select t.*
from table t
inner join query q on t.state = q.state and t.date = q.date
where row = 1
order by t.date desc

Unfortunately, this appears to group the records by state before ordering them by date DESC, so the result is only two result set records because there are only two different values in the state field. There should (for the example data above) be 5 resultset records.

How can I number the partition groups properly?

Matt W
  • 11,753
  • 25
  • 118
  • 215
  • *"this appears to group the records by state before"* Well yes, that's what you asked it to do. **`partition by state`**. There are only 2 different states above, so only 2 rows can be the "first" row for those states. – Thom A Oct 18 '21 at 15:39
  • Yes, I realise that. My question is asking how to group (partition) the rows as in the second example block. Your comment has only stated what I've pointed out in my problem. – Matt W Oct 18 '21 at 15:42
  • So what are your expected results here? As mentioned, the query is working correctly for what you've written. There aren't 5 states, so why are you expecting 5 rows? – Thom A Oct 18 '21 at 15:44
  • Looking at the second table of data (with the row column) I would like the output to number the rows. I would like each group of a particular value in the `state` field to begin the row numbering again. As it shows, reading down the row and state fields, any time the `state` field changes from the row above, the `row` numbering restarts at `1`. – Matt W Oct 18 '21 at 15:46
  • 1
    Ahh, a classic gaps and island problem. I've added a couple dupe targets. – Thom A Oct 18 '21 at 15:48
  • Many thanks @Larnu :) – Matt W Oct 18 '21 at 16:05

1 Answers1

1

This works, though there may be a simpler way to do it.

WITH
cte1 AS
(
    SELECT 
        ROW_NUMBER() OVER(ORDER BY date_ DESC) as dateRow,
        ROW_NUMBER() OVER(PARTITION BY state ORDER BY date_ DESC) as stateRow,
        state,
        date_
    FROM    StateDate
)
SELECT 
    ROW_NUMBER() OVER(PARTITION BY state, (dateRow - stateRow)  ORDER BY date_ DESC) as row,
    state, date_
FROM cte1
ORDER BY date_ DESC, state

Here's the data setup I used:

CREATE TABLE StateDate( state INT, date_ DATETIME)
GO

--state   date
INSERT INTO StateDate VALUES (512, '2021-03-09 11:31:38.300');
INSERT INTO StateDate VALUES (512, '2021-03-09 11:31:38.300');
INSERT INTO StateDate VALUES (512, '2021-03-09 11:31:31.693');
INSERT INTO StateDate VALUES (512, '2021-03-09 11:31:31.693');
INSERT INTO StateDate VALUES (512, '2021-03-08 12:49:10.753');
INSERT INTO StateDate VALUES (512, '2021-03-08 12:35:47.357');
INSERT INTO StateDate VALUES (514, '2021-03-08 12:35:01.030');
INSERT INTO StateDate VALUES (512, '2021-03-08 12:33:48.050');
INSERT INTO StateDate VALUES (514, '2021-03-08 12:14:29.537');
INSERT INTO StateDate VALUES (514, '2021-03-08 12:14:29.537');
INSERT INTO StateDate VALUES (514, '2021-03-08 12:14:18.760');
INSERT INTO StateDate VALUES (512, '2021-03-08 12:14:05.597');
GO
RBarryYoung
  • 55,398
  • 14
  • 96
  • 137