Rank based on cumulative value

Question

I want to rank on ID and value columns based on ascending order of UID. Expected output has to change once value column has a different value than the previous value. Ranks has to restart on each new ID

UID ID  Value Expected Output
1   1   0         1
2   1   0         1
3   1   1         2
4   1   1         2
5   1   1         2
6   1   0         3
7   1   1         4
8   1   0         5
9   1   0         5
10  1   0         5
11  2   1         1
12  2   1         1
13  2   0         2
14  2   0         2
15  2   1         3

Here is a sample dataset that I have created:

CREATE TABLE [dbo].[Data] (
    [UID] [int] NOT NULL,
    [ID] [int] NULL,
    [Value] [int] NULL
);

INSERT [dbo].[Data] ([UID], [ID], [Value]) VALUES (1, 1, 0);
INSERT [dbo].[Data] ([UID], [ID], [Value]) VALUES (2, 1, 0);
INSERT [dbo].[Data] ([UID], [ID], [Value]) VALUES (3, 1, 1);
INSERT [dbo].[Data] ([UID], [ID], [Value]) VALUES (4, 1, 1);
INSERT [dbo].[Data] ([UID], [ID], [Value]) VALUES (5, 1, 1);
INSERT [dbo].[Data] ([UID], [ID], [Value]) VALUES (6, 1, 0);
INSERT [dbo].[Data] ([UID], [ID], [Value]) VALUES (7, 1, 1);
INSERT [dbo].[Data] ([UID], [ID], [Value]) VALUES (8, 1, 0);
INSERT [dbo].[Data] ([UID], [ID], [Value]) VALUES (9, 1, 0);
INSERT [dbo].[Data] ([UID], [ID], [Value]) VALUES (10, 1, 0);
INSERT [dbo].[Data] ([UID], [ID], [Value]) VALUES (11, 2, 1);
INSERT [dbo].[Data] ([UID], [ID], [Value]) VALUES (12, 2, 1);
INSERT [dbo].[Data] ([UID], [ID], [Value]) VALUES (13, 2, 0);
INSERT [dbo].[Data] ([UID], [ID], [Value]) VALUES (14, 2, 0);
INSERT [dbo].[Data] ([UID], [ID], [Value]) VALUES (15, 2, 1);

score 5 · Accepted Answer · answered Sep 17 '20 at 17:02

I think that the simplest approach to this gaps-and-islands problem is to use lag() to retrieve the "previous" value, and then a window sum that increments everytime the value changes.

select uid, id, value,
    1 + sum(case when value <> lag_value then 1 else 0 end) 
        over(partition by id order by uid) grp
from (
    select d.*, lag(value, 1, value) over(partition by id order by uid) lag_value
    from data d
) d
order by uid

Demo on DB Fiddle:

uid | id | value | grp
--: | -: | ----: | --:
  1 |  1 |     0 |   1
  2 |  1 |     0 |   1
  3 |  1 |     1 |   2
  4 |  1 |     1 |   2
  5 |  1 |     1 |   2
  6 |  1 |     0 |   3
  7 |  1 |     1 |   4
  8 |  1 |     0 |   5
  9 |  1 |     0 |   5
 10 |  1 |     0 |   5
 11 |  2 |     1 |   1
 12 |  2 |     1 |   1
 13 |  2 |     0 |   2
 14 |  2 |     0 |   2
 15 |  2 |     1 |   3

I hate choosing between two working solutions between GMB and Tim. However, this is pretty much a direct translation of the problem statement: it counts the number of times the value switches. — Gordon Linoff, Sep 17 '20 at 18:15

score 2 · Answer 2 · answered Sep 17 '20 at 17:09

This is a gaps and islands problem. I think that the simplest approach is to use the difference in row numbers method:

WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY UID) rn1,
        ROW_NUMBER() OVER (PARTITION BY ID, [Value] ORDER BY UID) rn2
    FROM Data
)

SELECT *, DENSE_RANK() OVER (PARTITION BY ID ORDER BY rn1 - rn2, [Value]) AS output
FROM cte
ORDER BY UID;

Rank based on cumulative value

2 Answers2

Demo