I have the following scenario where I'm struggling to understand how to apply DENSE_RANK()
to get the result I want:
ID | Date | Value |
---|---|---|
1 | 1990-05-17 | 1.00 |
1 | 1991-10-12 | 1.00 |
1 | 1992-08-01 | 1.00 |
1 | 1993-07-05 | 0.67 |
1 | 1994-05-02 | 0.67 |
1 | 1995-02-01 | 1.00 |
1 | 1996-03-01 | 1.00 |
Based on the above data, I'm trying to identify distinct periods using the combination of the Date
and Value
columns, where a unique period is identified from where the Value
column changes from one value to another. Here's the result I'm looking for:
ID | Date | Value | Period |
---|---|---|---|
1 | 1990-05-17 | 1.00 | 1 |
1 | 1991-10-12 | 1.00 | 1 |
1 | 1992-08-01 | 1.00 | 1 |
1 | 1993-07-05 | 0.67 | 2 |
1 | 1994-05-02 | 0.67 | 2 |
1 | 1995-02-01 | 1.00 | 3 |
1 | 1996-03-01 | 1.00 | 3 |
As you can see, there are 3 distinct periods. The problem I am having is that when I use DENSE_RANK()
, I get one of two outcomes:
SELECT DENSE_RANK() OVER (PARTITION BY ID ORDER BY Date, Value)
ID | Date | Value | Period |
---|---|---|---|
1 | 1990-05-17 | 1.00 | 1 |
1 | 1991-10-12 | 1.00 | 2 |
1 | 1992-08-01 | 1.00 | 3 |
1 | 1993-07-05 | 0.67 | 4 |
1 | 1994-05-02 | 0.67 | 5 |
1 | 1995-02-01 | 1.00 | 6 |
1 | 1996-03-01 | 1.00 | 7 |
SELECT DENSE_RANK() OVER (PARTITION BY ID ORDER BY Value)
ID | Date | Value | Period |
---|---|---|---|
1 | 1990-05-17 | 1.00 | 1 |
1 | 1991-10-12 | 1.00 | 1 |
1 | 1992-08-01 | 1.00 | 1 |
1 | 1993-07-05 | 0.67 | 2 |
1 | 1994-05-02 | 0.67 | 2 |
1 | 1995-02-01 | 1.00 | 1 |
1 | 1996-03-01 | 1.00 | 1 |
As you can see, the problem lies with the Date
column as I need that to be a cumulative period. Furthermore, the amount of periods will vary from ID
to ID
and there's no consistent science behind the Date
column. A member could have two entries in one year for example.