0

The set of data that I have has gap between dates and with Lag() function I calculated the number of months that gap exists. Then in second query with dense_rank() I count active Months for each ID. However, as you see in the table the count resets where there is a gap but after,it continues with the previous count instead of the reset number.

Could you please help?

SELECT
    GP, ID, Date, 
    DENSE_RANK() OVER (PARTITION BY ENR.GP, ENR.ID, Age, ENR.DATEDIFF 
                       ORDER BY ENR.DATE ASC) AS Active_MOS
FROM
    (SELECT 
         GP, ID, 
         MONTHS_BETWEEN (DATE, LAG(DATE, 1) OVER (PARTITION BY GP, ID, CASE WHEN GNDR = 'M' AND AGE < 35 THEN 'YES' ELSE 'NO' END ORDER BY YRMO ASC )) AS DateDiff
     FROM 
         TABLE2)     

enter image description here

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
SamR
  • 517
  • 3
  • 10
  • 24

1 Answers1

1

You have to give a number to every island and put it in the partition of dense_rank. If you set a value of 1 when a row doesn't follow the preceding one then the cumulative sum of that value can be used as the number of the island :

with
--table2 (gp, id, age, date) as (
--  values
--  (33, 1375, 1, date '2021-05-01'),
--  (33, 1375, 2, date '2021-06-01'),
--  (33, 1375, 2, date '2021-07-01'),
--  (33, 1375, 2, date '2021-08-01'),
--  (33, 1375, 2, date '2021-09-01'),
--  (33, 1375, 2, date '2021-10-01'),
--  (33, 1375, 2, date '2021-11-01'),
--  (33, 1375, 2, date '2021-12-01'),
--  (33, 1375, 2, date '2022-01-01'),
--  (33, 1375, 2, date '2022-02-01'),
--  (33, 1375, 2, date '2022-03-01'),
--  (33, 1375, 2, date '2022-04-01'),
--  (33, 1375, 2, date '2022-05-01'),
--  (33, 1375, 2, date '2022-06-01'),
--  (33, 1375, 2, date '2022-07-01'),
--  (33, 1375, 2, date '2022-08-01'),
--  (33, 1375, 2, date '2022-09-01'),
--  (33, 1375, 2, date '2023-05-01'),
--  (33, 1375, 2, date '2023-06-01'),
--  (33, 1375, 2, date '2023-07-01')
--),
islands as (
  select
    t2.*,
    sum(
      case when t2."DATE" - 1 month
                > lag(t2.date, 1) over(partition by gp, id, age order by date)
           then 1 else 0 end
    ) over(partition by gp, id, age order by date) island_no
  from table2 t2
)
select
  islands.*,
  dense_rank() over(partition by gp, id, age, island_no order by date) rank
from islands
nfgl
  • 2,812
  • 6
  • 16