3

This is for SQL Server 2008/2012.

I have the following dataset with the claim start date and end date. I want to calculate the number of days when there are back to back claims where the claim start date of the next date is one day after the claim end date of the previous date, making it a continuous service.

If there is a break in service, like for member id 1002 where the claim end of 05/15 and next one starts on 05/18, the count should restart.

MemberID    Claim Start   Claim End      Claim_ID
1001        2016-04-01    2016-04-15     ABC11111
1001        2016-04-16    2016-04-30     ABC65465
1001        2016-05-01    2016-05-15     ABC51651
1001        2016-05-16    2016-06-15     ABC76320
1002        2016-04-01    2016-04-15     ABC74563
1002        2016-04-16    2016-04-30     ABC02123
1002        2016-05-01    2016-05-15     ABC02223
1002        2016-05-18    2016-06-15     ABC66632
1002        2016-06-16    2016-06-30     ABC77447
1002        2016-07-10    2016-07-31     ABC33221
1002        2016-08-01    2016-08-10     ABC88877

So effectively, I want the following output. Min of the very first claim start date, max of the claim end date when there is no gap in coverage between multiple claims. If there is a gap in coverage, the count starts over and the min of the start date of the 1st claim and the max of the claim end date until there is no gap in coverage between multiple claims.

MemberID    Claim_Start   Claim_End     Continuous_Service_Days
1001        2016-04-01    2016-06-15    76
1002        2016-04-01    2016-05-15    45
1002        2016-05-18    2016-06-30    44 
1002        2016-07-10    2016-08-10    32

I have tried while loops, CTE's and I have also tried the following table to first get all the dates between the claims. But I am having problems with counting the days between consecutive dates and to reset the count if there is a break in coverage.

Master.dbo.spt_values

Any help is appreciated. Thanks!

Sanket J
  • 187
  • 1
  • 1
  • 12
  • I think you should first detect and mark the rows with gaps. – McNets Nov 14 '16 at 21:36
  • I attempted to answer and then saw this is a gaps question. Not so simple. Check out [this page about islands and gaps problems and solutions in sql server](https://www.simple-talk.com/sql/t-sql-programming/the-sql-of-gaps-and-islands-in-sequences/). It's very helpful. Window Functions are our friends with this kind of an issue. – JNevill Nov 14 '16 at 21:38

2 Answers2

2

You need to find the gaps first.

This solution uses a Tally Table to generate the dates first from ClaimStart to ClaimEnd. Then using the generated dates, get the gaps using this method.

Now that you have the gaps, you can now use GROUP BY to ge the MIN(ClaimStart) and MAX(ClaimStart):

WITH E1(N) AS( -- 10 ^ 1 = 10 rows
    SELECT 1 FROM(VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))t(N)
),
E2(N) AS(SELECT 1 FROM E1 a CROSS JOIN E1 b), -- 10 ^ 2 = 100 rows
E4(N) AS(SELECT 1 FROM E2 a CROSS JOIN E2 b), -- 10 ^ 4 = 10,000 rows
CteTally(N) AS(
    SELECT TOP(SELECT MAX(DATEDIFF(DAY, ClaimStart, ClaimEnd) + 1) FROM tbl) 
        ROW_NUMBER() OVER(ORDER BY(SELECT NULL))
    FROM E4
),
CteDates AS( -- Generate the dates from ClaimStart to ClaimEnd
    SELECT
        t.MemberID,
        dt = DATEADD(DAY, ct.N - 1, t.ClaimStart)
    FROM tbl t
    INNER JOIN CteTally ct
        ON DATEADD(DAY, ct.N - 1, t.ClaimStart) <= t.ClaimEnd
),
CteGrp AS( -- Find gaps and continuous dates
    SELECT *,
        rn = DATEADD(DAY, - ROW_NUMBER() OVER(PARTITION BY MemberID ORDER BY dt), dt)
    FROM CteDates
)
SELECT
    MemberID,
    ClaimStart  = MIN(dt),
    ClaimEnd    = MAX(dt),
    Diff        = DATEDIFF(DAY, MIN(dt), MAX(dt)) + 1
FROM CteGrp
GROUP BY MemberID, rn
ORDER BY MemberID, ClaimStart;

ONLINE DEMO

Community
  • 1
  • 1
Felix Pamittan
  • 31,544
  • 7
  • 41
  • 67
  • This is just amazing. Works perfectly. However my next step is to understand what is actually going on. I am still in process of decrypting this wonderful code in my mind. Thanks for the help!! – Sanket J Nov 14 '16 at 22:14
  • @SanketJ Read both articles linked in my answer. Both are good read. – Felix Pamittan Nov 14 '16 at 22:15
1
Declare @YourTable table (MemberID int,[Claim Start] date,[Claim End] date,[Claim_ID] varchar(25))
Insert Into @YourTable values
(1001,'2016-04-01','2016-04-15','ABC11111'),
(1001,'2016-04-16','2016-04-30','ABC65465'),
(1001,'2016-05-01','2016-05-15','ABC51651'),
(1001,'2016-05-16','2016-06-15','ABC76320'),
(1002,'2016-04-01','2016-04-15','ABC74563'),
(1002,'2016-04-16','2016-04-30','ABC02123'),
(1002,'2016-05-01','2016-05-15','ABC02223'),
(1002,'2016-05-18','2016-06-15','ABC66632'),
(1002,'2016-06-16','2016-06-30','ABC77447'),
(1002,'2016-07-10','2016-07-31','ABC33221'),
(1002,'2016-08-01','2016-08-10','ABC88877')


;with cte0(N)   as (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N))
     ,cte1(R,D) as (Select Row_Number() over (Order By (Select Null))
                          ,DateAdd(DD,-1+Row_Number() over (Order By (Select Null)),(Select MinDate=min([Claim Start]) From  @YourTable)) 
                     From  cte0 N1, cte0 N2, cte0 N3, cte0 N4) 
Select MemberID
      ,[Claim Start] = Min([Claim Start])
      ,[Claim End]   = Max([Claim End])
      ,Continuous_Service_Days = count(*)
 From (
         Select *,Island = R - Row_Number() over (Partition By MemberID Order by [Claim Start])
          From  @YourTable A
          Join  cte1   B on D Between [Claim Start] and [Claim End]
      ) A
 Group By MemberID,Island
 Order By 1,2

Returns

MemberID    Claim Start Claim End   Continuous_Service_Days
1001        2016-04-01  2016-06-15  76
1002        2016-04-01  2016-05-15  45
1002        2016-05-18  2016-06-30  44
1002        2016-07-10  2016-08-10  32
John Cappelletti
  • 79,615
  • 7
  • 44
  • 66
  • Slightly different approach but this one works perfectly as well. Can you please provide some articles as well which I can refer to understand how you came about those CTE's? – Sanket J Nov 14 '16 at 22:21
  • @SanketJ cteMinMax just grabs the min/max dates required... not necessary if you had a fixed start date. cte0 is a dummy table of 10 ints. cteDates expands cte0 to 100,000 rows which are then converted into dates. We now have a complete list of sequential dates with a Row_Number. The Grp takes this RN - your table row now which will create a group sequence. There where apply the final aggregations – John Cappelletti Nov 14 '16 at 23:19
  • @SanketJ In-short, my best advice is run the incremental CTEs to see how each relates to the previous, and then run the sub-query in the final select to better visualize the process – John Cappelletti Nov 14 '16 at 23:26