0

Objective: Identify distinct episodes of continuous treatment for each member in a table. Each member has a diagnosis and a service date, and an episode is defined as all services where the time between each consecutive service is less than some number (let's say 90 days for this example). The query will need to loop through each row and calculate the difference between dates, and return the first and last date associated with each episode. The goal is to group results by member and episode start/end date.

A very similar question has been asked before, and was somewhat helpful. The problem is that in customizing the code, the returned tables are excluding first and last records. I'm not sure how to proceed.

My data currently looks like this:

MemberCode       Diagnosis              ServiceDate         
1001   -----        ABC      -----       2010-02-04           
1001   -----        ABC      -----       2010-03-20          
1001   -----        ABC      -----       2010-04-18          
1001   -----        ABC      -----       2010-05-22         
1001   -----        ABC      -----       2010-09-26          
1001   -----        ABC      -----       2010-10-11  
1001   -----        ABC      -----       2010-10-19
2002   -----        XYZ      -----       2010-07-10          
2002   -----        XYZ      -----       2010-07-21
2002   -----        XYZ      -----       2010-11-08
2002   -----        ABC      -----       2010-06-03           
2002   -----        ABC      -----       2010-08-13         

In the above data, the first record for Member 1001 is 2010-02-04, and there is not a difference of more than 90 days between consecutive services until 2010-09-26 (the date at which a new episode starts). So Member 1001 has two distinct episodes: (1) Diagnosis ABC, which goes from 2010-02-04 to 2010-05-22, and (2) Diagnosis ABC, which goes from 2010-09-26 to 2010-10-19.

Similarly, Member 2002 has three distinct episodes: (1) Diagnosis XYZ, which goes from 2010-07-10 to 2010-07-21, (2) Diagnosis XYZ, which begins and ends on 2010-11-08, and (3) Diagnosis ABC, which goes from 2010-06-03 to 2010-08-13.

Desired output:

MemberCode         Diagnosis       EpisodeStartDate          EpisodeEndDate
1001   -----          ABC   -----     2010-02-04   -----       2010-05-22
1001   -----          ABC   -----     2010-09-26   -----       2010-10-19
2002   -----          XYZ   -----     2010-07-10   -----       2010-07-21
2002   -----          XYZ   -----     2010-11-08   -----       2010-11-08
2002   -----          ABC   -----     2010-06-03   -----       2010-08-13

I've been working on this query for too long, and still can't get exactly what I need. Any help would be appreciated. Thanks in advance!

Community
  • 1
  • 1
Uncle Milton
  • 163
  • 1
  • 3
  • 10

2 Answers2

1

SQL Server 2012 has the lag() and cumulative sum functions, which makes it easier to write such a query. The idea is to find the first in each sequence. Then take the cumulative sum of the first flag to identify each group. Here is the code:

select MemberId, Diagnosis, min(ServiceDate) as EpisodeStartDate,
       max(ServiceStartDate) as EpisodeEndDate
from (select t.*, sum(ServiceStartFlag) over (partition by MemberId, Diagnosis order by ServiceDate) as grp
      from (select t.*,
                   (case when datediff(day,
                                       lag(ServiceDate) over (partition by MemberId, Diagnosis
                                                              order by ServiceDate),
                                       ServiceDate) < 90
                         then 0
                         else 1 -- handles both NULL and >= 90
                    end) as ServiceStartFlag
            from table t
           ) t
group by grp, MemberId, Diagnosis;

You can do this in earlier versions of SQL Server but the code is more cumbersome.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • This worked perfectly. I wasn't aware of the LAG() function, so I'm glad to know this kind of functionality exists. Thank you very much! – Uncle Milton Jun 02 '14 at 20:27
0

For versions of SQL Server prior to 2012, here's some code snippets that should work. First, you'll need a temp table (as opposed to a CTE, as the lookup of the edge event will fire the newid() function again, rather than retriving the value for that row)

DECLARE @Edges TABLE (MemberCode INT, Diagnosis VARCHAR(3), ServiceDate DATE, GroupID VARCHAR(40))

INSERT INTO @Edges
SELECT *
FROM Treatments E
    CROSS APPLY (
        SELECT 
            CASE
                WHEN EXISTS (
                    SELECT TOP 1 E2.ServiceDate
                    FROM Treatments E2
                    WHERE E.MemberCode = E2.MemberCode
                        AND E.Diagnosis = E2.Diagnosis
                        AND E.ServiceDate > E2.ServiceDate
                        AND DATEDIFF(dd,E2.ServiceDate,E.ServiceDate) BETWEEN 1 AND 90
                    ORDER BY E2.ServiceDate DESC
                    ) THEN 'Group'
                ELSE CAST(NEWID() AS VARCHAR(40))
            END AS GroupID
    ) z

The EXISTS operator contains a query that looks into the past for a date between 1 and 90 days ago. Once the Edge cases are gathered, this query will provide the results you posted as desired from the test data you posted.

SELECT MemberCode, Diagnosis, MIN(ServiceDate) AS StartDate, MAX(ServiceDate) AS EndDate
FROM (
    SELECT
          MemberCode
        , Diagnosis
        , ServiceDate
        , CASE GroupID
            WHEN 'Group' THEN (
                SELECT TOP 1 GroupID
                FROM @Edges E2
                WHERE E.MemberCode = E2.MemberCode
                    AND E.Diagnosis = E2.Diagnosis
                    AND E.ServiceDate > E2.ServiceDate
                    AND GroupID != 'Group'
                ORDER BY ServiceDate DESC
            )
            ELSE GroupID END AS GroupID
    FROM @Edges E
    ) Z
GROUP BY MemberCode, Diagnosis, GroupID
ORDER BY MemberCode, Diagnosis, MIN(ServiceDate)

Like Gordon said, more cumbersome, but it can be done if your server is not SQL 2012 or greater.

Jaaz Cole
  • 3,119
  • 1
  • 15
  • 20