0

Within a hospital encounter, a patient may be administered several different formulations of a medication, as shown here:

Encounter Medication Administration Adm_Num
1 A 8/31/21 11:33 AM 1
1 B 8/31/21 6:25 PM 2
1 C 9/1/21 8:55 AM 3
1 D 9/1/21 10:00 PM 4
1 B 9/2/21 11:27 AM 5
1 B 9/2/21 10:00 PM 6
1 B 9/3/21 6:15 AM 7
1 B 9/3/21 3:30 PM 8
1 D 9/3/21 8:30 PM 9

The task: For each encounter, I need to enumerate each formulation of the medication, like this:

Encounter Medication Administration Adm_Num Formulation
1 A 8/31/21 11:33 AM 1 1
1 B 8/31/21 6:25 PM 2 2
1 C 9/1/21 8:55 AM 3 3
1 D 9/1/21 10:00 PM 4 4
1 B 9/2/21 11:27 AM 5 5
1 B 9/2/21 10:00 PM 6 5
1 B 9/3/21 6:15 AM 7 5
1 B 9/3/21 3:30 PM 8 5
1 D 9/3/21 8:30 PM 9 6

I need to preserve the administration-level granularity in order to report on those data (and I'm in good shape there).

One strategy I tried was a CTE to find a medication-level value for each medication, such as the first administration instant, then apply that to the more granular administration-level data, like this:

WITH f as (
SELECT a.Encounter, a.Medication, MIN(Administration) as First_Adm,
 DENSE_RANK() OVER (PARTITION Encounter, Medication ORDER BY Administration) as Formulation
FROM a
GROUP BY a.Encounter, a.Medication
)
SELECT a.Encounter, a.Medication, a.Administration, a.Adm_Num, f.[First_Adm], f.Formulation
FROM a
INNER JOIN f ON (a.Encounter = f.Encounter AND a.Medication = f.Medication)
GROUP BY a.Encounter, a.Medication, a.Administration
ORDER BY a.Encounter, a.Medication, a.Administration

That worked well for the first few medications with single administrations, but it mishandled the reappearance of Medications B and D later in the encounter; it recognized Medication B from before, gave it the earlier MIN(Administration), and mislabeled it Formulation 2, as shown here:

Encounter Medication Administration Adm_Num First_Adm Formulation
1 A 8/31/21 11:33 AM 1 8/31/21 11:33 AM 1
1 B 8/31/21 6:25 PM 2 8/31/21 6:25 PM 2
1 C 9/1/21 8:55 AM 3 9/1/21 8:55 AM 3
1 D 9/1/21 10:00 PM 4 9/1/21 10:00 PM 4
1 B 9/2/21 11:27 AM 5 8/31/21 6:25 PM 2
1 B 9/2/21 10:00 PM 6 8/31/21 6:25 PM 2
1 B 9/3/21 6:15 AM 7 8/31/21 6:25 PM 2
1 B 9/3/21 3:30 PM 8 8/31/21 6:25 PM 2
1 D 9/3/21 8:30 PM 9 9/1/21 10:00 PM 4

This is close, but not quite there. The remaining challenge is to account for repeated formulations.

How can I group the administration-level data by medication formulation groups, order the formulations chronologically (making use of the time data), and assign a sequential value to each formulation, even if some are repeated?

Thanks in advance for your help.

Charlieface
  • 52,284
  • 6
  • 19
  • 43
  • Tbh, despite reading this twice I still don't understand exactly what a "formulation" is - in terms of the data. For example, why is the expected value for "D | 9/1/21 10:00 PM" `4`, but for "D | 9/3/21 8:30 PM" the expected value is `6`? Yet the expected value for both "B | 9/2/21 10:00 PM" and "B | 9/2/21 11:27 AM" - is `5`? Seems like there's a missing piece of information – SOS Mar 29 '22 at 16:47
  • 1
    "Formulation" as I understand it is the treatment of a medication to a patient for a period of time. My first table illustrates the desired result. Medication B is both the second and fifth formulation, and Medication D is both the fourth and sixth formulations. – user18612109 Mar 29 '22 at 17:09

1 Answers1

2

This is a type of "gaps and islands" problem.

One method is to first use lag or lead to check the adjacent row to identify where the data (Medication) changes, followed by summing these values for all preceding rows for each row in the results to generate the desired sequence:

with c as (
    select *, 
      case when 
        Lag(medication) over (partition by encounter order by Administration) = medication 
      then 0 else 1 end Changed
    from t
)
select Encounter, Medication, Administration, Adm_Num,
  Sum(changed) over(partition by Encounter order by Administration) Formulation
from c;
Stu
  • 30,392
  • 6
  • 14
  • 33