Within a hospital encounter, a patient may be administered several different formulations of a medication, as shown here:
Encounter | Medication | Administration | Adm_Num |
---|---|---|---|
1 | A | 8/31/21 11:33 AM | 1 |
1 | B | 8/31/21 6:25 PM | 2 |
1 | C | 9/1/21 8:55 AM | 3 |
1 | D | 9/1/21 10:00 PM | 4 |
1 | B | 9/2/21 11:27 AM | 5 |
1 | B | 9/2/21 10:00 PM | 6 |
1 | B | 9/3/21 6:15 AM | 7 |
1 | B | 9/3/21 3:30 PM | 8 |
1 | D | 9/3/21 8:30 PM | 9 |
The task: For each encounter, I need to enumerate each formulation of the medication, like this:
Encounter | Medication | Administration | Adm_Num | Formulation |
---|---|---|---|---|
1 | A | 8/31/21 11:33 AM | 1 | 1 |
1 | B | 8/31/21 6:25 PM | 2 | 2 |
1 | C | 9/1/21 8:55 AM | 3 | 3 |
1 | D | 9/1/21 10:00 PM | 4 | 4 |
1 | B | 9/2/21 11:27 AM | 5 | 5 |
1 | B | 9/2/21 10:00 PM | 6 | 5 |
1 | B | 9/3/21 6:15 AM | 7 | 5 |
1 | B | 9/3/21 3:30 PM | 8 | 5 |
1 | D | 9/3/21 8:30 PM | 9 | 6 |
I need to preserve the administration-level granularity in order to report on those data (and I'm in good shape there).
One strategy I tried was a CTE to find a medication-level value for each medication, such as the first administration instant, then apply that to the more granular administration-level data, like this:
WITH f as (
SELECT a.Encounter, a.Medication, MIN(Administration) as First_Adm,
DENSE_RANK() OVER (PARTITION Encounter, Medication ORDER BY Administration) as Formulation
FROM a
GROUP BY a.Encounter, a.Medication
)
SELECT a.Encounter, a.Medication, a.Administration, a.Adm_Num, f.[First_Adm], f.Formulation
FROM a
INNER JOIN f ON (a.Encounter = f.Encounter AND a.Medication = f.Medication)
GROUP BY a.Encounter, a.Medication, a.Administration
ORDER BY a.Encounter, a.Medication, a.Administration
That worked well for the first few medications with single administrations, but it mishandled the reappearance of Medications B and D later in the encounter; it recognized Medication B from before, gave it the earlier MIN(Administration), and mislabeled it Formulation 2, as shown here:
Encounter | Medication | Administration | Adm_Num | First_Adm | Formulation |
---|---|---|---|---|---|
1 | A | 8/31/21 11:33 AM | 1 | 8/31/21 11:33 AM | 1 |
1 | B | 8/31/21 6:25 PM | 2 | 8/31/21 6:25 PM | 2 |
1 | C | 9/1/21 8:55 AM | 3 | 9/1/21 8:55 AM | 3 |
1 | D | 9/1/21 10:00 PM | 4 | 9/1/21 10:00 PM | 4 |
1 | B | 9/2/21 11:27 AM | 5 | 8/31/21 6:25 PM | 2 |
1 | B | 9/2/21 10:00 PM | 6 | 8/31/21 6:25 PM | 2 |
1 | B | 9/3/21 6:15 AM | 7 | 8/31/21 6:25 PM | 2 |
1 | B | 9/3/21 3:30 PM | 8 | 8/31/21 6:25 PM | 2 |
1 | D | 9/3/21 8:30 PM | 9 | 9/1/21 10:00 PM | 4 |
This is close, but not quite there. The remaining challenge is to account for repeated formulations.
How can I group the administration-level data by medication formulation groups, order the formulations chronologically (making use of the time data), and assign a sequential value to each formulation, even if some are repeated?
Thanks in advance for your help.