0

I have a simple table that is used to record events against specific visits:

Describe Histories;
    +------------------+
    | Field            |
    +------------------+
    | HistoryId        |
    | VisitId          |
    | Location         |
    | Event            |
    | EventTime        |
    +------------------+

Individuals are associated with Visits (VisitId). For each Visit, an individual may have multiple History records. Events can be Admission, Transfer or Discharge.

I am trying to write a Query to calculate the duration in each Location for each individual. Note, that they may visit a Location multiple times for each Visit. An individual enters a Location with an Admission or Transfer Event and leaves with a Discharge or Transfer.

If an individual enters Location 'A', their Admission or Transfer record will list Location 'A', however if they transfer out their transfer out (or discharge) will list another Location, say 'B'.

I therefore have to find the interval between the transfer into Location 'A' and the subsequent (in time) transfer to Location 'B'. Intra location transfers are not evaluated.

I understand that the solution will probably be based on an INNER JOIN, however I am at a loss to understand how to select for the transfer "out" record that corresponds to the most recent transfer "in".

I guess that this is reasonably complex - I hope I have been clear enough in my explanation.

Any guidance greatly appreciated.

Andriy M
  • 76,112
  • 17
  • 94
  • 154
skyman
  • 2,255
  • 4
  • 32
  • 55

2 Answers2

1

Assuming a transfer or discharge is a unique event you could write like so

SELECT
   b.EventTime - a.EventTime        
FROM
   Histories a
   INNER JOIN Histories b
   ON a.VisitID = b.VisitID
WHERE
   a.event = 'Admission'
   and
   b.event in ('Transfer', 'Discharge')

If you were interested in the last transfer or discharge you would write

SELECT
   b.EventTime - a.EventTime        
FROM
   Histories a
   INNER JOIN  Histories b
    ON a.VisitID = b.VisitID

   INNER JOIN 
   (SELECT
         VisitId, 
         MAX(HistoryID) HistoryID
    FROM Histories 
    WHERE 
       b.event in ('Transfer', 'Discharge')
    GROUP BY 
       VisitId) maxHistory
   ON b.HistoryID = maxHistoryId.HistoryId

WHERE
   a.event = 'Admission'

However if a Visit can result in multiple visits as Andriy M mentions you have a Gaps And islands problem (specifically the islands)

In that case you want the following

SELECT  
       a.VisitId,
       a.Event a_Event, 
       a.Event b_Event, 
       a.EventTime a_EventTime,
       b.EventTime b_EventTime,
       b_EventTime - a_EventTime

FROM   histories a 
       INNER JOIN histories B 
         ON a.visitID = b.visitID 
            AND a.EventTime < b.eventTime 
       INNER JOIN (SELECT a.VisitId, 
                          a.EventTime      a_EventTime, 
                          Min(b.EventTime) b_EventTime 
                   FROM   histories a 
                          INNER JOIN histories B 
                            ON a.visitID = b.visitID 
                               AND a.EventTime < b.eventTime 
                   GROUP  BY a_EventTime, 
                             a.VisitId) MinTime 
         ON a.VisitID = MinTime.VisitID 
            AND a.EventTime = a_EventTime 
            AND b.EventTime = b_EventTime 

DEMO

Using the following sample data

CREATE TABLE Histories 
    (
     HistoryId int auto_increment primary key, 
     VisitId int,
     Location varchar(20),
     Event varchar(20), 
     EventTime datetime
    );

INSERT INTO Histories
(VisitId, Location, Event, EventTime)
VALUES
(1, 'A', 'Admission', '2012-01-01'),
(1, 'A', 'Discharge', '2012-01-03'),
(2, 'B', 'Admission', '2012-01-02'),
(2, 'C', 'Transfer', '2012-01-05'),
(2, 'C', 'Discharge', '2012-01-06'),
(3, 'D', 'Admission', '2012-01-06'),
(3, 'E', 'Transfer', '2012-01-07'),
(3, 'F', 'Transfer', '2012-01-08'),
(3, 'F', 'Discharge', '2012-01-10');

You get the following results

VISITID    A_EVENT   B_EVENT    A_EVENTTIME                     B_EVENTTIME                     B_EVENTTIME - A_EVENTTIME
1          Admission Discharge  January, 01 2012 00:00:00-0800  January, 03 2012 00:00:00-0800  2000000
2          Admission Transfer   January, 02 2012 00:00:00-0800  January, 05 2012 00:00:00-0800  3000000
2          Transfer  Discharge  January, 05 2012 00:00:00-0800  January, 06 2012 00:00:00-0800  1000000
3          Admission Transfer   January, 06 2012 00:00:00-0800  January, 07 2012 00:00:00-0800  1000000
3          Transfer  Transfer   January, 07 2012 00:00:00-0800  January, 08 2012 00:00:00-0800  1000000
3          Transfer  Discharge  January, 08 2012 00:00:00-0800  January, 10 2012 00:00:00-0800  2000000

Notes:

  • This assumes you don't care about Admissions/Transger that don't have a corresponding discharge/transfer yet.
  • If you know that eventTime doesn't change after the record is entered you could use historyID instead of eventime to determine the order of events.
  • You know how to get the Event Time difference in the format that you like
Conrad Frix
  • 51,984
  • 12
  • 96
  • 155
  • Thank you for this. The first example was very helpful, although the second is more towards the structure I am looking for. I think there are still two areas where I am struggling. The main one is that I need to associate the an Admission or transfer into a location with the next transfer or discharge out rather than the most recent in your example. A single individual could have multiple 'in/out' events for a single location. – skyman May 07 '12 at 06:13
  • @bugy: Can there be multiple transfers to the same location for the same `VisitId`? – Andriy M May 07 '12 at 07:03
  • Yes. An individual may visit a 'Clinic'. Visiting the Clinic is associated with a VisitId. However, within the visit, they may be moved from one room to another and possibly back again to one they have already been in. Each time they move in to a room a History record is created (Admission or transfer). Each time they leave a history record is created (transfer or discharge). The problem is to determine the length of stay for a specific room. This comes down to the difference in time between an 'in' event and the subsequent 'out' event. – skyman May 07 '12 at 07:26
  • @bugy: I added the `gaps-and-islands` tag to your post, because that's the kind of problem your question seems to boil down to. There are not many questions on SO related to the same problem in the same SQL product, still you might want to [have a look at them](http://stackoverflow.com/questions/tagged/gaps-and-islands+mysql), all the more so since some of them contain accepted answers. – Andriy M May 07 '12 at 09:51
  • @AndriyM I was hoping that this wasn't a g&i problem when I wrote my initial answer. Ah well so much for that – Conrad Frix May 07 '12 at 15:11
  • @Conrad: Thank you for the time you have spent to assist with this. It is exactly the solution I was looking for. The only change I have made thus far is to add a.Location != b.Location to filter for intra-location moves. The EventTime does not change once the record is created so I will also try using the HistoryId. – skyman May 08 '12 at 08:32
1

How does this work for you?

SELECT 
    h1.HistoryId, 
    h1.VisitId, 
    h1.Event AS InitialEvent, 
    h2.Event AS FinalEvent, 
    h1.Location AS StartLocation,
    h2.Location AS EndLocation,
    IF(h2.HistoryId, UNIX_TIMESTAMP(h2.EventTime) - UNIX_TIMESTAMP(h1.EventTime), NULL) AS transfer_duration_seconds
FROM Histories h1 
LEFT JOIN Histories h2 ON h1.VisitId = h2.VisitId AND h1.Location != h2.location AND h2. EventTime > h1. EventTime
GROUP BY h1.VisitId
Bryan
  • 6,682
  • 2
  • 17
  • 21
  • Thank you Bryan - this also makes sense. However, as with COnrad I am struggling to relate the InitialEvent with the SubsequentEvent - there could be multiple pairings for each VisitId. – skyman May 07 '12 at 06:17
  • I'm slightly confused, my query ought to grab each event from history, and join on the next consecutive Event according to the EventTime. Is it doing something other than that? – Bryan May 07 '12 at 12:27