0

I am new to SQL, so excuse any lapse of notation. A much simplified version of my problem is as follows. I have hospital admissions in table ADMISSIONS and need to collect the most recent outpatient claim of a certain type from table CLAIMS prior to the admission date:

SELECT a.ID , a.date, b.claim_date
FROM admissions as a
LEFT JOIN claims b on  (a.ID=b.ID) and (a.date>b.claim_date) 
LEFT JOIN claims c on ((a.ID=c.ID) and (a.date>c.claim_date))
     and (b.claim_date<c.claim_date or b.claim_date=c.claim_date and b.ID<c.ID)
WHERE c.ID is NULL

The problem is that for some IDs I get many records with duplicate a.date, c.claim_date values.

My problem is similar to one discussed here

SQL join: selecting the last records in a one-to-many relationship

and elaborated on here

SQL Left join: selecting the last records in a one-to-many relationship

However, there is the added wrinkle of looking only for records in CLAIMS that occur prior to a.date and I think that is causing the problem.

Update

Times are not stored, just dates, and since a patient can have multiple records on the same day, it's an issue. There is another wrinkle, which is that I only want to look at a subset of CLAIMS (let's say claims.flag=TRUE). Here's what I tried last:

SELECT a.ID , a.date, b.claim_date
FROM admissions as a
LEFT JOIN (
       select d.ID , max(d.claim_date) cdate
       from claims as d
       where d.flag=TRUE
       group by d.ID
       ) as b on  (a.ID=b.ID) and (b.claim_date < a.date) 
LEFT JOIN claims c on ((a.ID=c.ID) and (c.claim_date < a.claim_date))
     and c.flag=TRUE
     and (b.claim_date<c.claim_date or b.claim_date=c.claim_date and b.ID<c.ID)
WHERE c.ID is NULL

However, this ran for a couple of hours before aborting (typically takes about 30 mins with LIMIT 10).

Community
  • 1
  • 1
Jeph
  • 1
  • 2

1 Answers1

1

You may want to try using a subquery to solve this problem:

SELECT a.ID, a.date, b.claim_date
  FROM admissions as a
  LEFT JOIN claims b ON (a.ID = b.ID)
  WHERE b.claim_date = (
    SELECT MAX(c.claim_date) 
      FROM claims c 
      WHERE c.id = a.id -- Assuming that c.id is a foreign key to a.id
        AND c.claim_date < a.date  -- Claim date is less than admission date
  );

An attempt to clarify with different IDs, and using an additional subquery to account for duplicate dates:

SELECT a.ID, a.patient_id, a.date, b.claim_id, b.claim_date
  FROM admissions as a
  LEFT JOIN claims b ON (a.patient_ID = b.patient_ID)
  WHERE b.claim_id = (
    SELECT MAX(c.claim_id)  -- Max claim identifier (likely most recent if sequential)
      FROM claims c 
      WHERE c.patient_ID = a.patient_ID
                    AND c.flag = TRUE
                    AND c.claim_date = (
                        SELECT MAX(d.claim_date) 
                            FROM claims d
                            WHERE d.patient_id = c.patient_id 
                                AND c.claim_date < a.date  -- Claim date is less than admission date
                                AND d.flag = TRUE
                    )
  )
        b.flag = TRUE;
MeyerRJ
  • 792
  • 4
  • 8
  • This looks promising, but I get an error that a.ID doesn't exist.I think because c.id is not a foreign key? – Jeph Oct 03 '13 at 21:27
  • Also, there are typically multiple claims on the same date, and I think this would result in duplicates. I just the last date. – Jeph Oct 03 '13 at 21:39
  • Without an accurate description of the database tables and their relationships, it would be difficult to write a query with the proper joins/relationships. Any chance you can provide more details for the admissions/claims tables? – MeyerRJ Oct 03 '13 at 21:40
  • Regarding the claim_date and duplicates, do you know if it is storing just the date, or is time information stored as well? If there are multiple claims and they're all stored at midnight/zero-hour then yes, duplicates will be an issue. If there is an additional sequence number that can be used, you could add another subquery to select the max sequence for the date. To check check if times are stored: SELECT TO_CHAR('YYYY-MM-DD HH24:MI:SS', claim_date) FROM claims; – MeyerRJ Oct 03 '13 at 22:07
  • Thanks for the help. I added some more info, and another attempt with a little more detail about the problem. The actual query is about 50 lines, but it's only when I added the last JOIN that the behaviour stopped making sense. – Jeph Oct 04 '13 at 00:49
  • The joining of the ID columns is quite confusing in these queries. I've updated the query and I've made a distinction between a possible use of patient_ids and claim_ids. – MeyerRJ Oct 04 '13 at 04:14