I am new to SQL, so excuse any lapse of notation. A much simplified version of my problem is as follows. I have hospital admissions in table ADMISSIONS and need to collect the most recent outpatient claim of a certain type from table CLAIMS prior to the admission date:
SELECT a.ID , a.date, b.claim_date
FROM admissions as a
LEFT JOIN claims b on (a.ID=b.ID) and (a.date>b.claim_date)
LEFT JOIN claims c on ((a.ID=c.ID) and (a.date>c.claim_date))
and (b.claim_date<c.claim_date or b.claim_date=c.claim_date and b.ID<c.ID)
WHERE c.ID is NULL
The problem is that for some IDs I get many records with duplicate a.date, c.claim_date values.
My problem is similar to one discussed here
SQL join: selecting the last records in a one-to-many relationship
and elaborated on here
SQL Left join: selecting the last records in a one-to-many relationship
However, there is the added wrinkle of looking only for records in CLAIMS that occur prior to a.date and I think that is causing the problem.
Update
Times are not stored, just dates, and since a patient can have multiple records on the same day, it's an issue. There is another wrinkle, which is that I only want to look at a subset of CLAIMS (let's say claims.flag=TRUE). Here's what I tried last:
SELECT a.ID , a.date, b.claim_date
FROM admissions as a
LEFT JOIN (
select d.ID , max(d.claim_date) cdate
from claims as d
where d.flag=TRUE
group by d.ID
) as b on (a.ID=b.ID) and (b.claim_date < a.date)
LEFT JOIN claims c on ((a.ID=c.ID) and (c.claim_date < a.claim_date))
and c.flag=TRUE
and (b.claim_date<c.claim_date or b.claim_date=c.claim_date and b.ID<c.ID)
WHERE c.ID is NULL
However, this ran for a couple of hours before aborting (typically takes about 30 mins with LIMIT 10).