I have some data which I'm trying to clean in order to run further analysis on. One of the columns in the table is called record_type, this can either be NEW or DEL. This means that initially a NEW record might be added but then a DEL record would come in later to say that particular record is now expired (NEW and DEL records would be matched on the record_id). However both the NEW and DEL record would stay in the data, it doesn't get deleted.
So what I had planned to do is to create two CTEs, one for DEL records only and one for NEW records only:
WITH deleted_rec AS(
SELECT *
FROM main_table
WHERE record_type = 'DEL'
)
, new_rec AS(
SELECT *
FROM main_table
WHERE record_type = 'NEW'
)
Then outer join on the record_id column in both the CTEs.
SELECT *
FROM new_rec
FULL OUTER JOIN deleted_rec ON deleted_rec.record_id = new_rec.record_id
The goal would have been for the output to only include records which haven't had a DEL record come in for that record_id so that way I can guarantee that all the type NEW records I have in my final table would not have had a DEL record come in for them at any point and they would therefore all be active. However, I had forgotten that FULL OUTER JOIN return everything rather than just what didn't match so is there a way to get around this to get my desired output?