I have a sql data (table) which consists of n
number of columns, but 3 are of particular interest
claim_id | patient_id | admission_date |
---|---|---|
1 | p_1 | 2001-01-01 |
2 | p_2 | 2001-02-01 |
3 | p_1 | 2001-01-01 |
4 | p_3 | 2001-02-02 |
5 | p_2 | 2001-03-01 |
6 | p_2 | 2001-04-01 |
7 | p_1 | 2001-04-01 |
What I want is remove first row (based on admission_date
) of every patient_id
. In case there are two rows for any patient on his/her first date of admission (e.g. patient_id = p_1
having 2 rows 1 & 3
with same dates of admission) then remove all those rows; and get an output like this
claim_id | patient_id | admission_date |
---|---|---|
5 | p_2 | 2001-03-01 |
6 | p_2 | 2001-04-01 |
7 | p_1 | 2001-04-01 |
I am not very well conversant with sql, but I can do that easily in R
using duplicated
. Is something like that, a memory efficient (I have to filter 10 million rows at least) is available in POSTGRESQL?
What I have tried? Basically nothing in sql. If that data was of less size I could have exported it in R and have done my analysis there.
Please help.