Python CSV check fields with same id in column if value is null in another column

Question

I want to check all fields with a specific ID and delete the rows if their values are null in another column.

for example in the below table, I'm checking the Id (84) fields in patientId column, if all fields are null in pathology column, then i should delete this rows.

Thanks!

you could use `groupby("patientId")` and in every group check if there are values different then `null` — furas, May 26 '22 at 10:50

score 1 · Answer 1 · answered May 26 '22 at 10:27

df = pd.DataFrame({
    'id': [1,1,2,2],
    'value': [1,None,2,None]
})
df = df[~((df['id'] == 1) & df['value'].isna())]
print(df.to_markdown(index=False))
|   id |   value |
|-----:|--------:|
|    1 |       1 |
|    2 |       2 |
|    2 |     nan |

score 0 · Accepted Answer · answered May 26 '22 at 13:06

I find duplicate strings where there is null. I get labels by comparing the main data frame([False False True False False True False False True]). Apply masking using ~, that is, I show rows with opposite indexes.

import pandas as pd

df = pd.DataFrame({'patientld': [89, 84, 84, 9, 9, 84, 5, 3, 84],
                   'pathology': ['null', 'null', 'null', 'yes', 'null', 'null', 'yes', 'yes', 'null']})

a = df[(df.duplicated()) & (df['pathology'].isin(['null']))]
index = df.index.isin(a.index)
print(df[~index])

Output

   patientld pathology
0         89      null
1         84      null
3          9       yes
4          9      null
6          5       yes
7          3       yes

Python CSV check fields with same id in column if value is null in another column

2 Answers2