0

I want to check all fields with a specific ID and delete the rows if their values are null in another column.

for example in the below table, I'm checking the Id (84) fields in patientId column, if all fields are null in pathology column, then i should delete this rows.

enter image description here

Thanks!

Dávid Pásztor
  • 51,403
  • 9
  • 85
  • 116
Reda Khalaf
  • 115
  • 2
  • 10
  • you could use `groupby("patientId")` and in every group check if there are values different then `null` – furas May 26 '22 at 10:50

2 Answers2

1
df = pd.DataFrame({
    'id': [1,1,2,2],
    'value': [1,None,2,None]
})
df = df[~((df['id'] == 1) & df['value'].isna())]
print(df.to_markdown(index=False))
|   id |   value |
|-----:|--------:|
|    1 |       1 |
|    2 |       2 |
|    2 |     nan |
ThSorn
  • 507
  • 4
  • 7
0

I find duplicate strings where there is null. I get labels by comparing the main data frame([False False True False False True False False True]). Apply masking using ~, that is, I show rows with opposite indexes.

import pandas as pd

df = pd.DataFrame({'patientld': [89, 84, 84, 9, 9, 84, 5, 3, 84],
                   'pathology': ['null', 'null', 'null', 'yes', 'null', 'null', 'yes', 'yes', 'null']})

a = df[(df.duplicated()) & (df['pathology'].isin(['null']))]
index = df.index.isin(a.index)
print(df[~index])

Output

   patientld pathology
0         89      null
1         84      null
3          9       yes
4          9      null
6          5       yes
7          3       yes
inquirer
  • 4,286
  • 2
  • 9
  • 16