How to remove pandas dataframe records by key and subject to condition

Question

I have a pandas dataframe that I have created as follows:

import pandas as pd


ds1 = {'col1':[1,1,1,1,1,1,1, 2,2,2,2,2,2,2], "col2" : [1,1,0,1,1,1,1,1,1,0,1,1,1,1]}

df1 = pd.DataFrame(data=ds1)
print(df1)

    col1  col2
0      1     1
1      1     1
2      1     0
3      1     1
4      1     1
5      1     1
6      1     1
7      2     1
8      2     1
9      2     0
10     2     1
11     2     1
12     2     1
13     2     1

As soon as col2 is equal to 0, I want to remove all of the subsequent records BY the same col1 values. In this case, the resulting dataframe would look like this:

    col1  col2
0      1     1
1      1     1
2      1     0
7      2     1
8      2     1
9      2     0

PaulS · Accepted Answer · 2023-04-27T14:51:21.900

1

A possible solution:

(df1.groupby(['col1'], group_keys=False)
 .apply(lambda g: g[~g['col2'].eq(0).cummax().shift(fill_value=False)]))

Alternatively:

g = df1.groupby(['col1'])

pd.concat([x[1][~x[1]['col2'].eq(0).cummax().shift(fill_value=False)] for x in g])

Output:

   col1  col2
0     1     1
1     1     1
2     1     0
7     2     1
8     2     1
9     2     0

edited Apr 27 '23 at 14:51

answered Apr 27 '23 at 14:45

PaulS

21,159
2
9
26

How to remove pandas dataframe records by key and subject to condition

1 Answers1