Only keep certain rows in group from specific value to max in that group

Question

I have a large DataFrame in which each key occurs multiple times, each time with a value. Here is an example:

index key  value
0     1    346
1     1    349
2     1    351
3     1    353
4     1    355
5     2    359
6     2    359
7     2    360
8     2    365
9     2    365
10    2    366
11    2    369

I identified one value in each key group (for example, key 1: value 349; key 2: value 365) and want to apply a function that keeps only the rows in each group from that value up to the maximum value in that group. For example, the rows at index 1 to 4 should remain from group 1 and the rows at index 8 to 11 from group 2, leaving the following:

index key  value
1     1    349
2     1    351
3     1    353
4     1    355
8     2    365
9     2    365
10    2    366
11    2    369

Thanks for your help in advance!

I tried using groupby.apply, but since the identified value is different for each group, I think I need a for loop or something similar. Generally speaking, I have something like a tuple in which each value belongs to a specific key.

I cannot figure out how it works. Please help!!

score 3 · Answer 1 · answered Nov 12 '22 at 17:35

You can map each key to its threshold and use boolean indexing to keep the rows whose value is greater than or equal to that threshold:

out = df[df['value'].ge(df['key'].map({1: 349, 2: 365}))]
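
For reference, here is the same idea as a self-contained sketch run on the example data from the question. The name `thresholds` is only illustrative, and the sketch assumes the values are sorted in ascending order within each key (as they are in the example), so that "from the identified value to the group maximum" is the same as "greater than or equal to the identified value":

```python
import pandas as pd

# Example data from the question
df = pd.DataFrame({
    "key":   [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2],
    "value": [346, 349, 351, 353, 355, 359, 359, 360, 365, 365, 366, 369],
})

# Illustrative name: one identified starting value per key
thresholds = {1: 349, 2: 365}

# Look up the threshold for every row via its key (same length and index as df)
start = df["key"].map(thresholds)

# Keep only the rows whose value is at or above the threshold of their key
out = df[df["value"].ge(start)]
print(out)
```

A key that is missing from `thresholds` maps to NaN, and a comparison against NaN is False, so all rows of such a key would be dropped. If the values were not sorted within a group, this would still keep every row at or above the threshold, which may or may not be what you want.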

mozway
