I have a large DataFrame with a key column, in which each key occurs multiple times, and a value column. Here is an example:

index key  value
0     1    346
1     1    349
2     1    351
3     1    353
4     1    355
5     2    359
6     2    359
7     2    360
8     2    365
9     2    365
10    2    366
11    2    369

I identified one value in each key group (for example, key 1: value 349; key 2: value 365) and want to apply a function that retains only the rows in each group from that specific value up to the max value in that group. For example, from group 1 the rows at index 1 to 4 must remain, and from group 2 the rows at index 8 to 11, leaving the following:

index key  value
1     1    349
2     1    351
3     1    353
4     1    355
8     2    365
9     2    365
10    2    366
11    2    369

Thanks for your help in advance!

I tried using groupby.apply, but because the identified value is different in each group, I think I need a for loop or something similar. Generally speaking, I think I have tuples in which each value belongs to a specific key.

I cannot figure out how it works. Please help!!

Scott Boston
Hyphens

1 Answer

You can map each key to its threshold and use boolean indexing to keep the rows whose value is greater than or equal to it:

out = df[df['value'].ge(df['key'].map({1: 349, 2: 365}))]
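
For reference, here is a minimal, self-contained sketch of the same idea run against the example data from the question. The name `thresholds` is only illustrative, and the dict is assumed to hold one starting value per key, as described in the question.

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "key":   [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2],
    "value": [346, 349, 351, 353, 355, 359, 359, 360, 365, 365, 366, 369],
})

# Illustrative name: the starting value identified for each key.
thresholds = {1: 349, 2: 365}

# Look up every row's threshold through its key, then keep the rows
# whose value is greater than or equal to that threshold.
mask = df["value"].ge(df["key"].map(thresholds))
out = df[mask]

print(out)
```

Because the comparison is done per row against that row's own threshold, no groupby or explicit loop is needed. A key that is missing from `thresholds` maps to NaN, and comparing against NaN gives False, so rows for such keys would be dropped.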
mozway