1

I have a dataframe with 3 columns: I want to make a rule that if for a same city and same id, pick the maximum value and drop the row with lower value.

eg:

City ID Value
London 1 12.45
Amsterdam 1 14.56
Paris 1 16.89
New York 1 23.86
Chicago 1 14.56
Chicago 1 20.76

Expected Output : Select highest value for same city and ID. Here Chicago has 2 entries with same ID, I want to select the row with highest value.

Expected Output

City ID Value
London 1 12.45
Amsterdam 1 14.56
Paris 1 16.89
New York 1 23.86
Chicago 1 20.76
Andrej Kesely
  • 168,389
  • 15
  • 48
  • 91

1 Answers1

1

Sort the dataframe by Value in descending order and drop the duplicates:

df = df.sort_values(by=['Value'], ascending=False).drop_duplicates(['City', 'ID'], keep='first')
print(df)

Prints:

        City  ID  Value
3   New York   1  23.86
5    Chicago   1  20.76
2      Paris   1  16.89
1  Amsterdam   1  14.56
0     London   1  12.45
Andrej Kesely
  • 168,389
  • 15
  • 48
  • 91