Selecting row with highest value based on two different columns

Question

I have a dataframe with 3 columns: I want to make a rule that if for a same city and same id, pick the maximum value and drop the row with lower value.

eg:

City	ID	Value
London	1	12.45
Amsterdam	1	14.56
Paris	1	16.89
New York	1	23.86
Chicago	1	14.56
Chicago	1	20.76

Expected Output : Select highest value for same city and ID. Here Chicago has 2 entries with same ID, I want to select the row with highest value.

Expected Output

City	ID	Value
London	1	12.45
Amsterdam	1	14.56
Paris	1	16.89
New York	1	23.86
Chicago	1	20.76

score 1 · Accepted Answer · answered Jul 03 '23 at 15:59

Sort the dataframe by Value in descending order and drop the duplicates:

df = df.sort_values(by=['Value'], ascending=False).drop_duplicates(['City', 'ID'], keep='first')
print(df)

Prints:

        City  ID  Value
3   New York   1  23.86
5    Chicago   1  20.76
2      Paris   1  16.89
1  Amsterdam   1  14.56
0     London   1  12.45

Selecting row with highest value based on two different columns

1 Answers1