0

I have this dataframe and I want to find out who secured maximum points in each city.

import pandas as pd 

df = pd.DataFrame({"City":["Delhi","Delhi","Mumbai","Mumbai","Lahore","Lahore"],
"Points":[90.1,90.3,94.1,95,89,90.5],
"Gender":["Male","Female","Female","Male","Female","Male"]})

So far I have tried this, but this is giving me the toppers of both gender[male and female] in each city,

df.groupby(by=["City","Gender"],sort=False)["Points"].max()

Output:

I want a single candidate[male or female] who scored maximum points in each city.

2 Answers2

1

try using a pivot table with only 'City' as the index:

df.pivot_table(values='Points',index=['City'],aggfunc='max')

#Output

        Points
City          
Delhi     90.3
Lahore    90.5
Mumbai    95.0

If you wanna use groupby, feel free to do that but again, remember to use only 'City' as here, the gender does not matter

df.groupby(by=["City"],sort=False)["Points"].max()
#Output
City
Delhi     90.3
Mumbai    95.0
Lahore    90.5
Name: Points, dtype: float64

If you want to see the gender also, use this:

df.groupby('City').apply(lambda x: x.sort_values('Points').iloc[-1].reset_index(drop=True))

Output

    0   1   2
City            
Delhi   Delhi   90.3    Female
Lahore  Lahore  90.5    Male
Mumbai  Mumbai  95.0    Male
Vas
  • 130
  • 10
0

This one will give you a table of the highest scores only

import pandas as pd 

df = pd.DataFrame({
    "City":["Delhi","Delhi","Mumbai","Mumbai","Lahore","Lahore"],
    "Points":[90.1,90.3,94.1,95,89,90.5],
    "Gender":["Male","Female","Female","Male","Female","Male"]
})

df.groupby('City').apply(lambda x: x.sort_values('Points').tail(1)).reset_index(drop=True)

Output:

     City  Points  Gender
0   Delhi    90.3  Female
1  Lahore    90.5    Male
2  Mumbai    95.0    Male
bottledmind
  • 603
  • 3
  • 10