Python Pandas groupby agg or transform on max value_counts to drop duplicate rows

Question

I have this df, and want to drop duplicates based on the max value counts of 'rating' (its binary field). None of the drop_duplicates with combination of grouby, max, count isn't fecthing the desired output. Any suggestion highly appreciated.

df = pd.DataFrame({
     'brand': ['Yum Yum', 'Yum Yum', 'Yum Yum', 'Indomie', 'Indomie', 'Indomie', 'Cup','Pack'],
     'rating': [1, 1, 0, 1, 0, 0, 1, 0]})

Desired Output :

brand	rating
Yum Yum	1
Indomie	0
Cup	1
Pack	0

tried below but not helping with desired output.

df.groupby(["brand",'rating']).max().reset_index()
df.drop_duplicates(subset=['brand'], keep='last')

score 1 · Accepted Answer · answered May 09 '23 at 14:28

Since the rating is binary, you can do mean and compare to 0.5:

df.groupby('brand', sort=False)['rating'].mean().ge(0.5).astype(int).reset_index()

In general, you can use mode:

df.groupby('brand', sort=False)['rating'].apply(lambda x: x.mode()[0])

Output:

     brand  rating
0  Yum Yum       1
1  Indomie       0
2      Cup       1
3     Pack       0

Python Pandas groupby agg or transform on max value_counts to drop duplicate rows

1 Answers1