I'm a new Python learner, and I can't figure out how to do this.

Let's say I have a data frame like this:

week  category  title  hours_viewed
1     Eng       aaa    100
2     Eng       aaa    95
3     Non-Eng   bbb    105
4     Non-Eng   bbb    100
5     Eng       ccc    80
6     Eng       ccc    115

For each title, I want to select only the row with the most hours_viewed. The result should look like this:

week  category  title  hours_viewed
1     Eng       aaa    100
3     Non-Eng   bbb    105
6     Eng       ccc    115

Thank you in advance.

Chi Wong

1 Answer

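Here is one way to do it.

Using groupby, compute the maximum hours_viewed for each category and title, broadcast it back to every row with transform, and keep only the rows of the original df whose hours_viewed equals that maximum.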

df[df['hours_viewed'].eq(df.groupby(['category', 'title'])['hours_viewed'].transform('max'))]
    week    category    title   hours_viewed
0   1       Eng         aaa     100
2   3       Non-Eng     bbb     105
5   6       Eng         ccc     115
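
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question by hand. The names group_max, top_rows, and top_one are only illustrative, and the idxmax variant is an alternative I am assuming may be useful, not something the question asked for.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "week": [1, 2, 3, 4, 5, 6],
    "category": ["Eng", "Eng", "Non-Eng", "Non-Eng", "Eng", "Eng"],
    "title": ["aaa", "aaa", "bbb", "bbb", "ccc", "ccc"],
    "hours_viewed": [100, 95, 105, 100, 80, 115],
})

# Per-group maximum, broadcast back so it aligns with every original row
group_max = df.groupby(["category", "title"])["hours_viewed"].transform("max")

# Keep rows whose value equals their group's maximum (ties are all kept)
top_rows = df[df["hours_viewed"].eq(group_max)]

# Alternative (assumption): exactly one row per title, the first maximum found
top_one = df.loc[df.groupby("title")["hours_viewed"].idxmax()]

print(top_rows)
```

The transform version keeps every row that ties for the maximum within a group, while the idxmax version returns a single row per title. On this sample data both give the rows for weeks 1, 3, and 6.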
Naveed