
Given a data frame like this one:

      player  draftyear  draftpick  season    team  games  games_start  mins  points
Luke Babbitt       2010       16.0    2010    POR     24            0   137      36
Luke Babbitt       2010       16.0    2011    POR     40            4   537     202
Luke Babbitt       2010       16.0    2012    POR     62            0   730     244
Luke Babbitt       2010       16.0    2013    NOP     27            2   473     170
Luke Babbitt       2010       16.0    2014    NOP     63           19   830     256
Luke Babbitt       2010       16.0    2015    NOP     47           13   845     327
Luke Babbitt       2010       16.0    2016    MIA     68           55  1065     324
Luke Babbitt       2010       16.0    2017    ATL     37            9   570     226
Luke Babbitt       2010       16.0    2017    MIA     13            5   145      33

We see that Luke Babbitt played for two teams in the 2017 season. I would like to condense the stats into a single row for any season in which a player appeared for multiple teams.

The condensed row should preserve player, draftyear, draftpick, and season. The teams he played for should be collected into a list, and games, games_start, mins, and points should each be summed.

The output should look something like this:

      player  draftyear  draftpick  season             team  games  games_start  mins  points
Luke Babbitt       2010       16.0    2010              POR     24            0   137      36
Luke Babbitt       2010       16.0    2011              POR     40            4   537     202
Luke Babbitt       2010       16.0    2012              POR     62            0   730     244
Luke Babbitt       2010       16.0    2013              NOP     27            2   473     170
Luke Babbitt       2010       16.0    2014              NOP     63           19   830     256
Luke Babbitt       2010       16.0    2015              NOP     47           13   845     327
Luke Babbitt       2010       16.0    2016              MIA     68           55  1065     324
Luke Babbitt       2010       16.0    2017   ['ATL', 'MIA']     50           14   715     259

So far the only thing I have is:

df.groupby(['player', 'season'], as_index=False)[['games', 'games_start', 'mins','points']].agg(sum)

However, I am having a hard time figuring out how to do the other two operations in the same call: keeping certain columns unchanged and collecting another column into a list.

  • `df.groupby(['player', 'season'],as_index=False).agg({ 'draftyear' : 'first', 'draftpick' : 'first', 'team' : list, 'games' : 'sum', 'games_start' : 'sum', 'mins' : 'sum', 'points' : 'sum' })` – Nick Aug 28 '23 at 03:34
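For anyone who wants to try the comment's one-liner, here is a minimal runnable sketch. It assumes the data frame is named df and holds exactly the rows shown in the question; the variable name condensed and the final column reordering are my own additions, not part of the comment. Note that with the built-in list as the aggregator, single-team seasons also come back as one-element lists (e.g. ['POR']) rather than bare strings.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "player": ["Luke Babbitt"] * 9,
    "draftyear": [2010] * 9,
    "draftpick": [16.0] * 9,
    "season": [2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2017],
    "team": ["POR", "POR", "POR", "NOP", "NOP", "NOP", "MIA", "ATL", "MIA"],
    "games": [24, 40, 62, 27, 63, 47, 68, 37, 13],
    "games_start": [0, 4, 0, 2, 19, 13, 55, 9, 5],
    "mins": [137, 537, 730, 473, 830, 845, 1065, 570, 145],
    "points": [36, 202, 244, 170, 256, 327, 324, 226, 33],
})

# One aggregation per column: keep the first value for columns that are
# constant within a (player, season) group, collect teams into a list,
# and sum the counting stats.
condensed = df.groupby(["player", "season"], as_index=False).agg({
    "draftyear": "first",
    "draftpick": "first",
    "team": list,
    "games": "sum",
    "games_start": "sum",
    "mins": "sum",
    "points": "sum",
})

# Restore the original column order (the group keys are placed first otherwise).
condensed = condensed[df.columns]

print(condensed.to_string(index=False))
```

If a bare string is preferred for single-team seasons, the team column could be post-processed afterwards, for example by mapping one-element lists to their only item.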
