0

I have some football data that I am modifying for analysis. I basically want to calculate career and yearly per game averages on a weekly basis for several stats.

Example

What I have:

Player Year Week Rushing Yards Catches
Seth Johnson 2020 1 100 4
Seth Johnson 2020 2 80 2
Seth Johnson 2021 1 50 3
Seth Johnson 2021 2 50 2

What I want:

Player Year Week Rushing Yards Catches Career Rushing Yards per Game Career Catches per Game Yearly Rushing Yards per Game Yearly Catches per Game
Seth Johnson 2020 1 100 4 100 4 100 4
Seth Johnson 2020 2 80 2 90 3 90 3
Seth Johnson 2021 1 50 3 76.67 3 50 3
Seth Johnson 2021 2 40 2 67.5 2.75 45 2.5

I figure I could calculate the Career stats and Yearly stats separately then just join everything on Player/Year/Week, but I'm not sure how to go about calculating the moving averages given that the window would be dependant on Year and Week.

I've tried things like looping through the desired categories and calculating rolling averages:

new_df['Career ' + category + ' per Game'] = df.groupby('Player')[category].apply(lambda x: x.rolling(3, min_periods=0).mean())

But I'm not finding the creativity necessary to make the appropriate custom window for rolling(). Does anyone have any ideas here?

kh7
  • 1
  • Why is the last row "Career Rushing Yards per Game" 67.5? If it's a 3 period rolling mean, shouldn't it be (80+50+50)/3=60? The other columns also do not seem to be following your 3-period-mean logic. – not_speshal Nov 29 '21 at 20:28
  • Sorry, the example is unclear. The table shows the correct desired output with the career rolling means averaging each previous week. The code sample is just an attempt at a solution and is not related to the tables. – kh7 Nov 29 '21 at 20:34
  • Can you please post a better example where the rolling window actually changes? And also please make the expected output match the inputs. – not_speshal Nov 29 '21 at 20:37

0 Answers0