(python) I currently have a pandas dataframe that looks something like this:
player       | year | points |
------------------------------
LeSean McCoy | 2012 | 199.3  |
LeSean McCoy | 2013 | 332.6  |
LeSean McCoy | 2014 | 200.4  |
I'm trying to add a new column to the dataframe that holds a player's previous year points.
I can do a groupby that transforms the dataframe into one row in this example, with each year being its own column. However, I only want one added column, for example:
player       | year | points | prev_year_pts |
----------------------------------------------
LeSean McCoy | 2012 | 199.3  | 0             |
LeSean McCoy | 2013 | 332.6  | 199.3         |
LeSean McCoy | 2014 | 200.4  | 332.6         |
The true dataframe I'm working with has more than 300 unique player names, so I'm looking for a solution that also works when the sample contains more than one player, with a desired output like this:
player              | year | points | prev_year_pts |
-----------------------------------------------------
LeSean McCoy        | 2012 | 199.3  | 0             |
LeSean McCoy        | 2013 | 332.6  | 199.3         |
LeSean McCoy        | 2014 | 200.4  | 332.6         |
Christian McCaffrey | 2017 | 228.6  | 0             |
Christian McCaffrey | 2018 | 385.5  | 228.6         |
Christian McCaffrey | 2019 | 471.2  | 385.5         |
I've been able to add a prev_year column with the following code:
example["prev_year"] = [x-1 for x in example.groupby(["player"])["year"].get_group("LeSean McCoy")]
But I'm stuck on how to get the prev_year_points from that, and how to implement it in a way that calculates it for each player observation ...
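
One possible approach, sketched here and not checked against the full 300-player dataframe: group by player and shift the points column down one row within each group, filling each player's first season with 0 as in the desired output. The frame below is rebuilt from the sample tables. The names prev, strict and prev_year_pts_strict are hypothetical, added only for illustration. Note that shift picks up the previous row for each player, which matches the previous year only if no seasons are missing, so a second variant joins on year + 1 to require the exact prior calendar year.

```python
import pandas as pd

# Sample data rebuilt from the tables in the question.
example = pd.DataFrame({
    "player": ["LeSean McCoy"] * 3 + ["Christian McCaffrey"] * 3,
    "year": [2012, 2013, 2014, 2017, 2018, 2019],
    "points": [199.3, 332.6, 200.4, 228.6, 385.5, 471.2],
})

# Variant 1: previous row per player.
# Sort by year so each group is chronological, shift points down by one
# row within each player, and use 0 where there is no earlier season.
# The result keeps the original index, so assignment aligns row by row
# and the original row order of `example` is preserved.
example["prev_year_pts"] = (
    example.sort_values("year")
    .groupby("player")["points"]
    .shift(1)
    .fillna(0)
)

# Variant 2 (assumption: a gap in seasons should yield 0, not the last
# season played). Build a lookup whose year is moved forward by one, then
# left-merge it back on (player, year).
prev = example[["player", "year", "points"]].rename(
    columns={"points": "prev_year_pts_strict"}  # hypothetical column name
)
prev["year"] = prev["year"] + 1
strict = example.merge(prev, on=["player", "year"], how="left")
strict["prev_year_pts_strict"] = strict["prev_year_pts_strict"].fillna(0)

print(strict)
```

On this sample both variants give the same column, because neither player has a missing season. They would differ only for a player who skipped a year.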