I have the following dataframe, which I am trying to set up for use in a regression analysis:
Date Person Feature1 Feature2 .... Feature100
1/1/2020 Jim 12 15 82
1/7/2020 Jim 1 25 84
1/1/2021 Jim 12 15 85
1/1/2020 Jan 1 35 86
1/7/2020 Jan 5 15 84
1/1/2021 Jan 14 5 82
I have created a list of some of the columns I would like to transform (about 50 columns):
l = ['Feature1','Feature2',......'Feature58']
For each of the column names in l, I would like to create a weighted average of each person's last 20 entries (weighted towards the most recent) and shifted by 1, because I hope to use it as a prediction feature and the current row's value must not be included. The expected output looks like this:
Date Person Feature1 Feature2 .... Feature100 Feature1_Shifted Feature2_Shifted ... Feature58_Shifted
1/1/2020 Jim 12 15 82 N/A N/A
1/7/2020 Jim 1 25 84 12 15
1/1/2021 Jim 12 15 85 6.5 20
1/1/2020 Jan 1 35 86 N/A N/A
1/7/2020 Jan 5 15 84 1 35
1/1/2021 Jan 14 5 82 3 25
This is loosely based on this question: Most Pythonic Way to Create Many New Columns in Pandas
For the weighted average, I was looking at this answer: https://stackoverflow.com/a/57602282/13194245, ideally with the option of changing the weightings.
I am struggling to combine everything into one solution and am not sure where to start, so any help would be greatly appreciated. Thanks!
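
One possible starting point, offered as a sketch rather than a definitive solution: group by Person, take a rolling window over each column in l, apply a weighted mean, and then shift the result by one row within each person. The helper names (weighted_mean, add_shifted_weighted_means), the default linearly increasing weights, the assumption that Date is already a datetime column, the assumed month/day/year date format in the sample data, and the assumption that the index is unique are all illustrative choices and not part of the original setup.

```python
import numpy as np
import pandas as pd


def weighted_mean(values, weights):
    """Weighted mean of one window; the most recent value gets the last weight."""
    w = weights[-len(values):]      # early, shorter windows use the tail of the weights
    keep = ~np.isnan(values)        # ignore missing observations inside the window
    return np.dot(values[keep], w[keep]) / w[keep].sum()


def add_shifted_weighted_means(df, cols, window=20, weights=None,
                               group_col="Person", date_col="Date"):
    """Add a '<col>_Shifted' column for every column in cols (hypothetical helper)."""
    if weights is None:
        # Assumption: linearly increasing weights, heaviest on the newest entry.
        weights = np.arange(1, window + 1)
    weights = np.asarray(weights, dtype=float)
    if len(weights) != window:
        raise ValueError("weights must contain exactly `window` values")

    # Order chronologically so each rolling window sees a person's entries in time order.
    ordered = df.sort_values(date_col, kind="stable")
    grouped = ordered.groupby(group_col, sort=False)

    new_cols = {
        f"{col}_Shifted": grouped[col].transform(
            lambda s: s.rolling(window, min_periods=1)
                       .apply(weighted_mean, raw=True, args=(weights,))
                       .shift(1)    # exclude the current row from its own feature
        )
        for col in cols
    }
    # Join on the index, which restores the original row order of df.
    return df.join(pd.DataFrame(new_cols))


# Small sample mirroring the data above (date format is an assumption).
df = pd.DataFrame({
    "Date": pd.to_datetime(["1/1/2020", "1/7/2020", "1/1/2021"] * 2, format="%m/%d/%Y"),
    "Person": ["Jim"] * 3 + ["Jan"] * 3,
    "Feature1": [12, 1, 12, 1, 5, 14],
    "Feature2": [15, 25, 15, 35, 15, 5],
})
l = ["Feature1", "Feature2"]

result = add_shifted_weighted_means(df, l)                      # linear weights
equal = add_shifted_weighted_means(df, l, weights=np.ones(20))  # plain rolling mean
```

With equal weights this reduces to a plain rolling mean and reproduces the values in the expected output above (for example 6.5 and 20 for Jim's third row). Passing a different array of 20 values as weights changes the weighting without touching the rest of the code. Building all new columns in a dictionary and joining once avoids inserting roughly 50 columns one at a time.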