I have the following code that, for a sorted pandas DataFrame, groups by one column and creates two new columns: one built from the previous 4 rows plus the current row in the group, and one built from the next row in the group.
import pandas as pd

data_test = {'nr': [1, 1, 1, 1, 1, 6, 6, 6, 6, 6, 6, 6],
             'val': [11, 12, 13, 14, 15, 61, 62, 63, 64, 65, 66, 67]}
df_test = pd.DataFrame(data_test, columns=['nr', 'val'])
print(df_test)
This gives the following frame:
    nr  val
0    1   11
1    1   12
2    1   13
3    1   14
4    1   15
5    6   61
6    6   62
7    6   63
8    6   64
9    6   65
10   6   66
11   6   67
Now I have the following code, which groups by 'nr' and builds one column containing, for each row, the previous 4 values of 'val' in the group followed by the current value. Similarly, it builds one extra column containing, for each row, the next value of 'val' in the group. Missing values at the group boundaries are filled with 0.
df_test['past4'] = df_test.groupby(['nr'])['val'].transform(lambda x: x.shift(4).fillna(0))
df_test['past3'] = df_test.groupby(['nr'])['val'].transform(lambda x: x.shift(3).fillna(0))
df_test['past2'] = df_test.groupby(['nr'])['val'].transform(lambda x: x.shift(2).fillna(0))
df_test['past1'] = df_test.groupby(['nr'])['val'].transform(lambda x: x.shift(1).fillna(0))
df_test['future'] = df_test.groupby(['nr'])['val'].transform(lambda x: x.shift(-1).fillna(0))
df_test['amounts'] = df_test[['past4', 'past3','past2','past1','val']].values.tolist()
df_test.drop(columns = ['past4', 'past3', 'past2', 'past1'], inplace = True)
df_test
    nr  val  future               amounts
0    1   11      12      [0, 0, 0, 0, 11]
1    1   12      13     [0, 0, 0, 11, 12]
2    1   13      14    [0, 0, 11, 12, 13]
3    1   14      15   [0, 11, 12, 13, 14]
4    1   15       0  [11, 12, 13, 14, 15]
5    6   61      62      [0, 0, 0, 0, 61]
6    6   62      63     [0, 0, 0, 61, 62]
7    6   63      64    [0, 0, 61, 62, 63]
8    6   64      65   [0, 61, 62, 63, 64]
9    6   65      66  [61, 62, 63, 64, 65]
10   6   66      67  [62, 63, 64, 65, 66]
11   6   67       0  [63, 64, 65, 66, 67]
I'm sure there must be a simpler way to build the single list column 'amounts', probably a one-liner, without creating and dropping the intermediate 'past' columns. How can I do this?
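For illustration, one direction this could take is to generate the shifted series in a comprehension and concatenate them, so no temporary columns are needed. This is only a sketch, not a confirmed best answer: it assumes a pandas version where groupby `shift` accepts `fill_value`, and the names `grouped` and `window` are made up for the example.

```python
import pandas as pd

df_test = pd.DataFrame({
    'nr':  [1, 1, 1, 1, 1, 6, 6, 6, 6, 6, 6, 6],
    'val': [11, 12, 13, 14, 15, 61, 62, 63, 64, 65, 66, 67],
})

# Hypothetical helper name: the grouped 'val' series, reused for every shift.
grouped = df_test.groupby('nr')['val']

# Shifts of 4, 3, 2, 1 within each group (gaps filled with 0),
# followed by the unshifted current value as the last column.
window = pd.concat(
    [grouped.shift(k, fill_value=0) for k in range(4, 0, -1)] + [df_test['val']],
    axis=1,
)

# Each row of the 5-column block becomes one list in 'amounts'.
df_test['amounts'] = window.to_numpy().tolist()

# The next value within the group, 0 where there is none.
df_test['future'] = grouped.shift(-1, fill_value=0)

print(df_test)
```

The `pd.concat(...)` expression can be inlined into the assignment to get a single statement; it is kept separate here only for readability. Changing the `range` bounds changes how many past values are collected.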