Take the following df
:
df = pd.DataFrame({'col_1':['A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C'],
'val_1':[10, 20, 15, 25, 20, 15, 20, 30, 40],
'val_2':[-5, 0, 5, 10, 5, 10, 15, 10, 5]})
col_1 val_1 val_2
0 A 10 -5
1 B 20 0
2 C 15 5
3 A 25 10
4 B 20 5
5 C 15 10
6 A 20 15
7 B 30 10
8 C 40 5
I need to create a new column with the inverted sign of val_2
, if col_1 == 'A'
and if val_2 > 0
. This is straightforward:
df['new_col'] = df.apply(lambda x: -x.val_2 if x.col_1 == 'A' and x.val_2 > 0 else 0, axis=1)
col_1 val_1 val_2 new_col
0 A 10 -5 0
1 B 20 0 0
2 C 15 5 0
3 A 25 10 -10
4 B 20 5 0
5 C 15 10 0
6 A 20 15 -15
7 B 30 10 0
8 C 40 5 0
But my final goal is just to have the incremental value in new_col
, that is, the value of row(n) minus the cumulative sum of above rows.
The resulting df
should look like this:
col_1 val_1 val_2 new_col
0 A 10 -5 0
1 B 20 0 0
2 C 15 5 0
3 A 25 10 -10
4 B 20 5 0
5 C 15 10 0
6 A 20 15 -5 --> -15 - (-10) --> (-10 is the cumsum of previous rows)
7 B 30 10 0
8 C 40 5 0