4

I am trying to calculate some statistics from a pandas dataframe. It looks something like this:

id     value     conditional
1      10        0
2      20        0
3      30        1
1      15        1
3      5         0
1      10        1

So, I need to calculate the cumulative sum of the column value for each id from top to botom, but only when conditional is 1.

So, this should give me something like:

id     value     conditional   cumulative sum
1      10        0             0
2      20        0             0
3      30        1             30
1      15        1             15
3      5         0             30
1      10        1             25

So, the sum of id=1 is taken only when conditional=1 in the 4th and 6th row and the 1st row value is not counted. How do I do this in pandas?

sfactor
  • 12,592
  • 32
  • 102
  • 152

1 Answers1

12

You can create a Series that is the multiplication of value and conditional, and take the cumulative sum of it for each id group:

df['cumsum'] = (df['value']*df['conditional']).groupby(df['id']).cumsum()
df
Out: 
   id  value  conditional  cumsum
0   1     10            0       0
1   2     20            0       0
2   3     30            1      30
3   1     15            1      15
4   3      5            0      30
5   1     10            1      25
ayhan
  • 70,170
  • 20
  • 182
  • 203