3

I have a dataframe as below:

        Datetime     Data      Fn
0   18747.385417  11275.0       0
1   18747.388889   8872.0       1
2   18747.392361   7050.0       0
3   18747.395833   8240.0       1
4   18747.399306   5158.0       1
5   18747.402778   3926.0       0
6   18747.406250   4043.0       0
7   18747.409722   2752.0       1
8   18747.420139   3502.0       1
9   18747.423611   4026.0       1

I want to calculate the sum of continuous non-zero values of column Fn.

I want my result dataframe as below:

        Datetime     Data      Fn     Sum
0   18747.385417  11275.0       0       0
1   18747.388889   8872.0       1       1
2   18747.392361   7050.0       0       0
3   18747.395833   8240.0       1       1
4   18747.399306   5158.0       1       2  <<<
5   18747.402778   3926.0       0       0
6   18747.406250   4043.0       0       0
7   18747.409722   2752.0       1       1
8   18747.420139   3502.0       1       2
9   18747.423611   4026.0       1       3
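
For reference, the input frame can be rebuilt with something like the following (only the values come from the tables above; the construction itself is just a sketch):

import pandas as pd

df = pd.DataFrame({
    'Datetime': [18747.385417, 18747.388889, 18747.392361, 18747.395833,
                 18747.399306, 18747.402778, 18747.406250, 18747.409722,
                 18747.420139, 18747.423611],
    'Data': [11275.0, 8872.0, 7050.0, 8240.0, 5158.0,
             3926.0, 4043.0, 2752.0, 3502.0, 4026.0],
    'Fn': [0, 1, 0, 1, 1, 0, 0, 1, 1, 1],
})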
Rohit Lamba K
    Just by reading the title, adding a zero to a sum doesn't change the sum's value. That said, I was able to understand your question after some time. I think the word sequence (seq) is better for what you call sum here. – The Fool May 01 '21 at 04:19

4 Answers

7

You can use groupby() and cumsum():

groups = df.Fn.eq(0).cumsum()
df['Sum'] = df.Fn.ne(0).groupby(groups).cumsum()

Details

First use df.Fn.eq(0).cumsum() to create pseudo-groups of consecutive non-zeros. Each zero will get a new id while consecutive non-zeros will keep the same id:

groups = df.Fn.eq(0).cumsum()

#    groups  Fn (Fn added just for comparison)
# 0       1   0
# 1       1   1
# 2       2   0
# 3       2   1
# 4       2   1
# 5       3   0
# 6       4   0
# 7       4   1
# 8       4   1
# 9       4   1

Then group df.Fn.ne(0) on these pseudo-groups and cumsum() to generate the within-group sequences:

df['Sum'] = df.Fn.ne(0).groupby(groups).cumsum()

#        Datetime     Data  Fn  Sum
# 0  18747.385417  11275.0   0    0
# 1  18747.388889   8872.0   1    1
# 2  18747.392361   7050.0   0    0
# 3  18747.395833   8240.0   1    1
# 4  18747.399306   5158.0   1    2
# 5  18747.402778   3926.0   0    0
# 6  18747.406250   4043.0   0    0
# 7  18747.409722   2752.0   1    1
# 8  18747.420139   3502.0   1    2
# 9  18747.423611   4026.0   1    3
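
If you prefer a single expression, the two steps above can be combined (same calls, just inlined):

df['Sum'] = df.Fn.ne(0).groupby(df.Fn.eq(0).cumsum()).cumsum()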
tdy
3

How about using cumsum and resetting the count when the value is 0:

# Map Fn to booleans so cumsum() counts the non-zero rows
df['Fn2'] = df['Fn'].replace({0: False, 1: True})
# Subtract the running total recorded at the most recent zero so the count resets after every zero
cs = df['Fn2'].cumsum()
df['Fn2'] = (cs - cs.where(df['Fn2'] == False).ffill().fillna(0)).astype(int)
df
0

You can store the Fn column in a list, then build a new list by iterating over it: if the current value is greater than zero, add it to the running count from the previous position, otherwise reset the count to zero. After that you can make a DataFrame from the new list and concat it column-wise to the existing dataframe:

fn = df['Fn'].tolist()
sum_list = [fn[0]]                              # the first value seeds the running count
for i in range(1, len(fn)):
    if fn[i] > 0:
        sum_list.append(sum_list[-1] + fn[i])   # extend the current non-zero run
    else:
        sum_list.append(0)                      # a zero resets the count
dfsum = pd.DataFrame({'Sum': sum_list})
df = pd.concat([df, dfsum], axis=1)

Hope this will help you. There may be small syntax issues to iron out, but this is the idea.

0

try this:

sum_arr = [0]  # seed so sum_arr[-1] is valid on the first iteration

for val in df['Fn']:
    if val > 0:
        sum_arr.append(sum_arr[-1] + 1)  # extend the current non-zero run
    else:
        sum_arr.append(0)                # reset on zero

df['sum'] = sum_arr[1:]  # drop the seed value

df
Gusti Adli