
I want to drop the rows selected by a query from a DataFrame and replace them with new data wherever their index labels are next to each other.

Basically, add the values of row i-1 to row i.

Is this possible?

Please see the example below: given the True_data DataFrame, I want to transform it into the desired output.


import pandas as pd 

training_data = pd.DataFrame()

training_data['a'] = [1,1,2,2,2,3,3,3,4,4,5,6,7]
training_data['b'] = [1,1,2,2,2,3,3,3,4,4,5,6,7]

training_data['c'] = [1,1,2,2,2,3,3,3,4,4,5,6,7]
training_data['condition'] = [True,True,False,False,True,True,False,False,True,False,False,False,True]


True_data = training_data[training_data['condition'] == True]


True_data:

index    a    b    c    condition
0        1    1    1    True
1        1    1    1    True
4        2    2    2    True
5        3    3    3    True
8        4    4    4    True
12       7    7    7    True


Desired output:

index    a    b    c    condition
new      2    2    2    True
new      5    5    5    True
8        4    4    4    True
12       7    7    7    True

Every pair of rows that was summed has adjacent index labels (0 and 1, 4 and 5). Rows 8 and 12 have no adjacent True row, so they are kept unchanged.
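To make the adjacency rule concrete, here is a minimal sketch that only labels runs of consecutive index values and does not aggregate anything. The names true_index, run_id and runs are made up for illustration, and the index labels are copied by hand from the True_data table above.

```python
import pandas as pd

# Index labels of the True rows, copied from the True_data table.
true_index = pd.Series([0, 1, 4, 5, 8, 12])

# A new run starts wherever the gap to the previous label is not exactly 1.
# The first diff is NaN, which also counts as "not 1" and opens the first run.
run_id = (true_index.diff() != 1).cumsum()

# Collect the labels that belong to each run.
runs = true_index.groupby(run_id).apply(lambda s: s.tolist()).tolist()
print(runs)  # [[0, 1], [4, 5], [8], [12]]
```

Labels that share a run are the rows to be summed; a run of length one (8, 12) stays as it is.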

Thank you for all the help.

bmaster69

1 Answer


Try:

grp = (~training_data['condition']).cumsum()
training_data.query('condition')\
             .groupby(grp)\
             .agg({'a':'sum','b':'sum','c':'sum','condition':'first'})

Output (apparently computed on an earlier version of the question's data):

           a  b  c  condition
condition                    
0          2  2  2       True
2          5  5  5       True
4          8  8  8       True
6          7  7  7       True

Updated with new data:

training_data['grp'] =  (~training_data['condition']).cumsum()

training_data.query('condition').groupby('grp').agg({'a':'sum','b':'sum','c':'sum','condition':'first'})

Output:

     a  b  c  condition
grp                    
0    2  2  2       True
2    5  5  5       True
4    4  4  4       True
7    7  7  7       True
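
As a rough illustration of why this grouping key works, the sketch below rebuilds only the condition column from the question and prints the key attached to each True row. The variable names condition and true_keys are chosen here for illustration and are not part of the code above.

```python
import pandas as pd

# The 'condition' column from the question's example data.
condition = pd.Series(
    [True, True, False, False, True, True, False, False,
     True, False, False, False, True]
)

# Every False row increments the running count, so True rows that sit
# directly next to each other (no False in between) share the same value.
grp = (~condition).cumsum()

# Keep only the key values that belong to True rows.
true_keys = grp[condition]
print(true_keys.to_dict())
# {0: 0, 1: 0, 4: 2, 5: 2, 8: 4, 12: 7}
```

Rows 0 and 1 share key 0 and rows 4 and 5 share key 2, so each pair is summed, while rows 8 and 12 each get a key of their own and pass through unchanged. These keys match the grp values 0, 2, 4, 7 in the output above.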
Scott Boston