Here is my input:
import pandas as pd
import numpy as np
list1 = [10,79,6,38,4,557,12,220,46,22,45,22]
list2 = [4,3,23,6,234,47,312,2,426,42,435,23]
df = pd.DataFrame({'A' : list1, 'B' : list2}, columns = ['A', 'B'])
df['C'] = np.where (df['A'] > df['B'].shift(-2), 1, np.nan)
print (df)
that produces this output:
A B C
0 10 4 NaN
1 79 3 1.0
2 6 23 NaN
3 38 6 NaN
4 4 234 NaN
5 557 47 1.0
6 12 312 NaN
7 220 2 1.0
8 46 426 NaN
9 22 42 NaN
10 45 435 NaN
11 22 23 NaN
What I need to do is to change column 'C' to be a set of three 1's in a row, non-overlapping. The desired output is:
A B C
0 10 4 NaN
1 79 3 1.0
2 6 23 1.0
3 38 6 1.0
4 4 234 NaN
5 557 47 1.0
6 12 312 1.0
7 220 2 1.0
8 46 426 NaN
9 22 42 NaN
10 45 435 NaN
11 22 23 NaN
So, rows 2, 3, and 6 change from NaN to 1.0. Row 7 already has a 1.0 and it is ignored. Rows 8 and 9 need to stay NaN because row 7 is the last entry of the previous set.
I don't know if there is a better way to build column 'C' that would do this at creation.
I have tried several versions of fillna and ffill, none of them worked for me.
It seems very convoluted but I tried to isolate the row id's for each 1.0 with this line:
print (df.loc[df['C'] == 1])
Which correctly outputs this:
A B C
1 79 3 1.0
5 557 47 1.0
7 220 2 1.0
Even though I know that information, I don't know how to proceed from there.
Thank you so much for your help in advance, David