I have the following dataframe and I am trying to label an entire block with a number which is based on how many similar blocks has been seen upto now based on class column. Consecutive class value is given the same number. If the same class block comes later, the number will be incremented. If some new class block comes, then it is initialized to 1
.
df = DataFrame(zip(range(10,30), range(20)), columns = ['a','b'])
df['Class'] = [np.nan, np.nan, np.nan, np.nan, 'a', 'a', 'a', 'a', np.nan, np.nan,'a', 'a', 'a', 'a', 'a', np.nan, np.nan, 'b', 'b','b']
a b Class
0 10 0 NaN
1 11 1 NaN
2 12 2 NaN
3 13 3 NaN
4 14 4 a
5 15 5 a
6 16 6 a
7 17 7 a
8 18 8 NaN
9 19 9 NaN
10 20 10 a
11 21 11 a
12 22 12 a
13 23 13 a
14 24 14 a
15 25 15 NaN
16 26 16 NaN
17 27 17 b
18 28 18 b
19 29 19 b
Sample output looks like this:
a b Class block_encounter_no
0 10 0 NaN NaN
1 11 1 NaN NaN
2 12 2 NaN NaN
3 13 3 NaN NaN
4 14 4 a 1
5 15 5 a 1
6 16 6 a 1
7 17 7 a 1
8 18 8 NaN NaN
9 19 9 NaN NaN
10 20 10 a 2
11 21 11 a 2
12 22 12 a 2
13 23 13 a 2
14 24 14 a 2
15 25 15 NaN NaN
16 26 16 NaN NaN
17 27 17 b 1
18 28 18 b 1
19 29 19 b 1