1

Below is a snapshot of my data frame, and I have a list of ID from 1 to N, with outcomes 1 or 0. What I am trying to do is to generate the Label and csum columns basic on the occurrence of the outcome in Python, how do I achieve this and loop through N Ids?

Label: the past 3 days when outcome 1 occur

csum: Keep Counting and sum 1 until 1 occurs, then reset again

My Dataset:

ID Date Outcome
ID_1 2021-01-01 0
ID_1 2021-01-02 0
ID_1 2021-01-03 0
ID_1 2021-01-04 0
ID_1 2021-01-05 0
ID_1 2021-01-06 0
ID_1 2021-01-07 0
ID_1 2021-01-08 1
ID_1 2021-01-09 0
ID_1 2021-01-10 0
ID_1 2021-01-11 0
ID_2 2021-01-01 0
ID_2 2021-01-02 0
ID_2 2021-01-03 1
ID_2 2021-01-04 0
ID_2 2021-01-05 0
ID_2 2021-01-06 0
ID_2 2021-01-07 0
ID_2 2021-01-08 0
ID_2 2021-01-09 1
ID_2 2021-01-10 0
ID_2 2021-01-11 0

My Target Outcome:

ID Date Outcome Label Csum
ID_1 2021-01-01 0 0 1
ID_1 2021-01-02 0 0 2
ID_1 2021-01-03 0 0 3
ID_1 2021-01-04 0 0 4
ID_1 2021-01-05 0 0 5
ID_1 2021-01-06 0 lab1 6
ID_1 2021-01-07 0 lab1 7
ID_1 2021-01-08 1 lab1 8
ID_1 2021-01-09 0 0 1
ID_1 2021-01-10 0 0 2
ID_1 2021-01-11 0 0 3
ID_2 2021-01-01 0 lab1 1
ID_2 2021-01-02 0 lab1 2
ID_2 2021-01-03 1 lab1 3
ID_2 2021-01-04 0 0 1
ID_2 2021-01-05 0 0 2
ID_2 2021-01-06 0 0 3
ID_2 2021-01-07 0 lab1 4
ID_2 2021-01-08 0 lab1 5
ID_2 2021-01-09 1 lab1 6
ID_2 2021-01-10 0 0 1
ID_2 2021-01-11 0 0 2
kata1000
  • 60
  • 4

1 Answers1

1

IIUC, you can use:

# identify the rows with a 1 in the next 3 rows (per group)
m = df[::-1].groupby('ID').rolling(3, min_periods=1)['Outcome'].max().droplevel(0)

# identify the rows where 0 restarts
df['Label'] = m.mask(m.eq(1), 'lab1')
group = (df['Label'].eq(0)&df['Label'].ne(df['Label'].shift())).cumsum()

# compute cumcount
df['Csum'] = df.groupby(['ID', group]).cumcount().add(1)

output:

      ID        Date  Outcome Label  Csum
0   ID_1  2021-01-01        0   0.0     1
1   ID_1  2021-01-02        0   0.0     2
2   ID_1  2021-01-03        0   0.0     3
3   ID_1  2021-01-04        0   0.0     4
4   ID_1  2021-01-05        0   0.0     5
5   ID_1  2021-01-06        0  lab1     6
6   ID_1  2021-01-07        0  lab1     7
7   ID_1  2021-01-08        1  lab1     8
8   ID_1  2021-01-09        0   0.0     1
9   ID_1  2021-01-10        0   0.0     2
10  ID_1  2021-01-11        0   0.0     3
11  ID_2  2021-01-01        0  lab1     1
12  ID_2  2021-01-02        0  lab1     2
13  ID_2  2021-01-03        1  lab1     3
14  ID_2  2021-01-04        0   0.0     1
15  ID_2  2021-01-05        0   0.0     2
16  ID_2  2021-01-06        0   0.0     3
17  ID_2  2021-01-07        0  lab1     4
18  ID_2  2021-01-08        0  lab1     5
19  ID_2  2021-01-09        1  lab1     6
20  ID_2  2021-01-10        0   0.0     1
21  ID_2  2021-01-11        0   0.0     2
mozway
  • 194,879
  • 13
  • 39
  • 75