Below is a snapshot of my data frame, and I have a list of ID from 1 to N, with outcomes 1 or 0. What I am trying to do is to generate the Label and csum columns basic on the occurrence of the outcome in Python, how do I achieve this and loop through N Ids?
Label: the past 3 days when outcome 1 occur
csum: Keep Counting and sum 1 until 1 occurs, then reset again
My Dataset:
ID | Date | Outcome |
---|---|---|
ID_1 | 2021-01-01 | 0 |
ID_1 | 2021-01-02 | 0 |
ID_1 | 2021-01-03 | 0 |
ID_1 | 2021-01-04 | 0 |
ID_1 | 2021-01-05 | 0 |
ID_1 | 2021-01-06 | 0 |
ID_1 | 2021-01-07 | 0 |
ID_1 | 2021-01-08 | 1 |
ID_1 | 2021-01-09 | 0 |
ID_1 | 2021-01-10 | 0 |
ID_1 | 2021-01-11 | 0 |
ID_2 | 2021-01-01 | 0 |
ID_2 | 2021-01-02 | 0 |
ID_2 | 2021-01-03 | 1 |
ID_2 | 2021-01-04 | 0 |
ID_2 | 2021-01-05 | 0 |
ID_2 | 2021-01-06 | 0 |
ID_2 | 2021-01-07 | 0 |
ID_2 | 2021-01-08 | 0 |
ID_2 | 2021-01-09 | 1 |
ID_2 | 2021-01-10 | 0 |
ID_2 | 2021-01-11 | 0 |
My Target Outcome:
ID | Date | Outcome | Label | Csum |
---|---|---|---|---|
ID_1 | 2021-01-01 | 0 | 0 | 1 |
ID_1 | 2021-01-02 | 0 | 0 | 2 |
ID_1 | 2021-01-03 | 0 | 0 | 3 |
ID_1 | 2021-01-04 | 0 | 0 | 4 |
ID_1 | 2021-01-05 | 0 | 0 | 5 |
ID_1 | 2021-01-06 | 0 | lab1 | 6 |
ID_1 | 2021-01-07 | 0 | lab1 | 7 |
ID_1 | 2021-01-08 | 1 | lab1 | 8 |
ID_1 | 2021-01-09 | 0 | 0 | 1 |
ID_1 | 2021-01-10 | 0 | 0 | 2 |
ID_1 | 2021-01-11 | 0 | 0 | 3 |
ID_2 | 2021-01-01 | 0 | lab1 | 1 |
ID_2 | 2021-01-02 | 0 | lab1 | 2 |
ID_2 | 2021-01-03 | 1 | lab1 | 3 |
ID_2 | 2021-01-04 | 0 | 0 | 1 |
ID_2 | 2021-01-05 | 0 | 0 | 2 |
ID_2 | 2021-01-06 | 0 | 0 | 3 |
ID_2 | 2021-01-07 | 0 | lab1 | 4 |
ID_2 | 2021-01-08 | 0 | lab1 | 5 |
ID_2 | 2021-01-09 | 1 | lab1 | 6 |
ID_2 | 2021-01-10 | 0 | 0 | 1 |
ID_2 | 2021-01-11 | 0 | 0 | 2 |