I have a table similar to this (without the attempt column):
user_id | status | date | attempt |
---|---|---|---|
1 | completed | 5/4/2022 | 1 |
2 | completed | 5/5/2022 | 1 |
1 | failed | 5/6/2022 | 1 |
1 | failed | 5/7/2022 | 2 |
2 | completed | 5/8/2022 | 1 |
1 | completed | 5/9/2022 | 3 |
2 | failed | 5/10/2022 | 1 |
2 | completed | 5/11/2022 | 2 |
2 | completed | 5/12/2022 | 1 |
2 | completed | 5/13/2022 | 1 |
I'm trying to figure out how to create the "attempt" column. Basically the attempt number should reset to 0 on a user's next attempt a "completed" status, specific to user_id. The example is small but the actual dataset I'm working with is large so I can't do anything super computationally intensive. I have access to the data in SQL or Python, whichever is easier
I was able to get a correct "attempt" column for a single user with the answer from this question: Count rows with positive values and reset if negative but it didn't work when the dataframe contained multiple users.
I haven't found anything that works in SQL either. This was my most recent effort: textquestion:Attempt number over transitory statuses in T-SQL