How to create an ID that increases by 1 every time the previous row of another column is 1

Question

Working with Python, I need to create two new variables.

One (see JourneyID in example) that cumulatively increases by one each time the previous row of another column takes the value '1', and

One (see JourneyN in example) that cumulatively increases by one each time the previous row of another column takes the value '1', but starts over from 1 every time the Respondent ID increases by 1.

m = df['Purpose'] == 1
df.loc[m, 'JourneyID'] = m.cumsum()

Returns df[JourneyID] = [1,1,1,2,1,1,3,1,4] when it should return [1,1,2,2,3,3,3,4,4] for ID.

Any help is greatly appreciated.

This might be something you would know how to answer, @yatu . I corrected the confusing example. — nielsen, Apr 15 '20 at 12:26

sltzgs · Accepted Answer · 2020-04-15T13:03:14.083


It's not super clean, but it should get you what you need:

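# Count rows where Purpose == 1, then shift down so the ID changes on the next row.
helper = ((df['Purpose'] == 1).cumsum() + 1).shift(1)
# shift() leaves the first row as NaN; the first journey is 1.
helper.iloc[0] = 1
df['JourneyID'] = helper.astype(int)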

JourneyN I did not fully understand :)
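For JourneyN, one possible reading of the question is the same running count, restarted for each respondent. The following is only a minimal sketch under assumptions: the column name `RespondentID` and the sample values are made up for illustration (the values are chosen so that JourneyID reproduces the expected list from the question), and it assumes a journey never carries over from one respondent to the next.

```python
import pandas as pd

# Hypothetical sample data: the column name "RespondentID" and all values
# are assumptions, not taken from the original table.
df = pd.DataFrame({
    "RespondentID": [1, 1, 1, 1, 2, 2, 2, 2, 2],
    "Purpose":      [2, 1, 3, 1, 2, 3, 1, 2, 2],
})

# True where the previous row's Purpose was 1 (the first row has no previous row).
prev_is_one = df["Purpose"].eq(1).shift(fill_value=False)

# Running count over the whole frame, starting at 1.
df["JourneyID"] = prev_is_one.cumsum() + 1

# Same idea per respondent: shift inside each group so the last row of one
# respondent does not affect the first row of the next, then count per group.
prev_in_group = df.groupby("RespondentID")["Purpose"].shift().eq(1).astype(int)
df["JourneyN"] = prev_in_group.groupby(df["RespondentID"]).cumsum() + 1

print(df)
```

If a journey should instead be allowed to continue across respondents, the per-group shift would need to be replaced by the plain `shift` used for JourneyID.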

edited Apr 15 '20 at 13:03

answered Apr 15 '20 at 12:52


Unfortunately, this solution has the same problem that I got. I hoped that taking a look at the table I made could give an idea of what it should look like. – nielsen Apr 15 '20 at 12:56
