3

I would like to create a running count of the values in the "Button" column (Start, Stop) within each "ID". However, any change in the "Button" value or a change of "ID" should reset the running count. The data frame is below:

data = pd.DataFrame({
'ID': ['A','A','B','B','C','C','C','C','C','D','E','E'], 
'Button': ['Start','Stop','Start','Stop','Start','Start','Stop','Start','Stop','Start','Start','Stop']
})

I can create a running count based on the "Button" values, but can't figure out how to group that by "ID".

data['runningCount'] = data.groupby(data['Button']).cumcount()+1

I'm looking for the following result:

result = pd.DataFrame({
'ID': ['A','A','B','B','C','C','C','C','C','D','E','E'], 
'Button': ['Start','Stop','Start','Stop','Start','Start','Stop','Start','Stop','Start','Start','Stop'],
'Count': [1,1,1,1,1,2,1,1,1,1,1,1]})
cottontail
  • 10,268
  • 18
  • 50
  • 51
Dfeld
  • 187
  • 9

1 Answers1

3

You just need to create a specific key with shift

s=data.groupby('ID').Button.apply(lambda x : (x!=x.shift()).cumsum())
data.groupby([data.ID,s]).cumcount()+1
Out[189]: 
0     1
1     1
2     1
3     1
4     1
5     2
6     1
7     1
8     1
9     1
10    1
11    1
dtype: int64

More info

data.groupby('ID').Button.apply(lambda x : (x!=x.shift()).cumsum())
Out[192]: 
0     1
1     2
2     1
3     2
4     1
5     1
6     2
7     3
8     4
9     1
10    1
11    2
Name: Button, dtype: int32
BENY
  • 317,841
  • 20
  • 164
  • 234