I am looking to add a new column to my current data frame which adds a new sequencing number based on a string of events in a football match.
This my current data frame
head(test_P)
index team.name possession_team.name minute second period possession type.name
1 5 Cardiff City Cardiff City 0 0 1 2 Pass
2 6 Cardiff City Cardiff City 0 2 1 2 Ball Receipt*
3 7 Cardiff City Cardiff City 0 2 1 2 Carry
4 8 Cardiff City Cardiff City 0 3 1 2 Pass
5 9 Cardiff City Cardiff City 0 6 1 2 Ball Receipt*
6 10 Preston North End Cardiff City 0 6 1 2 Duel
7 11 Preston North End Cardiff City 0 6 1 2 Pass
8 12 Preston North End Cardiff City 0 8 1 2 Miscontrol
9 13 Cardiff City Cardiff City 0 8 1 2 Pass
10 14 Cardiff City Cardiff City 0 9 1 2 Ball Receipt*
11 15 Cardiff City Cardiff City 0 9 1 2 Cross
12 16 Preston North End Cardiff City 0 10 1 2 Clearance
13 17 Cardiff City Cardiff City 0 11 1 2 Pass
14 18 Cardiff City Cardiff City 0 13 1 2 Ball Receipt*
15 19 Preston North End Preston North End 0 13 1 3 Ball Recovery
16 20 Preston North End Preston North End 0 13 1 3 Carry
17 21 Preston North End Preston North End 0 21 1 3 Pass
18 22 Preston North End Preston North End 0 22 1 3 Ball Receipt*.
However, I want to add an additional column name after possession called sequence which labels the sequence number of a possession.
Every new possession should start with sequence with value of 1
But if the opposition breaks that sequence with an event/events and the possession value is still the same, the next time the possession team touches the ball it should be a new sequence number e.g 2 or if multiple breaks 3,4 etc
The opposition events should be coded with the same sequence number as the one to which they have broke
E.g Data Below
index team.name possession_team.name minute second period possession type.name sequence
1 5 Cardiff City Cardiff City 0 0 1 2 Pass 1
2 6 Cardiff City Cardiff City 0 2 1 2 Ball Receipt 1
3 7 Cardiff City Cardiff City 0 2 1 2 Carry 1
4 8 Cardiff City Cardiff City 0 3 1 2 Pass 1
5 9 Cardiff City Cardiff City 0 6 1 2 Ball Receipt* 1
6 10 Preston North End Cardiff City 0 6 1 2 Duel 1
7 11 Preston North End Cardiff City 0 6 1 2 Pass 1
8 12 Preston North End Cardiff City 0 8 1 2 Miscontrol 1
9 13 Cardiff City Cardiff City 0 8 1 2 Pass 2
10 14 Cardiff City Cardiff City 0 9 1 2 Ball Receipt 2
11 15 Cardiff City Cardiff City 0 9 1 2 Cross 2
12 16 Preston North End Cardiff City 0 10 1 2 Clearance 2
13 17 Cardiff City Cardiff City 0 11 1 2 Pass 3
14 18 Cardiff City Cardiff City 0 13 1 2 Ball Receipt 3
15 19 Preston North End Preston North End 0 13 1 3 Ball Recovery 1
16 20 Preston North End Preston North End 0 13 1 3 Carry 1
17 21 Preston North End Preston North End 0 21 1 3 Pass 1
18 22 Preston North End Preston North End 0 22 1 3 Ball Receipt 1
I have tried lead and lag functions combined with ifelse statements but can't seem to get the data to work
test <- test %>% mutate(P = ifelse(dplyr::lag(team.id)!=team.id & dplyr::lag(possession) == possession, dplyr::lag(seq_id) + 1,
ifelse(dplyr::lead(team.id)!=team.id & dplyr::lead(possession)!=possession , seq_id, 1)))
Any help would be greatly appreciated and apologies for the untidiness of this question