
I would like to create a 'segment' ID so that:

  1. If the value (in one column) is the same as in the previous row, the segment ID stays the same
  2. If the value (in one column) differs from the previous row, the segment ID increments by one

I am currently trying to achieve this via:

library(dplyr)
person <- c("Mark", "Mark", "Mark", "Mark", "Mark", "Steve", "Steve", "Tim", "Tim", "Tim", "Mark")
df <- data.frame(person, stringsAsFactors = FALSE)
df$segment <- 1

df$segment <- ifelse(df$person == dplyr::lag(df$person),
                     dplyr::lag(df$segment),
                     dplyr::lag(df$segment) + 1)

But this does not give the desired result.

Any help would be appreciated.

DenJJ

2 Answers


If you want the ID to increment on each change, try this:

df %>% mutate(segment = cumsum(person != lag(person, default="")))
#    person segment
# 1    Mark       1
# 2    Mark       1
# 3    Mark       1
# 4    Mark       1
# 5    Mark       1
# 6   Steve       2
# 7   Steve       2
# 8     Tim       3
# 9     Tim       3
# 10    Tim       3
# 11   Mark       4
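
To see why this works where the ifelse() attempt does not: ifelse() is evaluated once over whole vectors, so lag(df$segment) still holds the initial 1s (with NA in the first position) rather than the segment already computed for the previous row. cumsum() avoids that by keeping a running total of the change flags. Below is a minimal base R sketch of the same idea, broken into steps; the names prev and changed are only illustrative, and it assumes person contains no NA values and does not start with an empty string.

```r
person <- c("Mark", "Mark", "Mark", "Mark", "Mark",
            "Steve", "Steve", "Tim", "Tim", "Tim", "Mark")

# Previous row's value; the first row is compared against "" (an assumed
# placeholder that never matches a real name), so it counts as a change.
prev <- c("", person[-length(person)])

# TRUE wherever the value differs from the row before.
changed <- person != prev

# cumsum() treats TRUE as 1 and FALSE as 0, so the running total
# goes up by one at every change.
segment <- cumsum(changed)

print(data.frame(person, changed, segment))
```

Because the first row is always flagged as a change, the IDs start at 1.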
MrFlick

A base R solution might look like this:

c(1, cumsum(person[-1] != person[-length(person)]) +1)
[1] 1 1 1 1 1 2 2 3 3 3 4
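
Another base R option, not part of the answer above, is run-length encoding with rle(). It is offered only as a sketch of an alternative: rle() finds each run of consecutive equal values, and every run is then numbered in order.

```r
person <- c("Mark", "Mark", "Mark", "Mark", "Mark",
            "Steve", "Steve", "Tim", "Tim", "Tim", "Mark")

# rle() returns the length and value of each run of consecutive equal values.
runs <- rle(person)

# Number the runs 1, 2, 3, ... and repeat each number for the length of its run.
segment <- rep(seq_along(runs$lengths), times = runs$lengths)

print(segment)
```

This version needs no placeholder for the first row, since the comparison with the previous value happens inside rle().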
G5W