Create an ID value based on an incremental value when a value in a column changes in R

Question

I would like to create a 'segment' ID so that:

If the value (in one column) is the same as in the previous row, the segment ID stays the same
However, if the value (in one column) is different from the previous row, the segment ID increments by one

I am currently trying to achieve this via:

require(dplyr)
person <- c("Mark","Mark","Mark","Mark","Mark","Steve","Steve","Tim", "Tim", "Tim","Mark")
df <- data.frame(person,stringsAsFactors = FALSE)
df$segment = 1

df$segment <- ifelse(df$person == dplyr::lag(df$person),dplyr::lag(df$segment),dplyr::lag(df$segment)+1)

But I am not getting the desired result through this method.

Any help would be appreciated.

So what's the desired output? Does the last Mark get the same value as the first Marks? — MrFlick, Mar 09 '17 at 21:15

score 4 · Answer 1 · answered Mar 09 '17 at 21:18

If you want to increment on change, try this:

df %>% mutate(segment = cumsum(person != lag(person, default="")))
#    person segment
# 1    Mark       1
# 2    Mark       1
# 3    Mark       1
# 4    Mark       1
# 5    Mark       1
# 6   Steve       2
# 7   Steve       2
# 8     Tim       3
# 9     Tim       3
# 10    Tim       3
# 11   Mark       4
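
To see why this works, the one-liner can be split into its intermediate steps. The sketch below reuses the question's data; the column names `previous` and `changed` and the object `steps` are illustrative only and not part of the original answer. It also suggests why the `ifelse()` attempt in the question does not accumulate: `lag(df$segment)` is evaluated once on a column that is all 1s, whereas `cumsum()` keeps a running total.

```r
library(dplyr)

person <- c("Mark", "Mark", "Mark", "Mark", "Mark",
            "Steve", "Steve", "Tim", "Tim", "Tim", "Mark")
df <- data.frame(person, stringsAsFactors = FALSE)

# Intermediate columns (names are illustrative only)
steps <- df %>%
  mutate(
    previous = lag(person, default = ""),  # value from the row above; "" for row 1
    changed  = person != previous,         # TRUE wherever a new run starts
    segment  = cumsum(changed)             # running count of run starts
  )

steps$segment
# [1] 1 1 1 1 1 2 2 3 3 3 4
```

Because the default `""` differs from the first name, row 1 counts as a change and numbering starts at 1. This assumes the column has no `NA` values and that the first value is not itself an empty string; an `NA` would make the comparison `NA`, and `cumsum()` would then return `NA` from that row onward.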

score 1 · Answer 2 · answered Mar 09 '17 at 21:17

A base R solution might look like this:

c(1, cumsum(person[-1] != person[-length(person)]) + 1)
# [1] 1 1 1 1 1 2 2 3 3 3 4
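
The same run numbering can also be sketched with base R's `rle()`, which reports the length of each run of consecutive equal values. This is an alternative to the expression above rather than the answerer's method, and the helper name `segment_id` is hypothetical.

```r
# Hypothetical helper: number consecutive runs of equal values, starting at 1.
segment_id <- function(x) {
  runs <- rle(x)  # run values and run lengths
  # Repeat each run's index as many times as the run is long
  rep(seq_along(runs$lengths), times = runs$lengths)
}

person <- c("Mark", "Mark", "Mark", "Mark", "Mark",
            "Steve", "Steve", "Tim", "Tim", "Tim", "Mark")

segment_id(person)
# [1] 1 1 1 1 1 2 2 3 3 3 4
```

With the question's data frame, this could be assigned as `df$segment <- segment_id(df$person)`. Unlike the `c(1, ...)` expression, it returns an empty vector for empty input instead of a lone 1.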

G5W
