I'm trying to change this

data.frame(id=c(1,1,1,1,1,2,2), val=c('a','a','b','a','a','a','b'))
  id val
1  1   a
2  1   a
3  1   b
4  1   a
5  1   a
6  2   a
7  2   b

into

  id val
1  1   1
2  1   1
3  1   2
4  1   3
5  1   3
6  2   1
7  2   2

For each id, the new val starts at 1 and increases by 1 each time the original val changes from one row to the next.

milan

4 Answers

One dplyr possibility could be:

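library(dplyr)

# as.character() lets rle() handle val whether it is a factor or a character column
df %>%
 group_by(id) %>%
 mutate(val = with(rle(as.character(val)), rep(seq_along(lengths), lengths)))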

     id   val
  <dbl> <int>
1     1     1
2     1     1
3     1     2
4     1     3
5     1     3
6     2     1
7     2     2

The same idea using rleid() from data.table:

# data.table::rleid() is called by namespace, so data.table need not be attached
df %>%
 group_by(id) %>%
 mutate(val = data.table::rleid(val))
tmfmnk
    For folks who don't want to load data.table alongside dplyr (e.g. because of namespace conflicts): https://stackoverflow.com/q/33507868 – Frank Mar 20 '19 at 16:35
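
For reference, a minimal sketch of a dplyr-only variant that avoids data.table altogether. The helper name run_id is hypothetical (it is not part of dplyr or data.table); it simply wraps base R's rle(), and the data frame is the one from the question.

```r
library(dplyr)

# Hypothetical helper, not a dplyr or data.table function:
# numbers consecutive runs of equal values 1, 2, 3, ...
run_id <- function(x) {
  r <- rle(as.character(x))            # as.character() also covers factor input
  rep(seq_along(r$lengths), r$lengths)
}

# Data frame from the question
df <- data.frame(id  = c(1, 1, 1, 1, 1, 2, 2),
                 val = c("a", "a", "b", "a", "a", "a", "b"))

out <- df %>%
  group_by(id) %>%
  mutate(val = run_id(val)) %>%        # the counter restarts within each id
  ungroup()
```

Here out$val should be 1 1 2 3 3 1 2. Note that rle() treats every NA as its own run, so columns with missing values may need separate handling.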

With dplyr we can group_by id and increment a counter every time val differs from the previous value.

library(dplyr)

# default = first(val) keeps the first comparison FALSE; + 1L starts the counter at 1
df %>%
  group_by(id) %>%
  mutate(val = cumsum(val != lag(val, default = first(val))) + 1L)

#     id   val
#   <dbl> <int>
#1     1     1
#2     1     1
#3     1     2
#4     1     3
#5     1     3
#6     2     1
#7     2     2
Ronak Shah

Here is a base R solution.

f <- as.integer(as.factor(df1$val))
1 + ave(f, df1$id, FUN = function(x) cumsum(c(0, diff(x) != 0)))
#[1] 1 1 2 3 3 1 2

Then assign the result to df1$val.
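
To see why this works, here is a small sketch that traces the steps for a single group. The vector x is just the integer codes for id 1 in the question's data (a, a, b, a, a), written out by hand for illustration.

```r
# Integer codes for the val column of id 1: a, a, b, a, a
x <- c(1L, 1L, 2L, 1L, 1L)

changes <- diff(x) != 0        # FALSE TRUE TRUE FALSE: TRUE where the value changes
steps   <- c(0, changes)       # prepend 0 because the first row has no predecessor
run     <- 1 + cumsum(steps)   # 1 1 2 3 3

print(run)
```

ave() applies the same computation separately to each id, which is why the counter restarts at 1 for id 2.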

Rui Barradas

data.table's rleid() is made for this:

library(data.table)
setDT(xy)
xy[, rleid(val), id]

   id V1
1:  1  1
2:  1  1
3:  1  2
4:  1  3
5:  1  3
6:  2  1
7:  2  2
s_baldur