
I once used a very handy R function that returns a vector whose value is incremented each time another variable changes. Here is my input data.frame, generated with the following code:

set.seed(0)
df <- data.frame(Day = seq(as.Date("2016-10-01"), as.Date("2016-10-10"), by = "day"), bit = sample(c(0, 1), size = 10, replace = TRUE))

Day          bit
2016-10-01   1
2016-10-02   0
2016-10-03   0
2016-10-04   1
2016-10-05   1
2016-10-06   0
2016-10-07   1
2016-10-08   1
2016-10-09   1
2016-10-10   1

I need the extra column bit.change:

Day          bit bit.change
2016-10-01   1   1
2016-10-02   0   2
2016-10-03   0   2
2016-10-04   1   3
2016-10-05   1   3
2016-10-06   0   4
2016-10-07   1   5
2016-10-08   1   5
2016-10-09   1   5
2016-10-10   1   5

I have a solution that compares bit with its lagged value, but it is not elegant. Does anybody know which function I mean? If possible, it should work with dplyr::mutate() and dplyr::group_by(). If group_by() is specified, bit.change should start again from 1 at each new group. Thanks a lot for your help!

Vincent
  • See `rleid()` in `data.table` for one such implementation of this. I'm sure this is a duplicate question but `cumsum(abs(c(1,diff(df$bit))))` will also do it in base R. – thelatemail Nov 17 '16 at 00:08
  • Thx. `rleid()` was what I was looking for. Thx also for the alternative suggestion with `cumsum()` – Vincent Nov 17 '16 at 09:37
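
The run-numbering idea from the comments can be sketched as follows. This is only an illustration: the `bit` vector is typed in by hand from the table in the question, `runs` and `run_id` are names made up for this example, and the `data.table` part only runs if that package happens to be installed.

```r
# The bit column from the question, typed in by hand
bit <- c(1, 0, 0, 1, 1, 0, 1, 1, 1, 1)

# Base R: rle() finds the runs of equal values; number the runs and
# repeat each run number as many times as the run is long
runs <- rle(bit)
run_id <- rep(seq_along(runs$lengths), runs$lengths)
print(run_id)

# data.table::rleid() computes run ids directly, if the package is available
if (requireNamespace("data.table", quietly = TRUE)) {
  print(data.table::rleid(bit))
}
```

Unlike `cumsum(abs(c(1, diff(bit))))`, which relies on consecutive values differing by exactly 1, the `rle()` version should also work when the column holds values other than 0 and 1.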

1 Answer

library(dplyr)
# diff(bit) != 0 is TRUE wherever bit changes; the leading 1 starts the counter
df %>%
  mutate(bit.change = cumsum(c(1, diff(bit) != 0)))

          Day bit bit.change
1  2016-10-01   1          1
2  2016-10-02   0          2
3  2016-10-03   0          2
4  2016-10-04   1          3
5  2016-10-05   1          3
6  2016-10-06   0          4
7  2016-10-07   1          5
8  2016-10-08   1          5
9  2016-10-09   1          5
10 2016-10-10   1          5
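
For the grouped case the question asks about, the same expression can probably be used after `group_by()`, since `mutate()` then evaluates it separately within each group. A minimal sketch, assuming a hypothetical grouping column `grp` and a made-up data frame `df2` (neither is in the question's data):

```r
library(dplyr)

# Hypothetical data: two groups of five rows, reusing the bit values
# from the question
df2 <- data.frame(
  grp = rep(c("a", "b"), each = 5),
  bit = c(1, 0, 0, 1, 1, 0, 1, 1, 1, 1)
)

# Within each group the counter starts at 1 and increases whenever
# bit differs from the previous row of the same group
res <- df2 %>%
  group_by(grp) %>%
  mutate(bit.change = cumsum(c(1, diff(bit) != 0))) %>%
  ungroup()

print(res)
```

With this data, `bit.change` should restart at 1 on the first row of group "b", even though `bit` also changes across the group boundary.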
JasonWang