3

I have a data frame with a column of observations, each either 1 or 0. I want to count consecutive occurrences of 1, resetting the count at each 0. The run-length encoding function (rle) seems like it should do the job, but I can't work out how to get the data into the desired format. I'd like to do this without writing a custom function. In the data below, "observation" is the existing column in the data frame; I want to derive the "continual" column and write it back to the data frame. This link was a good start.

observation continual 
          0         0
          0         0
          0         0
          1         1
          1         2
          1         3
          1         4
          1         5
          1         6
          1         7
          1         8
          1         9
          1        10
          1        11
          1        12
          0         0
          0         0

5 Answers

10

You can do this pretty easily in a couple of steps:

x <- rle(mydf$observation)       ## run rle on the relevant column
new <- sequence(x$lengths)       ## create a sequence of the lengths values
new[mydf$observation == 0] <- 0  ## replace relevant values with zero
new
#  [1]  0  0  0  1  2  3  4  5  6  7  8  9 10 11 12  0  0
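
To see why these steps work, it can help to look at the intermediate values on a smaller vector with more than one run of ones. This is only a minimal sketch: the vector `obs` below is made up for illustration and is not the data from the question.

```r
obs <- c(0, 1, 1, 0, 0, 1, 1, 1)  # hypothetical data with two runs of ones

runs <- rle(obs)
runs$lengths   # 1 2 2 3  (length of each run)
runs$values    # 0 1 0 1  (value repeated in each run)

# sequence() counts 1..n within every run, for zeros and ones alike
counts <- sequence(runs$lengths)
counts         # 1 1 2 1 2 1 2 3

# zero out the positions that belong to runs of zeros
counts[obs == 0] <- 0
counts         # 0 1 2 0 0 1 2 3
```

The count restarts at 1 for the second run of ones because `sequence()` works run by run rather than accumulating over the whole vector.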
A5C1D2H2I1M1N2O1R2T1
7

Using the devel version of data.table, you could try

library(data.table) ## v >= 1.9.5
setDT(df)[, continual := seq_len(.N) * observation, by = rleid(observation)]
David Arenburg
ExperimenteR
5

There is probably a better way, but:

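g <- c(0, cumsum(abs(diff(df$observation))))  ## run id: increases at every 0/1 switch
df$continual <- ave(g, g, FUN = seq_along)    ## count 1..n within each run
df$continual[df$observation == 0] <- 0        ## runs of zeros count as 0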
Frank
3

Simply adapting the accepted answer from the question you linked:

unlist(mapply(function(x, y) seq(x)*y, rle(df$observation)$lengths, rle(df$observation)$values))
# [1]  0  0  0  1  2  3  4  5  6  7  8  9 10 11 12  0  0
Jota
2

You can use a simple base R one-liner, using the fact that observation contains only 0 and 1, coupled with a vectorized operation:

transform(df, continual=ifelse(observation, cumsum(observation), observation))

#   observation continual
#1            0         0
#2            0         0
#3            0         0
#4            1         1
#5            1         2
#6            1         3
#7            1         4
#8            1         5
#9            1         6
#10           1         7
#11           1         8
#12           1         9
#13           1        10
#14           1        11
#15           1        12
#16           0         0
#17           0         0
Colonel Beauvel
  • 1
    Does this work if there are additional runs of ones? It's sort of vague, but I'm thinking of " I want to count the continual occurrences of 1, resetting at 0" – Frank May 15 '15 at 04:11
  • 2
    True! I would do `sequence(rle(df1$observation)$lengths)` but that's exactly similar to @Aranda except I put the code in a more compact way. – Colonel Beauvel May 15 '15 at 04:26
  • You can do `sequence(rle(df1$observation)$lengths) * df1$observation` to keep zeros at zero. – JohannesNE Feb 03 '22 at 12:34
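
The point raised in these comments can be checked on a vector with two runs of ones. This is a rough sketch only: the data frame `df1` below is made up for illustration and is not the data from the question.

```r
df1 <- data.frame(observation = c(1, 1, 0, 1, 1, 1))  # hypothetical, two runs of ones

# cumsum() keeps counting across the zero, so the second run does not restart
by_cumsum <- with(df1, ifelse(observation, cumsum(observation), observation))
by_cumsum   # 1 2 0 3 4 5

# sequence(rle(...)) restarts in every run; multiplying by observation zeroes the gaps
by_rle <- with(df1, sequence(rle(observation)$lengths) * observation)
by_rle      # 1 2 0 1 2 3
```

So the `cumsum` one-liner matches the desired output only when the data contain a single run of ones, as in the question's sample, while the `rle`-based version resets at each 0.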