Efficient reduction of an R vector to a summary vector

Question

I'm trying to simulate a sequence of length N (between 10k and 3M), represented as a vector containing n 1's and s 0's, where N = n + s.

I'd like to reduce this to a vector of the form c(137, 278, 21271, 124162, ...), where each number is the length of a run of consecutive 1's in the original vector. Since I need to do this ~100,000 times for my simulation, I'm looking for as efficient a method as possible!

Thanks!

Martin

What are you trying to accomplish by putting the vector in that form? — Joshua Ulrich, Mar 21 '13 at 17:50
@JoshuaUlrich - I'm trying to estimate the probabilities of certain lengths of 1's and 0's given different values, and I'm too bad at probability theory to calculate the exact answer. — Norling, Mar 21 '13 at 18:07

score 3 · Accepted Answer · answered Mar 21 '13 at 17:53

You can use rle to get that:

x <- sample(c(1, 0), size = 3e+06, replace = TRUE)
x.rle <- rle(x)
x.rle
## Run Length Encoding
##   lengths: int [1:1499270] 4 1 2 3 4 1 1 3 1 4 ...
##   values : num [1:1499270] 0 1 0 1 0 1 0 1 0 1 ...

# Lengths of the runs of 1s and of the runs of 0s
vectorOf1 <- x.rle$lengths[x.rle$values == 1]
vectorOf0 <- x.rle$lengths[x.rle$values == 0]

head(vectorOf1, 20)
##  [1] 1 3 1 3 4 3 1 1 1 4 4 2 3 1 1 4 1 1 1 1

head(vectorOf0, 20)
##  [1] 4 2 4 1 1 1 1 5 2 2 2 1 3 3 7 2 1 1 1 2
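Since the question needs this step repeated many times, it may help to wrap the rle call in a small function. The sketch below is only an illustration: the name run_lengths and its value argument are hypothetical and do not appear in the answer above. It also shows how the result could be tabulated into counts per run length.

```r
# Hypothetical helper (not part of the original answer):
# returns the lengths of all runs of `value` in x, using base R's rle().
run_lengths <- function(x, value = 1) {
  r <- rle(x)
  r$lengths[r$values == value]
}

# Small hand-checkable input
x <- c(1, 1, 0, 1, 0, 0, 1, 1, 1)

run_lengths(x, 1)            # 2 1 3
run_lengths(x, 0)            # 1 2

# Count how many runs of 1s have length 1, 2, 3, ...
tabulate(run_lengths(x, 1))  # 1 1 1
```

Summing such tabulated counts over many simulated vectors would give empirical frequencies for each run length, which seems to be what the comments under the question are after.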

score 0 · Answer 2 · answered Mar 21 '13 at 17:53

The rle function is the usual way to do this.

IRTFM
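For a strictly 0/1 vector, the same run lengths can also be obtained without rle by differencing a zero-padded copy of the vector. This is an alternative sketch, not something proposed in either answer; the function name ones_runs is hypothetical, and whether it is faster than rle on the vectors in question is an assumption that would need benchmarking.

```r
# Hypothetical alternative (not from the answers): run lengths of 1s
# in a vector that contains only 0s and 1s, using diff() and which().
ones_runs <- function(x) {
  # Pad with a 0 on both sides so runs touching either end are delimited.
  d <- diff(c(0L, as.integer(x), 0L))
  starts <- which(d == 1L)    # index where each run of 1s begins
  ends   <- which(d == -1L)   # index just past the end of each run
  ends - starts
}

x <- c(1, 1, 0, 1, 0, 0, 1, 1, 1)
ones_runs(x)   # 2 1 3

# Agrees with the rle-based result on the same input
r <- rle(x)
identical(ones_runs(x), r$lengths[r$values == 1])
```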