Get length of runs of missing values in vector

Question

What's a clever (i.e., not a loop) way to get the length of each spell of missing values in a vector? My ideal output is a vector that is the same length, in which each missing value is replaced by the length of the spell of missing values of which it was a part, and all other values are 0's.

So, for input like:

x <- c(2,6,1,2,NA,NA,NA,3,4,NA,NA)

I'd like output like:

y <- c(0,0,0,0,3,3,3,0,0,2,2)

score 10 · Accepted Answer · answered Mar 21 '17 at 20:00

One simple option using rle():

> m <- rle(is.na(x))
> rep(ifelse(m$values, m$lengths, 0), times = m$lengths)
[1] 0 0 0 0 3 3 3 0 0 2 2
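
For reuse, the same idea can be wrapped in a small function. This is only a sketch: the name na_run_lengths is hypothetical (it is not from the answer), and it uses inverse.rle() in place of the explicit rep() call, which should give the same expansion.

```r
# Hypothetical helper; wraps the rle() approach in a function.
na_run_lengths <- function(x) {
  runs <- rle(is.na(x))  # runs of TRUE (NA) and FALSE (non-NA)
  # Keep the run length for NA runs, use 0 for everything else
  runs$values <- ifelse(runs$values, runs$lengths, 0L)
  inverse.rle(runs)      # expand the runs back to the original length
}

x <- c(2, 6, 1, 2, NA, NA, NA, 3, 4, NA, NA)
na_run_lengths(x)
# [1] 0 0 0 0 3 3 3 0 0 2 2

na_run_lengths(c(NA, 1, NA, NA))  # a run at the very start is handled too
# [1] 1 0 2 2
```

Calling rle() once and modifying its values in place avoids recomputing the encoding.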

joran

2

`rep(rle(is.na(x))$value * rle(is.na(x))$length, rle(is.na(x))$length)`. This also works. – JasonWang Mar 21 '17 at 20:03

smci · Answer 2 · 2017-03-21T20:23:17.570

I was independently working on something using rle() and either cumsum() or dplyr group_by() and n() to get group-lengths of NAs:

> x2 <- as.numeric(is.na(x))
> x2
[1] 0 0 0 0 1 1 1 0 0 1 1

> rle(x2)
Run Length Encoding
  lengths: int [1:4] 4 3 2 2
  values : num [1:4] 0 1 0 1

# Now we can assign group-numbers...
> cumsum(c(diff(x2)==+1,0)) * x2
[1] 0 0 0 0 1 1 1 0 0 2 2
# ...then get group-lengths from counting those...
> rle(cumsum(c(diff(x2)==+1,0)) * x2)
Run Length Encoding
  lengths: int [1:4] 4 3 2 2
  values : num [1:4] 0 1 0 2

We could kludge something, but it won't be as compact and elegant as @joran's solution.
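
For completeness, one way the group-number route could be finished is sketched below. It is an assumption about how to close the gap, not part of the original attempt: it counts group sizes with tabulate() and indexes them back by group id. It also prepends 0 before diff(), because the expression above gives group 0 to a run of NAs that starts at the first element.

```r
x <- c(2, 6, 1, 2, NA, NA, NA, 3, 4, NA, NA)
x2 <- as.numeric(is.na(x))

# A run starts wherever x2 rises from 0 to 1; the prepended 0 lets a
# run that begins at the first element count as a start as well.
starts <- diff(c(0, x2)) == 1
grp <- cumsum(starts) * x2   # group id for NAs, 0 for non-NAs

# tabulate() counts each positive group id and ignores the zeros
sizes <- tabulate(grp)

y <- numeric(length(x))
y[grp > 0] <- sizes[grp[grp > 0]]
y
# [1] 0 0 0 0 3 3 3 0 0 2 2
```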

score 1 · Answer 3 · answered Mar 22 '17 at 03:04

Here is another option with rleid (from data.table) and ave:

library(data.table)
ave(x, rleid(is.na(x)), FUN = length)*is.na(x)
#[1] 0 0 0 0 3 3 3 0 0 2 2
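
If loading data.table is not desirable, the run ids can be approximated in base R. The sketch below is a stand-in for rleid() rather than its actual implementation, the variable names are illustrative, and it assumes x has at least one element.

```r
x <- c(2, 6, 1, 2, NA, NA, NA, 3, 4, NA, NA)
na <- is.na(x)

# Base-R stand-in for rleid(): bump the id wherever is.na(x) changes
run_id <- cumsum(c(TRUE, diff(na) != 0))

# ave() replaces each element with its run's length; multiplying by
# the NA mask zeroes out the non-NA runs
y <- ave(x, run_id, FUN = length) * na
y
# [1] 0 0 0 0 3 3 3 0 0 2 2
```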

akrun
