I am trying to find 'x' or more consecutive missing dates for each group in R. My current approach involves:

  • Looping over each group with a for loop
  • Finding the missing dates
  • Finding how many of these missing dates are consecutive (here I get a logical vector indicating whether each missing date is consecutive with the previous one).
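
The steps above can be sketched for a single group as follows. This is only a minimal illustration: the `dates` vector is made-up example data, and the names `all_days`, `missing_days` and `is_consecutive` are hypothetical, not taken from the real code.

```r
# Hypothetical observed dates for one group (made-up data for illustration)
dates <- as.Date(c("2021-01-01", "2021-01-02", "2021-01-06", "2021-01-08"))

# Every calendar day between the first and the last observation
all_days <- seq(min(dates), max(dates), by = "day")

# Days in that range that never appear in the data
missing_days <- all_days[!all_days %in% dates]

# TRUE where a missing day comes exactly one day after the previous missing day
is_consecutive <- as.numeric(diff(missing_days)) == 1
is_consecutive
```

Note that with this construction a run of k TRUE values corresponds to k + 1 consecutive missing dates, because `diff()` returns one element fewer than its input.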

This is where I am stuck: how do I check, from the logical vector, whether "TRUE" occurs 'x' or more times consecutively?

logical_vector <- c("TRUE", "TRUE", "TRUE", "FALSE", "TRUE", "FALSE", "TRUE", "TRUE", "TRUE", "TRUE")

For example, in the above vector, how do you check if the value "TRUE" occurred 4 times or higher consecutively?

I think it is something very easy, but I can't figure it out and have been stuck for a while, especially since the "'x' or more times" condition needs to be satisfied.

If it does occur 4 times or higher, should we store that as a logical vector as well?

Any help is appreciated.

GKi

2 Answers

Updated

You can also use the following code for your purpose. I know a very good solution has already been presented, but I did not want to leave my solution unfinished:

library(dplyr)
library(purrr)

# First I created a data frame of logical values

logical_vector <- c("TRUE", "TRUE", "TRUE", "FALSE", "TRUE", "FALSE", "TRUE", "TRUE", "TRUE", "TRUE")
logical_vector2 <- c("TRUE", "TRUE", "TRUE", "TRUE", "TRUE", "FALSE", "TRUE", "TRUE", "TRUE", "TRUE")
logical_vector3 <- c("TRUE", "TRUE", "FALSE", "FALSE", "TRUE", "FALSE", "TRUE", "FALSE", "TRUE", "TRUE")
logical_vector4 <- c("FALSE", "FALSE", "TRUE", "FALSE", "TRUE", "FALSE", "TRUE", "TRUE", "TRUE", "TRUE")

df <- data.frame(logical_vector, 
                 logical_vector2,
                 logical_vector3,
                 logical_vector4)

df %>%
  mutate(across(everything(), as.logical)) -> df


# Then I apply `rle` to every column, extract the lengths of the runs of TRUEs,
# and keep the columns that contain a run of 4 or more TRUEs


map(df, rle) %>%
  map(~ .x$lengths[.x$values]) %>%
  keep(~ any(.x >= 4)) -> df1

names(df1)
# [1] "logical_vector"  "logical_vector2" "logical_vector4"
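
If there are many columns and the threshold 'x' should be adjustable, the same idea can be wrapped in a small function. The following is only a sketch in base R: `has_true_run`, `df_example` and its column names are hypothetical names introduced for illustration.

```r
# Hypothetical helper: TRUE if `x` contains a run of at least `n` consecutive TRUEs
has_true_run <- function(x, n) {
  runs <- rle(x)
  # Lengths of the TRUE runs only; any() returns FALSE if there are none
  any(runs$lengths[runs$values] >= n)
}

# Small stand-in for a data frame with many logical columns
df_example <- data.frame(
  col_a = c(TRUE, TRUE, TRUE, TRUE, FALSE),
  col_b = c(TRUE, FALSE, TRUE, TRUE, FALSE),
  col_c = c(FALSE, TRUE, TRUE, TRUE, TRUE)
)

# A data frame is a list of columns, so vapply() visits each column once
hits <- vapply(df_example, has_true_run, logical(1), n = 4)
names(hits)[hits]
# [1] "col_a" "col_c"
```

The number of columns does not matter here, since only the names of the matching columns are returned.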

Anoushiravan R
  • This works, thank you. However, if you had more than 100 logical vectors, how would you modify the code to output which vectors have 4 or more consecutive TRUE values? – learning_to_code Apr 27 '21 at 07:55

Keep logical values as logical, not as strings, and keep all your vectors in a list. Then we can loop through them and get a logical index of the vectors that meet the criterion. See the example:

# example list of logical vectors 
l <- list(
  v1 = c(TRUE, TRUE, TRUE, FALSE, TRUE,  FALSE, TRUE,  TRUE, TRUE, TRUE),
  v2 = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE),
  v3 = c(TRUE, TRUE, TRUE, TRUE,  TRUE,  FALSE, FALSE, TRUE, TRUE, TRUE))

# get a logical index of the vectors with 4 or more consecutive TRUEs
ix <- sapply(l, function(i){
  r <- rle(i) 
  any(r$lengths[ r$values ] >= 4)
  })

#get the names of vectors
names(ix)[ ix ]
#[1] "v1" "v3"

# subset if needed
l[ ix ]
# $v1
# [1]  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE
# 
# $v3
# [1]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE
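
If the positions of the long runs are needed as well, as a logical vector of the same length as the input, the `rle` result can be expanded back with `rep`. This is a minimal sketch, assuming the threshold is stored in a hypothetical variable `n`; `x`, `runs` and `in_long_run` are also illustrative names.

```r
# Example vector from the question, stored as logical values
x <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE)
n <- 4  # minimum run length (hypothetical variable name)

runs <- rle(x)

# Flag the runs that are TRUE and long enough, then repeat each flag
# once per element of its run to get back to the original length
in_long_run <- rep(runs$values & runs$lengths >= n, runs$lengths)

# Positions that belong to a run of at least n TRUEs
which(in_long_run)
# [1]  7  8  9 10
```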
zx8754
  • Thanks a lot, this was easier. But again, my data is pretty huge. How do I extract just the names of the vectors that have the pattern? In your example, the answer would be v1 and v3, without printing the whole list of logical vectors. – learning_to_code Apr 27 '21 at 08:57
  • @learning_to_code Then you only need `names(ix)[ ix ]`. See edit. – zx8754 Apr 27 '21 at 08:58
  • @learning_to_code Based on your example data, and my example data, this works fine. Please update your post with better example data that matches your real data. – zx8754 Apr 27 '21 at 09:19
  • Sorry, since I was using it inside a function, I just needed to print ix, without the names, and that worked. Thanks a ton for your help! – learning_to_code Apr 27 '21 at 09:29