Let me share an example of what I'm trying to do, since the title may not be as clear as I'd like it to be.
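library(dplyr)   # mutate(), case_when(), lag(), %>%
library(tibble)  # tibble()

data <- tibble(
  week   = 1:10,
  name   = rep("Joe", 10),
  value  = c(.9, .89, .99, .98, .87, .89, .93, .92, .98, .9),
  wanted = c("Yes", "Skip", "No", "No", "Yes", "Skip", "Yes", "Skip", "No", "Yes")
)

data <- data %>%
  mutate(my_attempt = case_when(
    # second low week in a row, preceded by a high week
    week - lag(week) == 1 & value < .95 & lag(value) < .95 &
      lag(value, 2) >= .95 & !is.na(lag(value, 2)) ~ "Skip",
    # second low week in a row at the very start of the data
    week - lag(week) == 1 & value < .95 & lag(value) < .95 &
      is.na(lag(value, 2)) ~ "Skip",
    value < .95 ~ "Yes",
    TRUE ~ "No"
  ))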
#    week name  value wanted my_attempt
#   <int> <chr> <dbl> <chr>  <chr>
#       1 Joe    0.9  Yes    Yes
#       2 Joe    0.89 Skip   Skip
#       3 Joe    0.99 No     No
#       4 Joe    0.98 No     No
#       5 Joe    0.87 Yes    Yes
#       6 Joe    0.89 Skip   Skip
#       7 Joe    0.93 Yes    Yes
#       8 Joe    0.92 Skip   Yes
#       9 Joe    0.98 No     No
#      10 Joe    0.9  Yes    Yes
I am trying to get the my_attempt column to match the wanted column. I want to flag rows where value is below a threshold (0.95 here), but there can't be two consecutive "Yes" values: within a run of low weeks, the labels should alternate "Yes", "Skip", "Yes", "Skip", and so on. My attempt works until it hits 4 or more low values in a row (see week 8 above, which comes out as "Yes" instead of "Skip").

In my real data some weeks may be missing, and a missing week can be treated as a "No". For example, if week 6 were missing it would still be okay for week 7 to be "Yes" (I think the week - lag(week) == 1 check in my case_when takes care of this).

Is there a way to do this in R? It doesn't have to use dplyr, but it would be nice if it's possible within the tidyverse.
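
To make the rule concrete, here is a rough sketch of one way it could be expressed: treat each unbroken stretch of low weeks as a run, then alternate "Yes" and "Skip" by position within the run. This is only an illustration under my own assumptions, not a confirmed solution: the names threshold, low, new_run, run_id and flag are made up for the example, the 0.95 cutoff comes from my attempt above, and I am assuming value has no NAs.

```r
library(dplyr)
library(tibble)

threshold <- 0.95  # assumed cutoff, same as in the attempt above

data <- tibble(
  week  = 1:10,
  name  = "Joe",
  value = c(.9, .89, .99, .98, .87, .89, .93, .92, .98, .9)
)

result <- data %>%
  group_by(name) %>%
  arrange(week, .by_group = TRUE) %>%
  mutate(
    low = value < threshold,
    # a new run starts on the first row, when the low/high state flips,
    # or when a week is missing (gap other than 1)
    new_run = row_number() == 1 | low != lag(low) | week - lag(week) != 1,
    run_id  = cumsum(new_run)
  ) %>%
  group_by(name, run_id) %>%
  mutate(
    # odd positions in a low run are "Yes", even positions are "Skip"
    flag = case_when(
      !low                   ~ "No",
      row_number() %% 2 == 1 ~ "Yes",
      TRUE                   ~ "Skip"
    )
  ) %>%
  ungroup() %>%
  select(-low, -new_run, -run_id)
```

On the sample data, flag should line up with the wanted column, including week 8. Because a gap in week starts a new run, a low week right after a missing week is labelled "Yes" again, which is the behaviour described above for a missing week 6.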