I am trying to find the most efficient way to identify the longest rolling sum of a vector where the value is below a specific threshold. For instance, if we had 1:10
and a threshold of 6
, then 3
is our longest rolling sum.
I have got the below code and function I've made for this, but is obviously extremely slow. Wondering if there's a more efficient or already implemented algorithm for this that identifies the longest runs with sums under the threshold.
library(dplyr)
library(zoo)
set.seed(1)
df <- data.frame(
g = rep(letters[1:10], each = 100),
x = runif(1000)
)
longest_rollsum <- function(x, threshold) {
for (i in 1:length(x)) {
rs <- rollsum(x, i, na.pad = TRUE, align = "right")
if (!any(rs <= threshold, na.rm = TRUE)) {
return(i - 1)
}
}
return(i)
}
df %>%
group_by(g) %>%
summarize(longest = longest_rollsum(x, 2))
#> # A tibble: 10 × 2
#> g longest
#> <chr> <dbl>
#> 1 a 7
#> 2 b 6
#> 3 c 9
#> 4 d 6
#> 5 e 6
#> 6 f 8
#> 7 g 6
#> 8 h 7
#> 9 i 11
#> 10 j 9