I have a column of hourly data and want to use rollapply to calculate the 24-hour rolling average for every hour. My data contains NAs, and I only want to calculate the rolling average if at least 75% of the data in a 24-hour window is available; otherwise the 24-hour rolling average should be NA.

  df %>%
    mutate(rolling_avg = rollapply(hourly_data, 24, FUN = mean, align = "right", fill = NA))

How can I modify the above code to accomplish this?

MaddieS

1 Answer

Define a function to do exactly what you stated:

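f <- function(v) {
  # Return NA when more than 25% of the window is missing
  if (sum(is.na(v)) > length(v) * 0.25) return(NA)
  # Otherwise average the values that are present
  mean(v, na.rm = TRUE)
}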
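
As a quick sanity check of the threshold (a sketch, not part of the original answer): with a 24-value window, 6 missing values is exactly 25% and still produces a mean, while a 7th missing value tips the result to NA. The helper is restated so the snippet runs on its own, and the two test windows are made up for illustration.

```r
# The helper, restated so this snippet is self-contained
f <- function(v) {
  if (sum(is.na(v)) > length(v) * 0.25) return(NA)
  mean(v, na.rm = TRUE)
}

# Hypothetical 24-hour windows, invented for this check
window_ok  <- c(rep(NA, 6), 1:18)  # 6 of 24 missing: exactly 25%, still allowed
window_bad <- c(rep(NA, 7), 1:17)  # 7 of 24 missing: more than 25%

ok_result  <- f(window_ok)   # mean of the 18 available values
bad_result <- f(window_bad)  # NA, because too much of the window is missing
```

If you would rather require strictly more than 75% of the values, change `>` to `>=` in the comparison.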

Then use it in place of mean:

df %>% mutate(rolling_avg = rollapply(hourly_data, 24, FUN = f, 
                                     align = "right", fill = NA ))
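
To see the behaviour end to end, here is a small reproducible sketch. It assumes the zoo package is installed (it provides rollapply), uses a made-up vector `x`, and shrinks the window to 4 so each window is easy to inspect; with a width of 4, the 25% rule allows at most one missing value per window.

```r
library(zoo)  # provides rollapply()

# The helper, restated so this snippet is self-contained
f <- function(v) {
  if (sum(is.na(v)) > length(v) * 0.25) return(NA)
  mean(v, na.rm = TRUE)
}

# Made-up series, for illustration only
x <- c(1, 2, NA, 4, 5, NA, NA, 8)

# Right-aligned windows of width 4; the first 3 positions are padded with NA
res <- rollapply(x, 4, FUN = f, align = "right", fill = NA)

# Windows ending at positions 4 and 5 contain one NA each, so they get a mean
# of the three available values; windows ending at 6, 7 and 8 contain two NAs
# each, so they come back as NA.
res
```

For the hourly data in the question, the same call with a width of 24 applies the rule to each 24-hour window.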
Artem Sokolov