Good day/evening all,
Hope you are well. I would be grateful if someone could give me some hints about how to solve this problem.
I have variable x with dates attached. I would like to create a rolling average of x, but I would like the average to skip the most recent ten dates, and instead take the average of the ten dates prior to that. The average should also be grouped by a categorical variable.
Here is a minimally reproducible example:
set.seed(24)
dat <- data.frame(id=1:93,
date = rep(seq(as.Date("1990/1/1"),as.Date("2020/1/1"),"years"),each=3),
group = rep(LETTERS[1:3],31),
x=rnorm(93))
For example, the new variable, rolling, for 2020, group A (id = 91), should be the average from 2001 through 2010, but only from A years. (Rolling averages for 1990-1999 don’t matter in this example.)
I’ve gotten as far as the grouped rolling average, but I haven’t figured out how to “skip” yet. I think I probably just don’t know the right vocabulary. The slider package has worked best for me so far. This code just makes a rolling average of the past ten dates:
library(dplyr)
library(slider)
newdat <- dat %>% group_by(group) %>%
mutate(rolling = slide_dbl(x, mean, .after = 9))
I am at your disposal for any further information. Thank you for reading and for any tips!