I'm trying to run a rolling stepwise regression using rollapply() and step(), but I noticed that the date range of the resulting zoo series is lagged. Below is an illustration using two years of "data" (2016-01-01 to 2017-12-31) and a lookback period of 365 days for estimating the regression parameters.
In the first case, the correct date range is generated (‘zoo’ series from 2016-12-30 to 2017-12-31). In the second case, a lagged date range is generated (‘zoo’ series from 2016-07-01 to 2017-07-02). The date range in the second case shouldn't be possible, since 365 days of observations are needed for each regression and the data begin on 2016-01-01. It seems to me that the rolling parameters themselves are correct, but their date stamps are not.
All help is appreciated, thanks.
library(zoo)
set.seed(1)  # make the simulated data reproducible
date <- seq(as.Date("2016-01-01"), as.Date("2017-12-31"), by = "days")
y <- rnorm(length(date))
x1 <- rnorm(length(date))
x2 <- rnorm(length(date))
z <- rnorm(length(date))
x3 <- z + 5
all_vars <- data.frame(date, y, x1, x2, x3)
# First case: rolling regression that works, i.e. correct date range is
# generated
rolling_reg_works <- rollapply(zoo(all_vars[, -1], order.by = date),
  width = 365, FUN = function(z) {
    fit <- lm(formula = y ~ x1 + x2 + x3, data = as.data.frame(z))
    return(coef(fit))
  }, by.column = FALSE, align = "right")
str(rolling_reg_works)
# Second case: rolling regression that doesn't work, i.e. lagged date range
# is generated
rolling_reg_doesnt_work <- rollapply(zoo(all_vars[, -1], order.by = date),
  width = 365, FUN = function(z) {
    fit <- step(lm(formula = y ~ x1 + x2 + x3, data = as.data.frame(z)),
                direction = "backward", k = 2)
    return(coef(fit))
  }, by.column = FALSE)
str(rolling_reg_doesnt_work)
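
One difference between the two calls above, apart from step(), is that the second one omits align="right", and rollapply() defaults to align="center". The sketch below is not the regression itself: it is a minimal check, assuming a toy series `s` and mean() as a stand-in for the model fit (both introduced here purely for illustration), of how the alignment alone affects the date stamps for a 365-day window over the same two years.

```r
library(zoo)

# Toy daily series over the same two-year span as in the question
# (`s` and the use of mean() are illustrative assumptions, not the original model)
dates <- seq(as.Date("2016-01-01"), as.Date("2017-12-31"), by = "days")
s <- zoo(seq_along(dates), order.by = dates)

# Same window and function; only the alignment differs
r_right  <- rollapply(s, width = 365, FUN = mean, align = "right")
r_center <- rollapply(s, width = 365, FUN = mean)  # align defaults to "center"

range(index(r_right))   # "2016-12-30" "2017-12-31"
range(index(r_center))  # "2016-07-01" "2017-07-02"

# The rolling values are the same; only their time stamps differ
all.equal(coredata(r_right), coredata(r_center))  # TRUE
```

If this is the cause, the second case would be consistent with correct parameters stamped at the middle of each window rather than at its end, and adding align="right" to that call should give the same date range as the first case.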