I have a time series and want to use the period.apply() function from the xts library to compute the mean over consecutive 377-day periods.

A reproducible example follows:

library(xts)

zoo.data <- zoo(rnorm(4231)+10, as.Date(13514:17744,origin="1970-01-01"))  # 4231 dates, so 4231 values
ep <- endpoints(zoo.data, 'days', k=377)
period.apply(zoo.data, INDEX=ep, FUN=mean)

The output generated is

2007-05-28 2007-12-31 2008-10-05 2008-12-31 2009-02-02 2009-12-31 
  9.905663   9.800760  10.006344  10.052163  10.152453  10.032073 
2010-06-13 2010-12-31 2011-10-22 2011-12-31 2012-02-18 2012-12-31 
  9.879439  10.038644   9.957582   9.977026   9.959094  10.004348 
2013-06-29 2013-12-31 2014-11-07 2014-12-31 2015-03-06 2015-12-31 
 10.004620  10.086071   9.902875   9.843695   9.851306  10.072610 
2016-07-14 2016-12-31 2017-11-23 2017-12-31 2018-03-22 2018-08-01 
  9.966911  10.199251  10.001628  10.263590  10.181235  10.059080 

The output is unexpected: consecutive end dates are not 377 days apart. It looks as if each period stops at year end (20xx-12-31) before moving on to the next endpoint.
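
One pattern that seems consistent with these dates, inferred only from the output and not checked against the xts source: each date is mapped to a key of the form year * 1000 + day-of-year, and a new period starts whenever key %/% 377 changes. Because such a key jumps at every 1 January, a break would be forced at each year end. The base-R sketch below reproduces the end dates under that assumption; the names `key`, `bucket`, `ends` and `end_dates` are made up for this illustration.

```r
# Illustration only: a hypothetical grouping key that reproduces the end dates
# shown above. This is an inference from the output, not the xts implementation.
dates  <- as.Date(13514:17744, origin = "1970-01-01")
lt     <- as.POSIXlt(dates)

# Key = calendar year * 1000 + zero-based day of year (2007000 for 2007-01-01)
key    <- (lt$year + 1900) * 1000 + lt$yday
bucket <- key %/% 377

# A period ends wherever the bucket changes, plus at the final observation
ends      <- c(which(diff(bucket) != 0), length(dates))
end_dates <- dates[ends]

head(end_dates, 4)
```

The first four values are 2007-05-28, 2007-12-31, 2008-10-05 and 2008-12-31, the same as the first four dates in the output above.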

Azam Yahya

1 Answer

I am not sure this can be solved with the endpoints function directly. Here is one way to do it with built-in functions; the solution is slightly more general than the question requires. If you uncomment the two commented lines in the code below, it prints a message when the last interval has fewer than k observations.

library(xts)

apply.fun <- function(data, variable=1, fun=mean, k=377) {  # variable: variable name or column index
  data      <- as.xts(data)
  variable  <- data[, variable, drop=TRUE]
  idx       <- index(data)
  byindex   <- as.integer(idx - first(idx)) %/% k           # interval identifiers: elapsed days %/% k
  endates   <- idx[!duplicated(byindex, fromLast=TRUE)]     # last date of each interval
  ans       <- setNames(tapply(variable, byindex, fun), endates)
  #inter.end <- sum(byindex==last(byindex))
  #if(inter.end < k) cat(sprintf("Last interval has fewer observations: %d<k=%d\n\n", inter.end, k))
  return(as.xts(as.matrix(ans)))
}

set.seed(147)
zoo.data <- zoo(rnorm(4231)+10, as.Date(13514:17744,origin="1970-01-01"))  # 4231 dates, so 4231 values

apply.fun(zoo.data, 1, mean)   

 #                 [,1]
 # 2008-01-12 10.043735
 # 2009-01-23 10.042741
 # 2010-02-04  9.957842
 # 2011-02-16 10.016998
 # 2012-02-28  9.932871
 # 2013-03-11  9.932731
 # 2014-03-23 10.045344
 # 2015-04-04 10.015821
 # 2016-04-15 10.015023
 # 2017-04-27 10.038887
 # 2018-05-09  9.978744
 # 2018-08-01 10.004074
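
If the series has exactly one observation per calendar day with no gaps, as in the example, another option might be to skip endpoints() and give period.apply() a hand-built index that cuts every k rows. This is only a sketch under that assumption; `ep` and `res` are illustrative names, and with missing days the row-based cuts would no longer correspond to 377 calendar days.

```r
library(xts)

set.seed(147)
zoo.data <- zoo(rnorm(4231) + 10, as.Date(13514:17744, origin = "1970-01-01"))

k  <- 377
n  <- NROW(zoo.data)
# INDEX for period.apply(): starts at 0, cuts every k rows, ends at the last row
ep <- unique(c(seq(0, n, by = k), n))

res <- period.apply(zoo.data, INDEX = ep, FUN = mean)
res
```

With k = 377 this gives 11 full periods and one shorter final period, and the period end dates should coincide with those in the output above.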