Let's say I have a simple toy vector in R like:

x = seq(1:10);x
 [1]  1  2  3  4  5  6  7  8  9 10

I want to use the rollapply function from the zoo package, but in a different way. rollapply applies a function to a vector x over a rolling window whose size is set by the width argument. Instead of a rolling window, I want an expanding one. There are similar questions here and here, but they don't solve my problem.

For example, I want to calculate the sum of the first 5 observations of x and then expand the window by 2 observations at a time.

Here is what I tried:

rollapplyr(x, seq_along(x) ,sum,by=2,partial = 5,fill=NA)
 [1] NA NA NA NA 15 21 28 36 45 55

or, with a fixed width of 5 and the NAs filled forward:

na.locf0(rollapplyr(x, 5 ,sum,by=2,partial = 5,fill=NA))
[1] NA NA NA NA 15 15 25 25 35 35

But what I ideally want as a result is:

 [1] NA NA NA NA 15 15 28 28 45 45
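
To make the target concrete, here is a minimal base R sketch (no zoo needed) of the sums behind that desired result. The names `endpoints` and `window_sums` are only illustrative; every window starts at the first observation and ends at 5, 7 and 9.

```r
x <- 1:10

# Window end points: start at 5, then grow the window by 2 each time
endpoints <- seq(5, length(x), by = 2)

# Each window always starts at the first observation
window_sums <- sapply(endpoints, function(i) sum(x[seq_len(i)]))
window_sums
#> [1] 15 28 45
```

Those three values sit at positions 5, 7 and 9 of the desired output, and each one is carried forward to the next position.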

Imagine that my dataset is large (2500 time series observations) and that the function is an econometric or statistical model, not something as simple as the sum I use here.

How can I do this?

Homer Jay Simpson

2 Answers

x <- seq(10)

expandapply <- function(x, start, by, FUN){
  # set points to apply function up to
  checkpoints <- seq(start, length(x), by)
  # apply function to all windows
  vals <- sapply(checkpoints, function(i) FUN(x[seq(i)]))
  # fill in numeric vector at these points (assumes output is numeric)
  out <- replace(rep(NA_real_, length(x)), checkpoints, vals)
  # forward-fill the gaps
  zoo::na.locf(out, na.rm = FALSE)
}


expandapply(x, start = 5, by = 2, FUN = sum)
#>  [1] NA NA NA NA 15 15 28 28 45 45

Created on 2022-03-13 by the reprex package (v2.0.1)
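
Since the real function is a model rather than a sum, here is a hedged sketch of the same idea in base R only, refitting a model at each checkpoint. The names `expanding_apply` and `slope_fun` and the toy series `y` are hypothetical, and the sketch assumes the function returns a single number per window.

```r
# Hypothetical base R variant: apply FUN to x[1:i] at each checkpoint,
# then carry the last computed value forward.
expanding_apply <- function(x, start, by, FUN) {
  checkpoints <- seq(start, length(x), by = by)
  # one numeric value per expanding window (assumes FUN returns a scalar)
  vals <- vapply(checkpoints,
                 function(i) as.numeric(FUN(x[seq_len(i)])),
                 numeric(1))
  out <- rep(NA_real_, length(x))
  out[checkpoints] <- vals
  # forward-fill: index of the most recent non-NA position, NA before the first
  idx <- cummax(ifelse(is.na(out), 0L, seq_along(out)))
  idx[idx == 0] <- NA
  out[idx]
}

# Toy model function: OLS slope of the window against its time index
slope_fun <- function(v) {
  time_idx <- seq_along(v)
  unname(coef(lm(v ~ time_idx))[2])
}

expanding_apply(1:10, start = 5, by = 2, FUN = sum)
#>  [1] NA NA NA NA 15 15 28 28 45 45

# Made-up series with an exact linear trend, refit at observations 4, 7, 10
y <- 2 * seq_len(12) + 1
slopes <- expanding_apply(y, start = 4, by = 3, FUN = slope_fun)
```

With 2500 observations and by = 2 the model is fitted roughly 1250 times, so the cost is dominated by the model itself; a larger by reduces the number of fits.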

IceCreamToucan

Define nonNA as the positions which should not be NA. You can change x and nonNA to whatever you need.

Then assign to w a vector of widths, using zero for the components that are to be NA. Finally, apply na.locf0.

(There are two extreme cases. If nonNA is seq_along(x), so that no element is NA'd out, this is the same as rollapplyr(x, seq_along(x), sum). If nonNA is c(), so that there are no non-NAs, it returns all NAs.)

library(zoo)

x <- 1:10
nonNA <- seq(5, length(x), 2)

w <- ifelse(seq_along(x) %in% nonNA, seq_along(x), 0)
na.locf0(rollapplyr(x, w, function(x) if (length(x)) sum(x) else NA, fill=NA))
##  [1] NA NA NA NA 15 15 28 28 45 45
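
As a quick check in base R (a sketch, with x and nonNA redefined so it runs on its own), this is the width vector that gets passed to rollapplyr: the full window length at positions 5, 7 and 9, and zero everywhere else.

```r
x <- 1:10
nonNA <- seq(5, length(x), 2)

# Width equals the position itself where a value is wanted, 0 otherwise
w <- ifelse(seq_along(x) %in% nonNA, seq_along(x), 0)
w
##  [1] 0 0 0 0 5 0 7 0 9 0
```

A zero width hands the function an empty window, which is why the anonymous function above returns NA when length(x) is 0.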

Another way is to use a list for the width= argument of rollapply whose components contain the offsets. x and nonNA are as defined above.

L <- lapply(seq_along(x), function(x) if (x %in% nonNA) -seq(x-1, 0))
na.locf0(rollapplyr(x, L, sum, fill = NA))
##  [1] NA NA NA NA 15 15 28 28 45 45
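
To see what that list holds, here is a small base R sketch (x and nonNA are redefined so it is self-contained). Positions in nonNA get the offsets reaching back to the first observation, and every other position is NULL.

```r
x <- 1:10
nonNA <- seq(5, length(x), 2)

L <- lapply(seq_along(x), function(x) if (x %in% nonNA) -seq(x - 1, 0))

L[[5]]  # offsets for position 5: from 4 steps back up to the current point
## [1] -4 -3 -2 -1  0
is.null(L[[1]])  # position 1 is not in nonNA
## [1] TRUE
```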

Update

Simplified solution and added second approach.

G. Grothendieck