Moving average with changing period in R

Question

I have a data frame named abc on which I compute a moving average using rollapply from the zoo package. The following code works:

forecast <- rollapply(abc, width=12, FUN=mean, align = "right", fill=NA)

Now I want to do the same thing with a variable width. For the 1st month the result should be empty; for the 2nd month it should be the 1st month's value; for the 3rd month it should be (first month + second month)/2. In general, for the ith month with i <= 12 the value should be the mean of months 1 to i-1, i.e. sum(x[1:(i-1)])/(i-1), and for i > 12 it should be the average over a 12-month window, as in forecast above. Please help.
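The rule described above can be written as a plain base R loop, which is useful as a reference to check any rollapply-based solution against. This is only a sketch under assumptions: lagged_mean is a hypothetical helper name, x stands in for one column of abc, the data are made up, and a window of 3 is used instead of 12 so the result can be checked by hand.

```r
# Hypothetical helper: mean of up to `k` values strictly before position i
lagged_mean <- function(x, k = 12) {
  vapply(seq_along(x), function(i) {
    if (i == 1) return(NA_real_)   # nothing precedes the first element
    idx <- max(1, i - k):(i - 1)   # at most k earlier positions
    mean(x[idx])
  }, numeric(1))
}

x <- c(2, 4, 6, 8, 10)   # made-up data
res <- lagged_mean(x, k = 3)
res
# NA 2 3 4 6
```

With k = 12 the window grows from 1 to 12 previous values and then stays at 12.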

G. Grothendieck · Accepted Answer · 2014-07-08T12:25:45.950


Here are some approaches. In each, x is the series to be averaged:

1) partial=TRUE

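library(zoo)  # provides rollapply and rollapplyr
n <- length(x)
# partial = TRUE averages over however many values are available;
# dropping the last value and prepending NA lags the result by one position
c(NA, rollapplyr(x, 12, mean, partial = TRUE)[-n])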

Note the r at the end of rollapplyr.
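As a small worked illustration of approach 1, the sketch below uses made-up data and a window of 3 instead of 12 (both are assumptions chosen so the numbers are easy to verify by hand). It assumes the zoo package is installed.

```r
library(zoo)  # rollapplyr is rollapply with align = "right"

x <- c(2, 4, 6, 8, 10)   # made-up data
k <- 3                   # small window for illustration (12 in the answer)
n <- length(x)

# Means over the current value and up to k - 1 earlier values
current <- rollapplyr(x, k, mean, partial = TRUE)
current
# 2 3 4 6 8

# Drop the last value and prepend NA so each position only sees earlier values
lagged <- c(NA, current[-n])
lagged
# NA 2 3 4 6
```

The first element is NA because no earlier values exist, the second is x[1], the third is the mean of x[1:2], and from the fourth element on the mean covers the previous 3 values.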

2) width as list

The width argument of rollapply can be a list such that the ith list element is a vector of the offsets to use for the ith rolling computation. If we specify partial=TRUE then offsets that run off the end of the vector will be ignored. If we only specify one element in the list it will be recycled:

rollapply(x, list(-seq(12)), mean, partial = TRUE, fill = NA)

2a) Rather than recycling and depending on partial we can write it out. Here we want width <- list(numeric(0), -1, -(1:2), -(1:3), ..., -(1:12), ..., -(1:12)) which can be calculated like this:

# offsets of up to 12 previous positions; i is the position within x
width <- lapply(seq_along(x), function(i) -seq_len(min(12, i - 1)))
rollapply(x, width, mean)
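To see what the offset list in 2a looks like, it can be built and inspected on its own in base R. This is a sketch with an assumed series length of 15 (a hypothetical value; in practice it would be length(x)).

```r
n <- 15   # hypothetical series length
width <- lapply(seq_len(n), function(i) -seq_len(min(12, i - 1)))

width[[1]]       # integer(0): the first position has no previous values
width[[3]]       # -1 -2
lengths(width)   # 0 1 2 3 4 5 6 7 8 9 10 11 12 12 12

# The mean of an empty vector is NaN, which is why the first result
# of this variant is NaN rather than NA
mean(numeric(0))
# NaN
```

The number of offsets is capped at 12 from the 13th position onwards. If NA is preferred over NaN for the first element, the result (say res, a hypothetical name) can be patched afterwards with res[is.nan(res)] <- NA.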

This variant is mainly of interest if you want to modify the specification slightly, because the offsets for each position can be set independently.

Note: Later in the comments the poster asked for the same rolling average but without the lag, i.e. with the current value included in each window. That would be just:

rollapplyr(x, 12, mean, partial = TRUE)

Note the r at the end of rollapplyr.
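The difference between the lagged and the non-lagged version can be checked on a concrete position. The sketch below uses x <- 1:30 as made-up data (an assumption for illustration) and assumes the zoo package is installed.

```r
library(zoo)

x <- 1:30   # made-up data
k <- 12

# Non-lagged: each window ends at the current position
current <- rollapplyr(x, k, mean, partial = TRUE)

# Lagged: the same values shifted down by one position
lagged <- c(NA, current[-length(x)])

current[25]   # mean(x[14:25])
# 19.5
lagged[25]    # mean(x[13:24])
# 18.5

length(current) == length(x)
# TRUE
```

With partial = TRUE and right alignment the result has the same length as the input, so no padding is needed for the non-lagged version.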

Update: Some improvements and additional solutions.

edited Jul 08 '14 at 12:25

answered Jul 08 '14 at 10:47


@g-grothendieck Thanks. But here, what `width` is doing is taking all the previous values. Say: `width[10]` `[1] -1 -2 -3 -4 -5 -6 -7 -8 -9` This is fine. But, `width[15]` `[1] -1 -2 -3 -4 -5 -6 -7 -8 -9 -10 -11 -12 -13 -14` This is not what I want. I want what you did up to [12] but, after 12, it should not take all the previous values, only the last 12 values, i.e. width is constant at 12 from the 13th element onwards. So, `width[15]` will be: `-3 -4 -5 -6 -7 -8 -9 -10 -11 -12 -13 -14` Please see. – user3815746 Jul 08 '14 at 11:27
Have modified to take at most 12 values. – G. Grothendieck Jul 08 '14 at 11:30
@g-grothendieck Thanks again, it serves my purpose, but I couldn't figure out a small thing. What change do I have to make, if I want to shift the results up? I mean I want `width[1]` to be -1 instead of `integer(0)`, `width[2]` to be `-1 -2` and so on. Say, in the 25th row, I would like `sum(x[14]:x[25])/12` instead of `sum(x[13]:x[24])/12`. Thanks. – user3815746 Jul 08 '14 at 11:48
Remove the -1 or just `rollapplyr(x, 12, mean, partial = TRUE)`. Also note the simpler solution just added. – G. Grothendieck Jul 08 '14 at 11:54
I am getting different results. Using `width` without -1: `[1] 0.08830709 0.08916036 0.09122831 0.09407099 0.09144008 0.09331063 0.09433104` `[8] 0.09823444 0.09457298 0.08625747 0.08442378 0.08149499 0.07916721 0.07667792` Using `rollapply(..,partial = TRUE)`: `[1] 0.08055410 0.08334492 0.08765026 0.08788178 0.08664798 0.08830709 0.08916036 0.09122831 0.09407099 0.09144008 0.09331063 0.09433104 0.09823444 0.09457298 0.08625747 0.08442378 0.08149499` `[18] 0.07916721 0.07667792 0.07836255 0.07783986 0.08256117 0.08351636 0.08145590 0.06648748 0.06774488` Even the number of data points is different. – user3815746 Jul 08 '14 at 12:05
See Note at the end of the answer and don't forget the r at the end of `rollapplyr`. – G. Grothendieck Jul 08 '14 at 12:10
Nice answer @G.Grothendieck, I wasn't aware of the `rollapplyr` function before now – jogall Jul 08 '14 at 12:50
@G.Grothendieck, thanks for your excellent answer. But the method under 2a, using width, gives the following result `[1] NaN 1.000000 1.500000 2.000000 2.500000 3.200000 3.833333 3.714286 [9] 3.750000 3.888889 4.300000` while `rollapplyr` gives `[1] 1.000000 1.500000 2.000000 2.500000 3.200000 3.833333 3.714286 3.750000 [9] 3.888889 4.300000 4.454545` How do I get the last value using the width approach? – user3815746 Jul 10 '14 at 10:12
The note refers to a different problem than the prior code, so clearly the answer would be different. – G. Grothendieck Jul 10 '14 at 10:41
