0

Given a simple vector of 82 observation

x = c(102, 104, 89, 89, 76, 95, 88, 112, 81, 101, 101, 104, 94, 111, 108, 104, 93, 92, 86, 113, 93, 100, 92, 80, 92, 126, 102, 109, 104, 95, 84, 81, 103, 83, 103, 83, 58, 109, 89, 93, 104, 104, 123, 104, 93, 76, 103, 103, 100, 105, 108, 90, 122, 103, 114, 102, 87, 98, 88, 107, 102, 80, 81, 96, 107, 105, 113, 98, 93, 104, 94, 107, 107, 97, 102, 82, 90, 97, 124, 109, 96, 92)

I would like to perform EMA (Exponential Moving Average) of this vector in such a way:

  • The 1st element of the new vector should be NA

  • The 2nd element should be the first element of the original vector

  • The 3rd element should be the EMA of the first and second elements of the original vector

  • The 4th element should be the EMA of the first three elements of the original vector

    ...

  • The 82th element should be the EMA of all the values of the original vector except for the last one

The ideas are to to place greater weight on the most recent vector elements and that also the last elements of the new vector is affected (although infinitesimally) by the first elements of the orginal vector.

I tried achieving this using the function EMA from the package TTR and lag from dplyr

> library(dplyr)
> library(TTR)
> lag(EMA(x, 1, ratio = 2/(81+1)))

 [1]        NA 102.00000 102.04878 101.73052 101.42002 100.80002 100.65855 100.34981 100.63396 100.15508 100.17569
[12] 100.19579 100.28858 100.13520 100.40020 100.58556 100.66884 100.48179 100.27492  99.92675 100.24561 100.06889
[23] 100.06721  99.87045  99.38580  99.20566  99.85918  99.91139 100.13307 100.22738 100.09989  99.70721  99.25093
[34]  99.34237  98.94378  99.04271  98.65143  97.65993  97.93651  97.71855  97.60346  97.75948  97.91168  98.52360
[45]  98.65717  98.51919  97.96994  98.09262  98.21231  98.25592  98.42041  98.65405  98.44298  99.01754  99.11468
[56]  99.47773  99.53925  99.23342  99.20333  98.93008  99.12691  99.19698  98.72876  98.29635  98.24035  98.45400
[67]  98.61365  98.96454  98.94102  98.79611  98.92304  98.80296  99.00289  99.19794  99.14433  99.21398  98.79413
[78]  98.57964  98.54111  99.16206  99.40201  99.31903

But that's definetely not the result I was looking for.... what I'm doing wrong? I wasn't able to find any comprehensive documentation about ratio argument on internet, and I'm not sure I've got this clear. Can anyone please help me?

To get things more clear: the result i reached until now is the following:

> library(runner)
> mean_run(x, k = 7, lag = 1)

 [1]        NA 102.00000 103.00000  98.33333  96.00000  92.00000  92.50000  91.85714  93.28571  90.00000  91.71429
[12]  93.42857  97.42857  97.28571 100.57143 100.00000 103.28571 102.14286 100.85714  98.28571 101.00000  98.42857
[23]  97.28571  95.57143  93.71429  93.71429  99.42857  97.85714 100.14286 100.71429 101.14286 101.71429 100.14286
[34]  96.85714  94.14286  93.28571  90.28571  85.00000  88.57143  89.71429  88.28571  91.28571  91.42857  97.14286
[45] 103.71429 101.42857  99.57143 101.00000 100.85714 100.28571  97.71429  98.28571  97.85714 104.42857 104.42857
[56] 106.00000 106.28571 103.71429 102.28571 102.00000  99.85714  99.71429  94.85714  91.85714  93.14286  94.42857
[67]  96.85714  97.71429  97.14286  99.00000 102.28571 102.00000 102.00000 102.28571 100.00000 100.57143  99.00000
[78]  97.00000  97.42857  99.85714 100.14286 100.00000

So this is the Simple Moving Average (SMA) over k=7 observations i obteneid with mean_run function from runner package. Now i would like to "improve" this moving average by placing exponential increasing weights on each observations and making sure that also the last element is affected by the first one (the weight for that observation should be as close as possible to 0). That means that the window sizes for the rolling average will be:

  • n=0 for the 1st element (i.e. NA)

  • n=1 for the 2nd element (i.e the 1st element of the original vector)

  • n=2 for the 3d element (i.e. the EMA of the 1st and 2nd elements)

  • n=3 for the 4th element (i.e. the EMA of the 1st,2nd and 3rd elements)

    ...

  • n=81 for the 82th element (i.e. the EMA of the first 81 elements)

I still wasn't able to find any good documentation about the ratio arguments (i.e alpha), but i think it should be settled aribitrarily, but I'm not sure about that

Machavity
  • 30,841
  • 27
  • 92
  • 100
  • 1
    The phrase "exponential moving average" implies a trailing function application. You seem to be asking for a forward-looking function application. Perhaps you would be happier with `EMA()` applied to a reversed version of that vector? OR ... you could give a mathematical definition of what you conceive to be an "EMA". – IRTFM May 25 '21 at 01:08
  • How would you define EMA mathematically ? – Onyambu May 25 '21 at 01:19
  • Maybe. I cannot really know unless you describe your mathematical expectations. – IRTFM May 25 '21 at 01:20
  • @IRTFM I wrote in my question how i expect the vector to be. I'm trying to perform a rolling mean that places exponential weights on each element: the last elements has greater influence than the first elements – Alberto Ferrario May 25 '21 at 01:38
  • For the given x above, could you give the first 5 expected values? ie the first is `NA` the second will be 102, what is the 3rd, 4th and 5th numbers that you expect? – Onyambu May 25 '21 at 01:39
  • Also define your alpha – Onyambu May 25 '21 at 01:40
  • @Onyambu how can i estimate alpha? – Alberto Ferrario May 25 '21 at 01:51
  • Why do you want to estimate alpha? Are you doing some modeling? You havent solved the EMA problem yet. This question hasn't been answered yet because you havent given all the necessary information – Onyambu May 25 '21 at 01:53
  • @Onyambu I dont know the exact results, but for exemple the 3rd will be a weighted average of 102 and 104, with greater weights on 104, so somehting like 103.2. But I thought that EMA function deals with these estimations by itslef – Alberto Ferrario May 25 '21 at 01:54
  • Well it is advisable to know how to compute the values by hand before using software. In that case you can be able to tell whether the results are correct or not – Onyambu May 25 '21 at 02:02
  • doesn`t `EMA(x, 2)` solve your question? – Onyambu May 25 '21 at 02:19
  • @Onyambu I edited my original question, I hope it is more clear now, and no, the code you wrote it doeasnt solve it – Alberto Ferrario May 25 '21 at 09:41
  • Have you tried `EMA(x, 2)`? This seems to do what you want – Onyambu May 25 '21 at 15:57
  • 3
    Please do not vandalize your posts. By posting on this site, you've irrevocably granted the Stack Exchange network the right to distribute that content under the CC BY-SA 4.0 license for as long as it sees fit to do so. For alternatives to deletion, see: [I've thought better of my question; can I delete it?](https://stackoverflow.com/help/what-to-do-instead-of-deleting-question) – Ollie Jun 04 '21 at 13:24

1 Answers1

1

Assuming that you intended what you wrote, i.e. a lagged exponential moving average and not a lagged weighted moving average defined in a comment, we define the iteration in iter and then use Reduce like this.

alfa <- 2/(81+1)
iter <- function(y, x) alfa * x + (1-alfa) * y
ema <- c(NA, head(Reduce(iter, tail(x, -1), init = x[1], acc = TRUE), -1))

# check

identical(ema[1], NA_real_)
## [1] TRUE
identical(ema[2], x[1])
## [1] TRUE
identical(ema[3], alfa * x[2] + (1-alfa) * x[1])
## [1] TRUE
identical(ema[4], alfa * x[3] + (1-alfa) * ema[3])
## [1] TRUE

The lagged weighted moving average in the comment is not an exponential moving average and is unlikely what you want but just to show how to implement it if the second argument to rollapply is a list containing a vector then that vector will be regarded as the offsets to use.

library(zoo)
c(NA, x[1], rollapplyr(x, list(-seq(2)), weighted.mean, c(alfa, 1-alfa)))
G. Grothendieck
  • 254,981
  • 17
  • 203
  • 341
  • You need to refine your idea of what you want and present it clearly with input and hand calculated output; however, it is likely what has already been shown in the answer is what you want and you just need to recognize that an exponential moving average can be represented in multiple ways as per link in the answer. – G. Grothendieck May 25 '21 at 14:25