
This R script generates ensembles of time series. Each series is derived from the function f(t) = alpha * f(t-1) + epsilon, where epsilon is a random number drawn from a normal distribution.

The final result is a list of ensembles generated from different values of alpha.

How can it be vectorized? Using base functions would be great, but solutions that require additional packages are welcome too.

steps <- 1000 # number of times each "pseudo random walk" is iterated
N <- 5 # number of walks in each ensemble

mu <- 0 # normal distribution mean
mysd <- 1 # normal distribution standard deviation

alphas <- c(0, 0.5, 0.7, 1, 1.5, 2) # set of different alphas to generate ensembles with


# Pseudo random walk generator
generate.rw <- function(steps, alpha, mu, mysd) {
  epsilons <- rnorm(n = steps, mean = mu, sd = mysd)
  rw <- numeric(steps)
  rw[1] <- epsilons[1]
  for (i in 2:steps) rw[i] <- alpha * rw[i-1] + epsilons[i]
  return(rw)
}

# Ensemble generator
ensemble <- function(N, steps, alpha, mu, mysd) {
  result <- matrix(NA_real_, N, steps)
  for (i in 1:N) result[i,] <- generate.rw(steps, alpha, mu, mysd)
  return(result)
}

# Get a list of ensembles with different values of alpha
ensembles <- lapply(alphas, ensemble, steps = steps, N = N, mu = mu, mysd = mysd)
HAVB

2 Answers


You may start with using

filter(rnorm(steps, mu, mysd), alpha, "recursive")

for generate.rw and

replicate(generate.rw(steps, alpha, mu, mysd), n = N)

for ensemble. By the way, with alpha different from 1 it is not really a random walk; look up autoregressive processes of order 1 (AR(1) for short) and ?arima.sim (an alternative to filter).
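Putting the two pieces together, a minimal sketch of the fully vectorized version (reusing the question's parameters) might look like this:

```r
steps <- 1000; N <- 5              # as in the question
mu <- 0; mysd <- 1
alphas <- c(0, 0.5, 0.7, 1, 1.5, 2)

# filter(..., "recursive") computes rw[i] = alpha * rw[i-1] + eps[i] with a
# zero initial condition, so rw[1] = eps[1], matching the loop version.
generate.rw <- function(steps, alpha, mu, mysd) {
  as.numeric(filter(rnorm(steps, mu, mysd), alpha, "recursive"))
}

# replicate() returns a steps-by-N matrix here, so transpose to keep the
# original N-by-steps layout.
ensemble <- function(N, steps, alpha, mu, mysd) {
  t(replicate(N, generate.rw(steps, alpha, mu, mysd)))
}

ensembles <- lapply(alphas, ensemble, steps = steps, N = N, mu = mu, mysd = mysd)
```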

Julius Vainora
  • Simple and clean. Excellent. Also, thanks for the heads up about incorrect use of the term "random walk". I have changed the title of my question following your suggestion. – HAVB Feb 18 '16 at 02:19

You might want to (re)visit the Wold theorem. The idea is that if you recursively solve your AR(1), it becomes easy to vectorize. Strictly speaking, the Wold theorem only applies/makes sense when the series does not follow a stochastic trend (i.e. alpha is less than 1), but more on that later.

This is the recursive solution to the model Y_t = alpha * Y_{t-1} + epsilon_t:

Y_t = Sum_{i=0}^{t-1} alpha^i * epsilon_{t-i}

A vectorized solution is now obvious.

res <- vector("list", length(alpha))

for (i in seq_along(alpha)) {
  epsilon <- rnorm(n = steps, mu, mysd)
  alpha_power <- alpha[i]^seq(0, steps - 1)
  res[[i]] <- alpha_power %*% epsilon
  # or, if you want to save each Yt: alpha_power * epsilon
}

The above code still loops over the alphas. There are ways to avoid that loop too, but given the relatively small number of alphas I feel there is no need. The most important thing is that I vectorized the most expensive part of your code (i.e. the part that required numerous iterations).

This approach will be faster than Julius' approach, I think, because it truly is vectorized. I believe replicate is part of the apply family, though I could be wrong.

Finally, when alpha > 1 your model does not really make any sense. When alpha < 1, as you can see in the equation above, the shocks die off and the most recent shocks are given the most weight; e.g. 0.5^100 * 0.5 is essentially zero. When alpha > 1 the weight on a shock increases over time; e.g. 2^100 * 0.5 is really, really big. Said another way, your model essentially has no predictive power, as your series can be pretty much anywhere after only a few steps into the future.
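A quick numeric check of those weights illustrates the two regimes:

```r
# Weight on a shock from 100 steps in the past, for alpha = 0.5 vs alpha = 2.
0.5^100  # ~7.9e-31: the old shock has effectively vanished
2^100    # ~1.27e+30: the old shock dominates everything that came after it
```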

Last thing: as Julius suggests, you should change the title of your question. An AR(1) is a random walk if and only if alpha = 1.

Jacob H
  • I like how you frame the problem using a recursive definition. Checking Wold's theorem was definitely useful, I appreciate the tip. Alas, the code you provided does not quite work: for alpha = 2, steps = 3, and epsilons = c(e1, e2, e3) it gives e1 + 2*e2 + 4*e3, when it should be e1, then 2*e1 + e2, then 2*(2*e1 + e2) + e3. BTW, the reason I'm using alphas that make no sense for prediction (> 1) is that this is actually a class exercise, so I assume the idea is to help us differentiate when a process can be described as a random walk and when it can't – HAVB Feb 18 '16 at 02:31
  • Read this http://www.econ.ohio-state.edu/dejong/note2.pdf. I was a bit fast and loose with my terms; I meant recursive substitution. For example, if you have Yt = aYt-1 + et, then Yt-1 = aYt-2 + et-1 and Yt-2 = aYt-3 + et-2. If you recursively substitute you get Yt = a^3Yt-3 + a^2et-2 + aet-1 + et. Keep recursively substituting until you get to Y0, which yields the formula outlined above. For a more formal and general answer, look into a solution with a lag operator. Take another look and I think you'll see I'm right. If not, let me know. I had a hard time following your math, so I might be missing something. – Jacob H Feb 18 '16 at 05:43
  • Thanks for the note. It makes sense that this is a course assignment. – Jacob H Feb 18 '16 at 05:44
  • I checked the ARMA lecture, ty. About your code: run it and then compare with filter(epsilon, alpha[1], "recursive"). You'll see that the result is different. Same if you compare it with my code, which gives the same result as filter(). – HAVB Feb 18 '16 at 15:55
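The comparison described in that last comment can be sketched on a tiny example (hedged: this just reproduces the discrepancy the commenter points out, using the answer's ascending-power inner product versus the recursive filter):

```r
set.seed(1)
steps <- 3; a <- 2
epsilon <- rnorm(steps)

# The answer's approach: inner product with ascending powers of alpha,
# i.e. 1*e1 + 2*e2 + 4*e3.
inner_prod <- as.numeric(a^seq(0, steps - 1) %*% epsilon)

# The recursive definition: y1 = e1, y2 = 2*e1 + e2, y3 = 4*e1 + 2*e2 + e3.
y <- as.numeric(filter(epsilon, a, "recursive"))

inner_prod == y[steps]  # FALSE in general: the powers are applied in reverse
```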