
x(n) = sum_{j=1}^{p} a_j x(n-j) + u(n)

is a univariate autoregressive (AR) model of order p = 2, with N data samples, excited by u(n), a zero-mean Gaussian noise with variance sigma_u^2.

I am aware of the function `cov(x)`, but how do I use it to get the p-by-p covariance matrix C(n), and its inverse S(n) = C(n)^(-1), for the coefficients of the AR model? The mathematical formula for the inverse covariance matrix S(n) is

S(n) = (1/sigma_u^2) * (A_1*A_1' - A_2*A_2')

where A_1 and A_2 are the lower triangular Toeplitz matrices.

(image: the structure of the lower triangular Toeplitz matrices A_1 and A_2)
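For concreteness, my attempt so far looks like the sketch below. The first-column conventions I pass to `toeplitz` are guesses on my part (different sources define A_2 differently), and the coefficient values are placeholders, not estimates from real data:

```matlab
% Attempted sketch -- the first-column conventions for A_1 and A_2 are
% guesses and may not match the formula in the image above.
p = 2;
a = [0.5; -0.3];     % placeholder AR coefficient estimates a_1, a_2
sigma_u2 = 1;        % known noise variance

% Lower triangular Toeplitz matrices:
A1 = toeplitz([1; -a(1:p-1)], [1, zeros(1, p-1)]);
A2 = toeplitz([0;  a(p:-1:2)], zeros(1, p));

S = (A1*A1' - A2*A2') / sigma_u2;  % inverse covariance matrix S(n)
```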

Can I directly do `pinv(cov(coefficients))`? I am also unsure how to pass the AR coefficients as arguments into the function.

How do I implement this formula? Thank you for your help.

SKM
  • As you can tell, StackOverflow doesn't support TeX/LaTeX. Please remove it (and definitely don't format it as code unless it actually is code). If you absolutely need to show an equation, use an image as you've already done. – horchler Apr 28 '15 at 23:26
  • @horchler : I have put up an image for the Equation of the AR model, thank you :) – SKM Apr 29 '15 at 01:27
  • I know the `cov` stuff but not ARs. What's the output of the AR model you have? The `cov()` function is for numerically calculating covariances of variables, given a bunch of observations/samples of the variables themselves (input = an observations x variables array); not sure that's the case here. If you want the covs of the coefficients, do you have a formula to produce samples of the coefficients themselves? (Are the `a_i` values those coefficients?) There may be a closed form solution for the AR, but I don't know what it is. – Andrew Janke Apr 29 '15 at 01:58
  • 2
    @AndrewJanke: Thank you for your input & observation. The coefficients a_i's are estimated using Maximum Likelihood estimator. The lower bound of the Mean Square error of the coefficients is generally determined by the Cramer Rao Lower bound (CRLB). The formula for CRLB contains the term inverse of the covariance of the pbyp matrix of the coefficients. I know that covariance of the data can be found, but I do not know how to find for the coefficients case for any time series model. The output of the AR model is a one dimension time series from which we estimate the unknown coefficients. – SKM Apr 30 '15 at 16:07
  • Thanks for the extra info. I'm afraid I'm out of my depth here. Hope someone else who's more familiar with AR models can chime in. – Andrew Janke Apr 30 '15 at 16:42
  • 1
    I think your question is ill-posed. Considering your comment, I'm assuming that you are not interested in the covariance of the time series, but rather in the covariance of the Maximum Likelihood (ML) estimate of the coefficient of the model. If this is true, the `cov` function is not what you are looking for. You should simply build the `p`-by-`p` matrices `A_1` and `A_2` using the estimated coefficients `a_j`, given that the ML estimator also provides an estimate of the noise variance `sigma` (supposedly it does). – lmillefiori Apr 30 '15 at 16:50
  • @lmillefiori: You are right, and this I what I do not know how to do -- how to build the covariance matrices using the estimated coefficients and the noise variance is known (not estimated). Can you help me with the code for this? Thank you for your valuable insights. – SKM May 01 '15 at 01:06
  • @SKM: You are really close to the solution... If you need more help, you need to provide more information on the workings of the specific ML estimator. There's something unclear in how the matrix `A_2` is defined... – lmillefiori May 04 '15 at 08:08

1 Answer


Your question is ill-posed, as others have noted, so I would encourage you to do some reading and think about what you really want to do first. Nonetheless, I can provide some guidance, which has more to do with straightening out the concepts than with programming. Based on the question and the comments, your question should be rephrased as: "How does one estimate the covariance matrix for the parameters of an AR(p) model in which the parameters are obtained via maximum likelihood estimation?" In that case we can answer as follows.

  1. For the sake of discussion, let's first generate some test data. (The original snippet never defined the response `y` used below; the line added here is consistent with the estimates that follow.)
>> rng(0);
>> n = 10000;
>> x = rand(n,2);
>> y = 5*x(:,1) + 10*x(:,2) + 20 + randn(n,1);
  2. First obtain initial conditions smartly by running OLS. This is how most people estimate autoregressions anyway (nearly the entire literature on vector autoregressions), and OLS and MLE are asymptotically equivalent under some conditions. Include a column of ones if you want to include a constant (which you do want if you are working with non-demeaned data). In the output, b is your parameter estimate vector and C is the corresponding vector of standard errors.
>> X = [x,ones(n,1)];
>> [b,C]=lscov(X,y)

b =

    4.9825
    9.9501
   20.0227


C =

    0.0347
    0.0345
    0.0266
  3. Before you can do maximum likelihood, you need a likelihood. We can construct one as the composition of a couple of simple anonymous functions. We also need an initial-conditions vector that includes the standard deviation of the prediction error, because this is one of the things we are going to estimate by MLE. I am going to assume you want a Normally distributed error and base my example on that, but you didn't say, and you could of course choose anything (a non-Normal error is typically the motivation for not doing OLS in the first place). Also, I will build a negative log-likelihood, since we are using minimization routines.
>> err = @(b) y - X*b

err = 

    @(b)y-X*b

>> std(err(b))

ans =

    0.9998

>> b0 = [b; std(err(b))];
>> nll = @(b) (n/2)*log(2*pi*b(end)^2) + (1/(2*b(end)^2))*sum(err(b(1:end-1)).^2);
  4. Next, use some type of minimization routine (e.g., `fminsearch`, `fminunc`) to do MLE.
>> bmle = fminsearch(nll,b0)

bmle =

    4.9825
    9.9501
   20.0227
    0.9997

Unsurprisingly, the estimates are almost identical to what we obtained under OLS, which is exactly what we expect when the errors are Normally distributed, and which is why most people opt for just doing OLS unless there is some compelling reason to think the errors are non-Normal. The covariance matrix can be estimated by the outer product of the scores, which takes a particularly simple form in the Normal case: sigma^2 * inv(X'*X).

>> inv(X'*X/bmle(end)^2)   % bmle(end) is the standard deviation, so square it

ans =

    0.0012    0.0000   -0.0006
    0.0000    0.0012   -0.0006
   -0.0006   -0.0006    0.0007

Finally, the standard errors match what we obtained in the least squares case.

>> sqrt(diag(inv(X'*X/bmle(end)^2)))

ans =

    0.0347
    0.0345
    0.0266

EDIT: Sorry, I just realized my test data was cross-sectional rather than time-series data. I will fix this as soon as I get time, but the methodology for estimating the models stays the same.
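In the meantime, here is a minimal sketch of how the test data could be made autoregressive instead; the true coefficients, constant, and sample size below are illustrative assumptions, not values from the question. The `lscov`/`fminsearch` steps above then apply unchanged.

```matlab
% Sketch: AR(2) test data in place of the cross-sectional data above.
% The coefficients a and constant c are illustrative assumptions.
rng(0);
n = 10000;
a = [0.5; -0.3];      % hypothetical true AR(2) coefficients (stationary)
c = 1;                % constant term
u = randn(n,1);       % zero-mean, unit-variance Gaussian noise
y = zeros(n,1);
for t = 3:n
    y(t) = c + a(1)*y(t-1) + a(2)*y(t-2) + u(t);
end

% Lagged regressor matrix, then the same OLS/MLE machinery as above:
X = [y(2:n-1), y(1:n-2), ones(n-2,1)];
y = y(3:n);
[b, C] = lscov(X, y);
```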

  • Sorry for replying so late. I was applying your techniques in my implementation & unfortunately there are quite a few problems. This may be due to my improper understanding. Could you please have a look? – SKM May 14 '15 at 19:33
  • So, I fitted a time-series model to an AR & estimated the parameters using OLS with the command `aryule()`. This gave me 2 parameters. Now, I tried estimating in the presence of zero-mean white Gaussian noise while varying the noise variance, which is sigma^2 (in my question). Then how can I implement this covariance formula S for each time instant n: inv(sigma^2)*(A_1*A_1' - A_2*A_2')? I do not follow what the Toeplitz matrices A_1 and A_2 will be in this case & the role of the coefficient estimates. From your example, it is easy to see that the estimates are used in the denominator – SKM May 14 '15 at 19:40
  • aryule() is not a built-in function, and I'm not sure what function you are using. But I can guess from the name that it has something to do with the Yule-Walker equations. These just let you invert between the autocovariance function implied by an autoregressive model and the autoregressive model implied by an autocovariance function. So are you actually looking for the estimated autocovariance function? This is NOT the same thing as the covariance matrix corresponding to the estimated autoregressive parameters. Anyway, I would suggest the thread be closed because this is not programming... – transversality condition May 18 '15 at 05:05