0

I am afraid I am stuck with an estimation problem.

I have two variables, X and Y. Y is explained by the weighted sum of n lagged values of X. My aim is to estimate the two parameters c(alpha0,alpha1) in:

Yt = Sum from j=1 to n of ( ( alpha0 + alpha1 * j ) * Xt-j )

enter image description here

Where Xt-j denotes the jth lag of X.

I came up with this approach because I thought it would be a good idea to estimate the slope of the weights rather than estimating one parameter for each lag of X added (I intent to set n very large).

To the model noise ut is added, which is assumed to be normally distributed with mean zero and standard deviation sigma.

Assuming I would like to set n=510, then I need the original series and 510 lagged series. In order to avoid any NAs in the series I transform the original data to "data_chopped", containing only the observations after the first 510 observations have been dropped, and the matrix "data_lagged" in which each column represents a lagged series:

library(stats)
data<-arima.sim(n=10000,list(ar=0.15,ma=0.1),mean=0.5)

data_chopped<-data[511:length(data)]

data_lagged<-matrix(nrow=length(data_chopped),ncol=510)
for (i in 1:510){
data_lagged[,i]<-head(data,-i)[(511-i):length(head(data,-i))]
}

#Check result:
cbind(data_chopped,data_lagged[,1:3])
#data_lagged[,1] is the first lag of the original data, data_lagged[,2] is the second lag, and so on. No NAs whatsoever to deal with 

To demonstrate the "working order" of my log-likelihood function and the generated series I first would like to fit an AR(3) model:

logl<-function(sigma,alpha,beta,gamma){
-sum(log((1/(sqrt(2*pi)*sigma)) * exp(-((
data_chopped
-alpha*data_lagged[,1]
-beta*data_lagged[,2]
-gamma*data_lagged[,3]
)^2)/(2*sigma^2))))
}

library(stats4)
mle(logl,start=list(sigma=1,alpha=0,beta=0,gamma=0),method="L-BFGS-B")

When I now try to estimate my model in the same way it just does not work. I never got a loop within a log-likelihood function to work, this is why I just wrote out the above model. So,

Yt = Sum from j=1 to n of ( ( alpha0 + alpha1 * j ) * Xt-j )

= (alpha+beta*1)*Xt-1 + (alpha+beta*2)*Xt-2 + (alpha+beta*3)*Xt-3 + ... + (alpha+beta*510)*Xt-510

logl<-function(sigma,alpha,beta){
-sum(log((1/(sqrt(2*pi)*sigma)) * exp(-((
data_chopped
-(alpha + beta*1)*data_lagged[,1]
-(alpha + beta*2)*data_lagged[,2]
-(alpha + beta*3)*data_lagged[,3]
-(alpha + beta*4)*data_lagged[,4]
-(alpha + beta*5)*data_lagged[,5]
...
-(alpha + beta*510)*data_lagged[,510]
)^2)/(2*sigma^2))))
}

library(stats4)
mle(logl,start=list(sigma=1,alpha=0.5,beta=0),method="L-BFGS-B")
Error in optim(start, f, method = method, hessian = TRUE, ...) : 
L-BFGS-B needs finite values of 'fn'

I do not get the error if I try just a very few lines:

logl<-function(sigma,alpha,beta){
-sum(log((1/(sqrt(2*pi)*sigma)) * exp(-((
data_chopped
-(alpha + beta*1)*data_lagged[,1]
-(alpha + beta*2)*data_lagged[,2]
-(alpha + beta*3)*data_lagged[,3]
-(alpha + beta*4)*data_lagged[,4]
-(alpha + beta*5)*data_lagged[,5]
)^2)/(2*sigma^2))))
}

library(stats4)
mle(logl,start=list(sigma=1,alpha=0.5,beta=0),method="L-BFGS-B")
Call:
mle(minuslogl = logl, start = list(sigma = 1, alpha = 0.5, beta = 0), 
method = "L-BFGS-B")

Coefficients:
 sigma      alpha       beta 
1.07797708  0.26178848 -0.04378526 

Can someone please help me out on this?

Chris437
  • 69
  • 1
  • 7
  • seems like you need to fit some `ARIMA` model or rather `MA` model... – agstudy Apr 01 '13 at 07:50
  • You probably need some noise in your model, `y[t] = ... + epsilon[t]`. Your log-likelihood should use the data, `y` and, if the noise is Gaussian, it should contain a sum of squares. Avoid the `lag` function: (unless you data are `xts` or `zoo` time series), it never does what you want. In your function, `logl` is `NULL`: you compute something inside the loop, but you do not do anything with the result, so it is discarded. – Vincent Zoonekynd Apr 01 '13 at 08:48
  • Hi Vincent! Thank you lot for your reply. I actually do deal with ts data here, so I guess I may keep the lags? So you asked me to include the residuals into the function. Is the SSR then actually what is to be minimized by optim? – Chris437 Apr 01 '13 at 16:05
  • The `lag` function is very dangerous: if your data is a `zoo`/`xts` time series, and if it stays in this format throughout the computations, that is fine -- but otherwise (if the data become a vector or a `ts` time series at some point) it will not work and you will spend a lot of time trying to understand why. – Vincent Zoonekynd Apr 01 '13 at 22:56
  • Yes. The likelihood of an observation is the density function evaluated on the residual of this observation, and the likelihood of the data is the product of those likelihoods. If the noise is Gaussian, the density function is of the form `exp(-epsilon^2/sigma^2)`: (minus) the log-likelihood is, up to a constant factor, the sum of the squared residuals. So that is what you should be minimizing. – Vincent Zoonekynd Apr 01 '13 at 22:57
  • Hi Vincent, Thank you a lot for your answer. Sorry, I thought by `xts` you meant usual ts type of data. Thank you for your hint that there is no way that a function with `lag` would not work. How would you suggest to avoid the `lag` function here? Thanks! – Chris437 Apr 02 '13 at 00:11
  • I have now added some noise to my model and found a way on how not to use the "lag"-function. Still I get an optimization error. Do you maybe have any idea on how to solve this? Any help is highly appreciated. – Chris437 Sep 10 '13 at 14:41

1 Answers1

1

I will echo the advice not to use the lag function. Its authors and early users may know what it does but the rest of us have had bad experience with it not delivering to expectations. I find the embed function useful for what I thought the lag function ought to do.

> embed(1:8, 3)
     [,1] [,2] [,3]
[1,]    3    2    1
[2,]    4    3    2
[3,]    5    4    3
[4,]    6    5    4
[5,]    7    6    5
[6,]    8    7    6

Let's say you want to look back at 6 times before the current time and do calculations by row. You need to accept and plan for the fact that it is now ambiguous what should be done with periods 1-6 since they will have incomplete data. I cannot figure out from your formula how one can estimate only two parameters when you have more than two lag periods, unless you apply some specific shape to the wear-off phenomenon .... linear perhaps ... you didn't say.

dfrm <- data.frame(y=rnorm(20), x=rnorm(20) )
dfrm$embx<- matrix(NA, ncol=7, nrow=20)
dfrm$embx[7:20, ] <- embed(dfrm$x, 7) * rep( (7:1)/7, each=14)
lm( y[7:20] ~ embx[7:20,], data=dfrm )

Call:
lm(formula = y[7:20] ~ embx[7:20, ], data = dfrm)

Coefficients:
  (Intercept)  embx[7:20, ]1  embx[7:20, ]2  embx[7:20, ]3  embx[7:20, ]4  embx[7:20, ]5  
       0.3065        -0.2371         0.9504         0.8601         0.5484         0.6621  
embx[7:20, ]6  embx[7:20, ]7  
       1.1619         4.8338  

This uses "full-strength" x_t and factors down to 1/7th strength for x_(t-7). That's a bit different than what your formula expressed since it did not have an x_t covariate but you should be able to construct a "slope" from the estimated coefficients.

IRTFM
  • 258,963
  • 21
  • 364
  • 487