1

I'm new to R. I've always used EViews, and now I have to run a regression.

Xt is stationary and Yt is not, so I need to difference Yt:

yt=Yt-Y(t-1)

Then the regression is

yt = a + bXt

How can I do the forecast in R and obtain the "real" values (the levels of Yt) rather than the differences?

In EViews it is enough to write d(Yt), but in R that seems impossible.

Sumit Bijvani
enk
  • possible duplicate of [Adding lagged variables to an lm model?](http://stackoverflow.com/questions/13096787/adding-lagged-variables-to-an-lm-model) – Andrie Oct 05 '13 at 16:51
  • my dependent variable (`yt`) is not lagged, it is differenced – enk Oct 05 '13 at 17:07
  • You can use the function `diff()` to calculate the differences. – Andrie Oct 05 '13 at 17:12
  • Sure, but when I do the forecast, the function gives me the differenced forecast value – enk Oct 05 '13 at 17:14

3 Answers

5

I think what you are looking for is cumsum, which is the inverse operation of diff (almost). You can recover a vector from its differences like this:

> z<-sample(20)
> dz<-diff(z)
> z0<-cumsum(c(z[1],dz))
> all(z==z0)
[1] TRUE

In your case, it would look something like this:

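dY <- diff(Y)                      # first differences of Y
dYhat <- fitted(lm(dY ~ X[-1]))    # fitted differences; X[-1] aligns X with dY
Yhat <- cumsum(c(Y[1], dYhat))     # rebuild the levels from the first observed value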
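
The same idea extends to out-of-sample forecasts: predict the future differences, then accumulate them starting from the last observed level. The following is only a sketch on simulated data. The data-generating process, the `dat` data frame, and the `futureX` values are assumptions made up for illustration, not part of the question.

```r
set.seed(1)                          # simulated data, purely for illustration
n <- 100
X <- rnorm(n)                        # stationary regressor
Y <- cumsum(0.5 + 2 * X + rnorm(n, sd = 0.1))  # nonstationary level series

# Regress the differenced response on X (dropping X[1] to align lengths).
dat <- data.frame(dY = diff(Y), X = X[-1])
fit <- lm(dY ~ X, data = dat)

# Hypothetical future values of X for the next 5 periods.
futureX <- c(0.3, -1.2, 0.8, 0.1, -0.4)
dYfuture <- predict(fit, newdata = data.frame(X = futureX))

# Undo the differencing: start from the last observed level and accumulate.
Yfuture <- Y[n] + cumsum(dYfuture)
```

Fitting with a `data` argument rather than `lm(dY ~ X[-1])` is a design choice here: it lets `predict()` match the regressor by name in `newdata`.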
mrip
4

First, if you think X affects y, then you should difference both variables. Differencing only one of them leads to a model where X affects the change in y rather than y itself.

You could do this with the arima() function (and differencing both variables):

fit <- arima(y, xreg=X, order=c(0,1,0))

Then predictions on the undifferenced scale are obtained using

fcast <- predict(fit, n.ahead=10, newxreg=futureX)

where futureX contains the next 10 values of X.

If you really wanted to model the effect of X on d(y), then create a new variable

sumX <- cumsum(X)

and use that instead of X in the fit (and similarly modify futureX).
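
As a rough illustration of the first approach, here is a sketch on simulated data. The simulated series and the `futureX` values are assumptions for the example only. `arima()` and `predict()` come from the base `stats` package.

```r
set.seed(42)                              # simulated data, for illustration only
n <- 120
X <- rnorm(n)                             # stationary regressor
y <- 1.5 * X + cumsum(rnorm(n, sd = 0.5)) # X affects the level; errors are a random walk

# Regression with ARIMA(0,1,0) errors: arima() differences both y and xreg.
fit <- arima(y, xreg = X, order = c(0, 1, 0))

# Hypothetical future values of X for the next 5 periods.
futureX <- c(0.2, -0.7, 1.1, 0.0, 0.4)
fcast <- predict(fit, n.ahead = 5, newxreg = futureX)

fcast$pred   # point forecasts of y itself (levels, not differences)
fcast$se     # standard errors of those forecasts
```

Note that `predict()` re-evaluates the original `xreg` argument, so `X` needs to still exist in the workspace when the forecast is made.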

Rob Hyndman
0

If you forecast the first differences up to h days ahead, you have an estimate of how much higher or lower the series will be in h days. You just need to add the last observed value to these differences to get the level:

dy_{t} := y_{t} - y_{t-1}

You forecast dy_{T+1}, ..., dy_{T+h}, and you know the last observation, y_{T}, so y_{T+h} = y_{T} + dy_{T+1} + ... + dy_{T+h}; for h = 1 this is simply y_{T+1} = dy_{T+1} + y_{T}. Just add the last observation to the running sum of the forecast differences and you get a level forecast.

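If you used the first difference of logarithms, dlny_{t} := lny_{t} - lny_{t-1}, you have a first-order Taylor approximation to a growth rate, so you'd be forecasting growth rates. In that case, the one-step level forecast would be approximately y_{T+1} = (1 + dlny_{T+1}) y_{T}; over h steps the growth factors compound, giving y_{T+h} = (1 + dlny_{T+1}) ... (1 + dlny_{T+h}) y_{T}.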

You don't need any complicated function to do those transformations. In fact, because y_{T} is a scalar, if you have a vector for dy_{T+h} or dlny_{T+h} for all h, R will allow you to write the expressions I wrote above using scalars and vectors and R will manage dimension problems correctly on its own.
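
A small sketch of this in R, with made-up numbers. `y_T`, `dy_hat`, and `dlny_hat` are hypothetical values, and the sketch assumes each vector holds one-step differences forecast for T+1, T+2, T+3, so the levels come from accumulating them.

```r
y_T <- 100                          # hypothetical last observed level
dy_hat <- c(1.2, -0.5, 0.8)         # hypothetical forecasts of dy_{T+1}, dy_{T+2}, dy_{T+3}

# Levels: the scalar y_T is recycled across the running sum of the differences.
y_hat <- y_T + cumsum(dy_hat)       # 101.2 100.7 101.5

# Log differences: hypothetical forecasts of dlny_{T+1}, ..., dlny_{T+3}.
dlny_hat <- c(0.01, 0.02, -0.005)
y_hat_exact  <- y_T * exp(cumsum(dlny_hat))    # exact inversion of the log differences
y_hat_approx <- y_T * cumprod(1 + dlny_hat)    # first-order (growth-rate) approximation
```

For small growth rates the two log-difference versions are close; `exp()` undoes the log transform exactly, while `1 + dlny` is the Taylor approximation mentioned above.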

Kaustubh Khare
Stéphane