Struggling with R.

My linear model in lm() contains some variables that are differenced via diff() and others that are not. The differenced variables are one observation shorter because of the differencing, so lm() fails with an error about differing variable lengths.

My idea for a solution is to define the variables as time series (which they are anyway, but R doesn't know that) and then tell lm() exactly which years to use (the data are yearly).

As I understand it, a time series loses its first observation when it is differenced, so when I use the ts() function I should set the starting year one year later for the differenced series.

More concretely: let's say I have imported the variables x and y. Then I run

dx<-diff(x, lag=1, differences=1) 

while y remains the same

lm(y~dx) 

will then produce the said error.
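
A minimal sketch that reproduces the problem, assuming made-up yearly data (the values generated below are purely illustrative and not from the actual dataset):

```r
# Made-up yearly data with 10 observations (illustrative values only)
set.seed(1)
x <- cumsum(rnorm(10))
y <- cumsum(rnorm(10))

# First difference of x: one observation shorter than x
dx <- diff(x, lag = 1, differences = 1)

length(y)
length(dx)

# lm() stops with an error because y and dx differ in length;
# tryCatch() captures the error message instead of aborting the script
res <- tryCatch(lm(y ~ dx), error = function(e) conditionMessage(e))
print(res)
```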

Let's say x and y both start in 1900. Then dx starts in 1901, so lm() has to start in 1901 for all variables. My idea, as stated above, is to explicitly make both variables time series

tsdx<-ts(dx, frequency=1, start=1901)
tsy<-ts(y, frequency=1, start=1900)

and then somehow tell lm() to start with the year 1901.

Is this a good way to deal with this problem? And how would I code the last step? Thanks a lot!

Pizza-Fan
  • lm(y~dx,na.omit) should work (?). And one thought, unless you're trying to replicate a DF-test or so, the response and predictor probably should both be differenced. – r.user.05apr May 23 '17 at 13:26
  • Thank you, it works! Probably should have seen that. About your last remark: Actually it is a large error-correction-model. I just wanted to focus on this problem and therefore invented this x-y model. – Pizza-Fan May 23 '17 at 14:51
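
For the last step asked about in the question (telling lm() to start in 1901), here is a minimal sketch, assuming made-up yearly data that start in 1900. The values and the names `tsy_1901`, `fit_ts` and `fit_plain` are illustrative only and not part of the original model. It trims `tsy` with `window()` so that both series cover the same years, and compares the result with simply dropping the first observation of `y`:

```r
# Made-up yearly data starting in 1900 (illustrative values only)
set.seed(1)
x <- cumsum(rnorm(10))
y <- cumsum(rnorm(10))

dx   <- diff(x, lag = 1, differences = 1)
tsdx <- ts(dx, frequency = 1, start = 1901)
tsy  <- ts(y,  frequency = 1, start = 1900)

# Restrict y to the years for which dx exists (1901 onwards)
tsy_1901 <- window(tsy, start = 1901)

# Both series now have the same length and the same time index
fit_ts <- lm(tsy_1901 ~ tsdx)

# Same regression without ts objects: drop the first observation of y
fit_plain <- lm(y[-1] ~ dx)

# Compare the estimated coefficients of the two fits
all.equal(unname(coef(fit_ts)), unname(coef(fit_plain)))
```

With only one differenced regressor, `y[-1]` is the shorter route. The `window()` version may be easier to keep track of when several series with different start years are involved.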
