1

I was playing around with doing a manual calculation for the OLS estimators using linear algebra in R and I got a different answer to R's inbuilt regression function lm(). Would anyone be able to tell me why there is a difference? Is R not performing OLS?

> x1<-rbind(1,2,3,4,5)
> x2<-rbind(3,65,7,2,1)
> x3<-rbind(34,7,23,2,4)
> x4<-rbind(25,50,70,90,110)
> y<-rbind(1,2,3,4,5)
> X<-as.matrix(cbind(x1,x2,x3,x4))
> Y<-as.matrix(cbind(y))
> 
> beta.hat<-solve(t(X)%*%X)%*%t(X)%*%Y
> r.regression<-lm(Y~0+X)
> 
> beta.hat
              [,1]
[1,]  1.000000e+00
[2,] -2.595146e-15
[3,]  8.174017e-15
[4,] -2.309264e-14
> r.regression

Call:
lm(formula = Y ~ 0 + X)

Coefficients:
        X1          X2          X3          X4  
 1.000e+00   3.331e-18   4.152e-17  -6.783e-17  
IRTFM
  • 258,963
  • 21
  • 364
  • 487
1212__Hello
  • 545
  • 1
  • 6
  • 13
  • Doesn't it worry you a bit that all of your coefficients with the exception of the one for X1 are effectively zero? – IRTFM Dec 11 '13 at 07:39
  • The data is just made up so I could test the code so I'm not too concerned with the value of the coefficients themselves, I'm more interested as to why there is a difference between the two methods. – 1212__Hello Dec 11 '13 at 07:42
  • 2
    They both gave you effectively the same answer, although the lm() method had higher accuracy than your method. Perhaps you should read the code and then read up on comparisons of numerical accuracy of methods of matrix inversion. This is really off-topic here. – IRTFM Dec 11 '13 at 07:47

1 Answers1

0

R is performing OLS alright, the problem is in the example you provide. Here is a demonstration, building on what @DWin has commented.

set.seed(1234)
x1 <- rnorm(5,mean=3)
x2 <- rnorm(5,mean=1,sd=5)
x3 <- rnorm(5,mean=7,sd=1)
x4 <- rnorm(5,mean=1,sd=2)

X<-as.matrix(cbind(x1,x2,x3,x4))
Y<-as.matrix(cbind(y))

(beta.hat<-solve(t(X)%*%X)%*%t(X)%*%Y)
lm(Y~X+0)

As you can see, the coefficients are exactly the same, and your code is repeated except for the more suitable data.

P.S. If this is classified as an off-topic question, feel free to delete the answer as well. My intention was just to illustrate the issue with some code, which doesn't fit in a commentary.

Maxim.K
  • 4,120
  • 1
  • 26
  • 43