According to this example, orthogonalization by projections is similar to taking the residuals of linear regressions on the previous columns. However, when I run their example I do not get the expected result. What is going on here? Why don't the last three lines of code return TRUE?

library(matlib)
data(class)

class$male <- as.numeric(class$sex=="M")   
X <- as.matrix(class[,c(3,4,2,5)])

Z <- cbind(X[,1], 0, 0, 0)
Z[,2] <- X[,2] - Proj(X[,2], Z[,1])
Z[,3] <- X[,3] - Proj(X[,3], Z[,1]) - Proj(X[,3], Z[,2]) 
Z[,4] <- X[,4] - Proj(X[,4], Z[,1]) - Proj(X[,4], Z[,2]) - Proj(X[,4], Z[,3])

z2 <- residuals(lm(X[,2] ~ X[,1]), type="response")
z3 <- residuals(lm(X[,3] ~ X[,1:2]), type="response")
z4 <- residuals(lm(X[,4] ~ X[,1:3]), type="response")

I expected Z[,2], Z[,3] and Z[,4] to equal z2, z3 and z4 respectively, but that is not the case:

> all(Z[,2]==z2)
[1] FALSE
> all(Z[,3]==z3)
[1] FALSE
> all(Z[,4]==z4)
[1] FALSE
A.P.

1 Answer

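This happens because lm includes an intercept by default, so its residuals are orthogonal to a constant column as well as to the predictors, whereas the projections above never involve a constant column. Remove the intercept (0 + ... in the formula) and the two agree: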

z2 <- residuals(lm(X[,2] ~ 0 + X[,1]), type="response")
z3 <- residuals(lm(X[,3] ~ 0 + X[,1:2]), type="response")
z4 <- residuals(lm(X[,4] ~ 0 + X[,1:3]), type="response")

all.equal(z2, Z[,2])
# TRUE
all.equal(z3, Z[,3])
# TRUE
all.equal(z4, Z[,4])
# TRUE
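
To see why the intercept matters, here is a minimal base-R sketch on made-up data. It does not use matlib: the helper proj is a hand-written stand-in for what Proj is assumed to compute (the projection of one vector onto another), and x1, x2 are hypothetical toy vectors, not columns of the class data. It suggests that the default lm residuals correspond to orthogonalizing against a column of ones first.

```r
# Hypothetical toy data (not the matlib `class` data set)
set.seed(1)
x1 <- rnorm(10, mean = 5)
x2 <- rnorm(10, mean = 3)

# Projection of y onto a, written out by hand (assumed equivalent to Proj(y, a))
proj <- function(y, a) a * sum(y * a) / sum(a * a)

# Projection residual of x2 on x1, with no constant column involved
gs_resid <- x2 - proj(x2, x1)

# Regression through the origin versus the default fit with an intercept
r_no_int <- unname(residuals(lm(x2 ~ 0 + x1)))
r_int    <- unname(residuals(lm(x2 ~ x1)))

# Orthogonalize against a column of ones first, then against centered x1
ones <- rep(1, length(x1))
z1 <- x1 - proj(x1, ones)
gs_with_ones <- x2 - proj(x2, ones) - proj(x2, z1)

same_no_intercept <- isTRUE(all.equal(gs_resid, r_no_int))     # no-intercept fit vs projection
same_with_ones    <- isTRUE(all.equal(gs_with_ones, r_int))    # default fit vs projection incl. ones
same_default      <- isTRUE(all.equal(gs_resid, r_int))        # default fit vs plain projection

print(c(same_no_intercept, same_with_ones, same_default))
```

The comparisons use all.equal rather than ==, since exact equality of floating-point results can fail through rounding alone even when the two computations agree mathematically.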
Stéphane Laurent