Edit
I found some better information about this issue.
You can get NA
coefficients if the predictors are perfectly correlated. This seems to be an unusual case since we only have one predictor. So in this case, dt
appears to be linearly related with the intercept.
We can find linearly dependent variables using alias
. See https://stats.stackexchange.com/questions/112442/what-are-aliased-coefficients
In the first example
test<-data.frame(dt=c(36996616, 36996620, 36996623, 36996626), value=c(1,2,3,4))
fit1 <- lm(value ~ dt, test)
alias(fit1)
Model :
value ~ dt
No linearly dependent terms. But in the second example
test$dt <- test$dt + 1
fit2 <- lm(value ~ dt, test)
alias(fit2)
Model :
value ~ dt
Complete :
[,1]
dt 147986489/4
Which appears to show a linearly dependent relationship between dt
and the intercept
.
Additional information on how lm
deals with a reduced-rank model: https://stat.ethz.ch/pipermail/r-help/2002-February/018512.html.
lm
does not invert X'X https://stat.ethz.ch/pipermail/r-help/2008-January/152456.html, but I still think the below is helpful to show the singularity of X'X.
x <- matrix(c(rep(1, 4), test$dt), ncol=2)
y <- test$value
b <- solve(t(x) %*% x) %*% t(x) %*% y
Error in solve.default(t(x) %*% x) :
system is computationally singular: reciprocal condition number = 7.35654e-30
The default tol
in lm.fit
is 1e-7, which is the tolerance for computing linear dependencies in qr
decomposition.
qr(t(x) %*% x)$rank
[1] 1
If you reduce this, you will get a parameter estimate for dt
.
# decrease tol in qr
qr(t(x) %*% x, tol = 1e-31)$rank
[1] 2
# and in lm
lm(value~dt, test, tol=1e-31)$coefficients
(Intercept) dt
-1.114966e+07 3.013699e-01
See https://stats.stackexchange.com/questions/86001/simple-linear-regression-fit-manually-via-matrix-equations-does-not-match-lm-o for details on the matrix algebra in simple linear regression.