I am trying to understand the results of AIC/BIC in R. For some reason, R adds 1 to the number of parameters to be estimated. Hence R uses a different formula than 2 * p - 2 * logLik (in the Gaussian case, logLik is a function of the residual sum of squares). In fact, it uses 2 * (p + 1) - 2 * logLik.
After some research, I found that the problem is related to stats:::logLik.lm().
> stats:::logLik.lm ## truncated R function body
## ...
## attr(val, "df") <- p + 1
## ...
As a real example (using R's built-in dataset trees), consider:
x <- lm(Height ~ Girth, trees) ## a model with 2 parameters
logLik(x)
## 'log Lik.' -96.01663 (df=3)
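To make the discrepancy concrete, here is a minimal sketch that recomputes the numbers by hand. The names fit, ll_manual, aic_p and aic_df are my own, and the closed-form Gaussian log-likelihood assumes an unweighted fit with the variance estimated as RSS / n:

```r
# Refit the same model as above (the name `fit` is only for this sketch).
fit <- lm(Height ~ Girth, data = trees)

n   <- nobs(fit)                 # number of observations
rss <- sum(residuals(fit)^2)     # residual sum of squares
p   <- length(coef(fit))         # 2 coefficients: intercept and slope

# Gaussian log-likelihood written in terms of RSS (unweighted case,
# variance plugged in as RSS / n).
ll_manual <- -n / 2 * (log(2 * pi) + log(rss / n) + 1)

ll <- logLik(fit)
df <- attr(ll, "df")             # what R reports as df

aic_p  <- 2 * p  - 2 * as.numeric(ll)   # the formula I expected
aic_df <- 2 * df - 2 * as.numeric(ll)   # the formula R appears to use

print(c(p = p, df = df))
print(c(manual = ll_manual, logLik = as.numeric(ll)))
print(c(aic_p = aic_p, aic_df = aic_df, AIC = AIC(fit)))
```

If this runs as I expect, the hand-computed log-likelihood matches logLik(), and AIC() agrees with the df-based value, which is exactly 2 larger than the one based on p.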
This is really puzzling. Does anyone know why?
Edit1: glm examples by @crayfish44
model.g <- glm(dist ~ speed, cars, family = gaussian)
logLik(model.g) # df=3
model.p <- glm(dist ~ speed, cars, family = poisson)
logLik(model.p) # df=2
model.G <- glm(dist ~ speed, cars, family = Gamma)
logLik(model.G) # df=3
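The same comparison can be tabulated across the three families. This is only a sketch: the names fits, n_coef and df are mine, and it simply sets the number of estimated coefficients next to the df attribute that logLik() returns:

```r
# Fit the three glm examples and collect them in a named list.
fits <- list(
  gaussian = glm(dist ~ speed, data = cars, family = gaussian),
  poisson  = glm(dist ~ speed, data = cars, family = poisson),
  Gamma    = glm(dist ~ speed, data = cars, family = Gamma)
)

# Number of regression coefficients in each fit.
n_coef <- sapply(fits, function(m) length(coef(m)))

# The df attribute attached to each logLik object.
df <- sapply(fits, function(m) attr(logLik(m), "df"))

# Side-by-side view; `extra` is how many parameters are counted
# on top of the coefficients.
print(data.frame(n_coef = n_coef, df = df, extra = df - n_coef))
```

Consistent with the comments above, the Gaussian and Gamma fits should show one extra parameter, while the Poisson fit shows none.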
Edit2: methods of logLik
> methods(logLik)
[1] logLik.Arima* logLik.glm* logLik.lm* logLik.logLik* logLik.nls*