
I can't quite figure out why I'm having trouble calculating the Bayesian Information Criterion correctly, and was hoping someone could point me in the right direction.

I'm trying to calculate the BIC manually for plm objects, which don't seem to have an established BIC routine associated with them. I took the formula from the Wikipedia page, which gives the BIC in terms of the residual sum of squares rather than the log likelihood.

y<-rnorm(100)
x<-rnorm(100)
m.test<-lm(y ~ x) 

n<-100 
rss<-sum(m.test$residuals^2) 
k<-3 
bic.mine<-n*log(rss/n)+k*log(n) #formula from wikipedia
bic.stats<-BIC(m.test) #using stats package
abs(bic.mine-bic.stats) #mine is off!

Running the code a bunch of times, I realize that the difference between the BIC I obtain and the BIC obtained from the stats package is constant, so my suspicion is that I'm missing some kind of scaling factor. Is that right? Thanks in advance.

Edit: Thanks for all the comments. I tried to implement the suggestions and post an answer, but I'm still off by a constant. Revised code below.

y<-rnorm(100)
x<-rnorm(100)
m.test<-lm(y ~ x) 

n<-100
res<-m.test$residuals
rss<-sum(res^2) 
k<-3; df<-n-k; w<-rep(1,n) #params, dfs, weights
ll<-0.5 * (sum(log(w)) - n *
             (log(2 * pi) + 1 - log(n) + log(sum(w * res^2))))
ll.stats<-logLik(m.test)
abs(ll.stats-ll)==0 #same, prob is not here

bic.mine<-n*log(rss/n)+k*log(n) #formula from wikipedia
bic.exact<- -2 * ll + log(n) * df #suggestions from comments
bic.stats<-BIC(m.test) #using stats package
abs(bic.mine-bic.stats) #mine was off
abs(bic.exact-bic.stats) #this is still off, though
  • Heh, I like this question :) I think the most straightforward way is to look at the source code of `BIC`. With `methods(BIC)` you can see that there are two methods. I presume `BIC.default` gets called, but if you type `BIC.default` you'll get an error. Use `stats:::BIC.default` instead to see the source code. Here you can see exactly how the BIC is calculated and compare with your own script. I hope this helps at least a little bit. – slamballais Feb 01 '16 at 13:37
  • @Laterow in general use `getAnywhere("BIC.default")` in case you don't know the package – MichaelChirico Feb 01 '16 at 13:51
  • from `?BIC`: "BIC is defined as `AIC(object, ..., k = log(nobs(object)))`"; as to AIC, a citation is given for Sakamoto, Ishiguro and Kitagawa's _Akaike Information Criterion Statistics_ – MichaelChirico Feb 01 '16 at 13:54
  • And in `BIC.default` we find the formula: `-2 * as.numeric(lls) + log(nos) * attr(lls, "df")`, with `lls` the log likelihood of the object, `nos` the number of observations, and `df` the degrees of freedom. – MichaelChirico Feb 01 '16 at 13:56
  • again using `getAnywhere` to find `logLik.lm`, we see `lls` is given by `0.5 * (sum(log(w)) - N * (log(2 * pi) + 1 - log(N) + log(sum(w * res^2))))`; given you've done unweighted OLS, `w <- rep(1, N)`. `res` is the vector of residuals. – MichaelChirico Feb 01 '16 at 14:01
  • The formula you are using does not give the exact value of the BIC, but only "up to an additive constant, which depends only on n and not on the model" (see the wikipedia page you linked to). The exact value of the BIC is given by -2*logLmax + 3 * log(100), and the maximum log likelihood value for your problem is given by the formula @MichaelChirico provided in his comments (see https://en.wikipedia.org/wiki/Maximum_likelihood for more details) – konvas Feb 01 '16 at 14:25
  • someone (possibly the OP) should assemble an answer from all of these comments ... – Ben Bolker Feb 01 '16 at 18:28
  • Thanks for the help. I did my best, but couldn't make an answer work.. Can anyone spot what's off? – daanoo Feb 02 '16 at 12:31

1 Answer


Thanks to help from the commenters, here's the answer:

y<-rnorm(100)
x<-rnorm(100)
m<-lm(y ~ x) 

To get the BIC or AIC, you first need the associated log likelihood.

Calculating the log likelihood requires a vector of residuals, the number of observations in the data, and a vector of weights (if applicable)

res<-m$residuals
n<-nrow(m$model)    
w<-rep(1,n) #not applicable

ll<-0.5 * (sum(log(w)) - n * (log(2 * pi) + 1 - log(n) + log(sum(w * res^2))))
ll-logLik(m)==0 #TRUE
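As a quick sanity check (a sketch, continuing from the objects defined above), this closed form is just the sum of Gaussian log densities evaluated at the maximum likelihood estimate of the error variance, rss/n:

```r
# Sketch: the closed-form ll equals the Gaussian log likelihood
# evaluated at the ML variance estimate sigma2 = rss/n.
sigma2 <- sum(res^2) / n
ll.check <- sum(dnorm(res, mean = 0, sd = sqrt(sigma2), log = TRUE))
all.equal(ll.check, ll)  # TRUE
```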

Calculating the BIC or AIC requires `ll`, plus the degrees of freedom associated with the log-likelihood calculation, which equals the number of estimated parameters plus 1 (the extra degree of freedom accounts for the error variance).

k.original<-length(m$coefficients)
df.ll<-k.original+1 
bic<- -2 * ll + log(n) * df.ll
bic-BIC(m)==0 #TRUE
aic<- -2 * ll + 2 * df.ll
aic-AIC(m)==0 #TRUE
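As an aside, this also shows why the RSS-based formula from the question was off by a constant: the Wikipedia formula drops an additive term that depends only on n, not on the model. Continuing from the objects above, the offset works out to n*(1 + log(2*pi)):

```r
# The RSS-based formula omits an additive constant n*(1 + log(2*pi)),
# which depends only on n and not on the fitted model.
rss <- sum(res^2)
bic.wiki <- n * log(rss / n) + df.ll * log(n)
all.equal(bic - bic.wiki, n * (1 + log(2 * pi)))  # TRUE
```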
  • `fit$rank` does not exist for `plm` methods. I suggest using `length(m$coefficients)` instead. – altabq Jul 03 '19 at 13:36
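Following the comment above, the whole recipe can be wrapped in a small helper. This is a sketch, assuming an unweighted fit whose object exposes `$residuals` and `$coefficients` the way `lm` (and `plm`) fitted objects do:

```r
# Sketch of a manual BIC, assuming unweighted estimation and a fitted
# object exposing $residuals and $coefficients (as lm/plm objects do).
manual_bic <- function(fit) {
  res <- fit$residuals
  n   <- length(res)
  # Gaussian log likelihood at the ML variance estimate rss/n
  ll  <- -0.5 * n * (log(2 * pi) + 1 - log(n) + log(sum(res^2)))
  df  <- length(fit$coefficients) + 1  # coefficients plus error variance
  -2 * ll + log(n) * df
}

m <- lm(rnorm(100) ~ rnorm(100))
all.equal(manual_bic(m), BIC(m))  # TRUE
```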