R: Getting AIC/BIC/Likelihood from GLMNet

Question

I'm wondering if I can get AIC and BIC from GLMNet. I've found glmnet.cr that seems to be able to do it but my response is time, not ordinal. I could calculate it myself from likelihood but glmnet doesn't return that either.

Tangential: can I actually return the l1norm? I feel like it should just be

fit$norm

but it doesn't seem to be. (I know it says don't pull the numbers out but I'm actually not using the R)

Thanks in advance for the help.

The short answer is no but there might be a way around it depending what you're trying to do. I presume you are using the `glmnet` R package, but we need to know more about your model. For example, does it have a Gaussian response? There are some answers on Cross-Validated that might help. — Peter Ellis, Jan 22 '17 at 04:00

score 8 · Accepted Answer · answered May 02 '17 at 20:45

8

I was struggling a lot with a way how to calculate AIC and BIC for glmnet models. However, after quite a lot of searching, I found on the third page of google results the answer. It can be found here. I am posting it here for future readers as I believe I cannot be the only one.

In the end, I implemented the AIC and BIC in the following way:

fit <- glmnet(x, y, family = "multinomial") 

tLL <- fit$nulldev - deviance(fit)
k <- fit$df
n <- fit$nobs
AICc <- -tLL+2*k+2*k*(k+1)/(n-k-1)
AICc

BIC<-log(n)*k - tLL
BIC

answered May 02 '17 at 20:45

johnnyheineken

543
7
20

3

So I ended up using this as well, but from what I read (and calculations other places) it will not give you the exact same value for 2L (that you would normally use). But since fit$nulldev is a constant, it can be used as a replacement. Just a caveat in case you're wondering why your AICc values aren't lining up with other calculations. – Faller May 11 '17 at 15:27

Ryan SY Kwan · Answer 2 · 2020-05-18T08:35:46.850

Based on @merten's answer, I fixed the formula. Now it matches the built-in function.

Summary,

the original log-likelihood (tLL) was bias.
AIC and AICc from the built-in functions were added for comparison.

BICAICglm=function(fit){
  #tLL <- fit$null.deviance - deviance(fit)  
  tLL <- -deviance(fit) # 2*log-likelihood
  k <- dim(model.matrix(fit))[2]
  n <- nobs(fit)
  AICc <- -tLL+2*k+2*k*(k+1)/(n-k-1)
  AIC_ <- -tLL+2*k

  BIC<-log(n)*k - tLL
  res=c(AIC_, BIC, AICc)
  names(res)=c("AIC", "BIC", "AICc")
  return(res)
}
#some data simulation to test
set.seed(123)
x=rnorm(20)
set.seed(231)
y=as.numeric(x+rnorm(20)<0)

#the model
glm1=glm(y~x, family="binomial")

Results

BICAICglm(glm1)
     AIC      BIC     AICc 
21.91018 23.90165 22.61607

Answers according to built-in functions

AIC(glm1)
[1] 21.91018
BIC(glm1)
[1] 23.90165

AICc correction for small sample sizes

AIC(glm1, k=2*nobs(glm1)/(nobs(glm1)-1-glm1$rank))
[1] 22.61607

score 0 · Answer 3 · answered May 02 '19 at 08:17

Unfortunatly I cannot reproduce the BIC using this formula for from a "normal" glm model (for which the built-in function BIC works as a correct reference)

I changed the code proposed above to make it work with a glm object:

    #BIC function for glm according to stackoverflow
    BICAICglm=function(fit){
      tLL <- fit$null.deviance - deviance(fit)
      k <- dim(model.matrix(fit))[2]
      n <- nobs(fit)
      AICc <- -tLL+2*k+2*k*(k+1)/(n-k-1)
      AICc

      BIC<-log(n)*k - tLL
      res=c(AICc, BIC)
      names(res)=c("AICc", "BIC")
      return(res)
    }

    #some data simulation to test
    set.seed(123)
    x=rnorm(20)
    set.seed(231)
    y=as.numeric(x+rnorm(20)<0)

    #the model
    glm1=glm(y~x, family="binomial")

now when we apply the standard BIC() function we get the "true" BIC of the model, that we then can compare with the function proposed here.

    BIC(glm1)
    [1] 23.68755

and the new one:

    BICAICglm(glm1)
    AICc       BIC 
    -4.518496 -3.232914

So the way of calculating BIC and AICc this way is not quite right.

R: Getting AIC/BIC/Likelihood from GLMNet

3 Answers3

Linked