Enhanced likelihood ratio for comparison of models obtained with the lm() function

Question

I am looking for a function in R to compare models created with the lm() function, with the following statistics:

  Models df    AIC    BIC logLik    Test L.ratio p.value
1 model1  6 161.65 170.44 -74.83                        
2 model2  8 159.39 171.11 -71.69 1 vs. 2    6.26  0.0437
3 model3  5 159.77 167.10 -74.88 2 vs. 3    6.38  0.0945
4 model4  5 162.26 169.59 -76.13 3 vs. 4    2.49       0

Is there a function in R that returns the comparison table shown, specifically for models created with lm()? I obtained the table of comparisons with the following code in R:

dat <- read.csv2("https://raw.githubusercontent.com/saquicela/mineria/main/vapor_data.csv",sep=";",dec = ".", stringsAsFactors = TRUE)

model1 <- lm(y ~ x1 + x2 + x3 + x4, data = dat)
model2 <- lm(y ~ poly(x1,2) + x2 + poly(x3,2) + x4, data = dat)
model3 <- lm(y ~ x2 + x3 + x4, data = dat)
model4 <- lm(y ~ x1 + x2 + x4 , data = dat)

aic <- AIC(model1,model2,model3,model4)
bic <- BIC(model1,model2,model3,model4)

row_names <- rownames(aic)

library(lmtest)
likelihood_tests <- lrtest(model1,model2,model3,model4)

test_names <- c("",paste(seq(1,length(row_names)-1,1),
                             "vs.",
                             seq(2,length(row_names),1)))

models_comparison <- data.frame("Models" = row_names,
                                "df" = aic$df,
                                "AIC" = round(aic$AIC,2),
                                "BIC" = round(bic$BIC,2),
                                "logLik" = round(likelihood_tests$LogLik,2),
                                "Test" = test_names,
                                "L.ratio" = round(likelihood_tests$Chisq,2),
                                "p-value" = round(likelihood_tests$`Pr(>Chisq)`,4))

models_comparison[is.na(models_comparison)] <- ""
models_comparison

Thank you very much for your help.

`anova(model1, model2, model3, model4)` is not exactly what you are looking for but can it help? — Rui Barradas, Feb 15 '23 at 05:35

jay.sf · Answer 1 · 2023-02-15T06:43:38.923

Actually you just wrote the function. We could use formatC instead of round and add significance stars, also an invisible return, that is not rounded.

my_LR <- \(..., quiet=FALSE) {
  aic <- AIC(...)
  bic <- BIC(...)
  row_names <- sapply(substitute(...()), deparse)
  likelihood_tests <- lmtest::lrtest(...)
  test_names <- c("", paste(seq(1, length(row_names)-1, 1),
                            "vs.",
                            seq(2, length(row_names), 1)))
  models_comparison <- out <- data.frame(Models=row_names,
                                         df=aic$df,
                                         AIC=aic$AIC,
                                         BIC=bic$BIC,
                                         logLik=likelihood_tests$LogLik,
                                         Test=test_names,
                                         L.ratio=likelihood_tests$Chisq,
                                         p.value=likelihood_tests$`Pr(>Chisq)`
  )
  symp <- symnum(out$p.value[-1], corr=FALSE,
                 cutpoints=c(0, .001, .01, .05, .1, 1),
                 symbols=c("***", "** ", "*  ", ".  ", "   "))
  out[, c(3:5, 7:8)] <- mapply(\(x, y) formatC(x, digits=y, format='f'),
                               out[, c(3:5, 7:8)], c(2, 2, 2, 2, 4))
  out[1, 7:8] <- ''
  if (!quiet) print(cbind(out, ' '=c(' ', symp)))
  return(invisible(models_comparison))
}

Usage

res <- my_LR(model1, model2, model3, model4)
#   Models df    AIC    BIC logLik    Test L.ratio p.value    
# 1 model1  6 161.65 170.44 -74.83                            
# 2 model2  8 159.39 171.11 -71.69 1 vs. 2    6.26  0.0437 *  
# 3 model3  5 159.77 167.10 -74.88 2 vs. 3    6.38  0.0945 .  
# 4 model4  5 162.26 169.59 -76.13 3 vs. 4    2.49  0.0000 ***

res
#   Models df      AIC      BIC    logLik    Test  L.ratio    p.value
# 1 model1  6 161.6505 170.4449 -74.82524               NA         NA
# 2 model2  8 159.3886 171.1145 -71.69430 1 vs. 2 6.261884 0.04367663
# 3 model3  5 159.7684 167.0971 -74.88422 2 vs. 3 6.379839 0.09452374
# 4 model4  5 162.2623 169.5909 -76.13113 3 vs. 4 2.493821 0.00000000

If we don't want to print anything just return we can use quiet=FALSE.

Enhanced likelihood ratio for comparison of models obtained with the lm() function

1 Answers1

Usage