
I am trying to fit a batch of candidate models and compare their AIC and R-squared values to pick the best one. I am having trouble moving the results between lists and data frames the way I want.

The data frame I am going to model:

set.seed(1)
df <- data.frame(response = runif(50, min = 50, max = 100),
                 var1 = sample(1:20, 50, replace = TRUE),
                 var2 = sample(40:60, 50, replace = TRUE))

The list of formulas to test:

formulas  <- list( response ~ NULL,
                   response ~ var1,
                   response ~ var2,
                   response ~ var1 + var2,
                   response ~ var1 * var2)

What I want is a loop that fits each of these formulas, collects the formula, AIC, and R-squared into a table, and lets me sort that table to find the best model. The problem is that I can't extract the formula as a single string such as "response ~ var1". If I convert it to a character object, it comes out in separate pieces ("response", "~", "var1"). If I store the values in a list (as below), I get this:

[[1]]
response ~ NULL

[[2]]
[1] 415.89

[[3]]
[1] 0

And I can't easily plug those list elements into a data frame. Here is what I tried:

selection <- matrix(ncol=3)
colnames(selection) <- c("formula","AIC","R2") # create a df to store results in
for ( i in 1:length(formulas)){
  mod <- lm( formula = formulas[[i]], data= df)
  mod_vals <- c(extract(formulas[[i]]), 
                round(AIC(mod),2), 
                round(summary(mod)$adj.r.squared,2)
  )
  selection[i,] <- mod_vals[]
}

Any ideas? It doesn't have to stay a for loop; I just want a way to fit a long list of models together.

Thanks.

Jake L
  • Does this answer your question? [R: repeat linear regression for all variables and save results in a new data frame](https://stackoverflow.com/questions/58949703/r-repeat-linear-regression-for-all-variables-and-save-results-in-a-new-data-fra) – UseR10085 Feb 10 '20 at 07:47
  • Does this answer your question? [How to convert R formula to text?](https://stackoverflow.com/questions/14671172/how-to-convert-r-formula-to-text) – Benjamin Schwetz Feb 10 '20 at 07:48

1 Answer


You could use lapply to loop over the formulas, extract the relevant statistics from each model, and bind the resulting one-row data frames together with rbind. Note that the r_square column holds the adjusted R-squared.

do.call(rbind, lapply(formulas, function(x) {
  mod <- lm(x, data = df)
  data.frame(formula = format(x),
             AIC = round(AIC(mod), 2),
             r_square = round(summary(mod)$adj.r.squared, 2))
}))

#                formula    AIC r_square
#1        response ~ NULL 405.98     0.00
#2        response ~ var1 407.54    -0.01
#3        response ~ var2 407.90    -0.02
#4 response ~ var1 + var2 409.50    -0.03
#5 response ~ var1 * var2 410.36    -0.03
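
Since the question also asks about sorting, here is a minimal sketch that wraps the same approach in a helper and orders the table by AIC. The names fit_summary, results, and ranked are illustrative and not part of the original answer. It builds the formula label with paste(deparse(f), collapse = " "), on the assumption that some formulas may be long enough for deparse() to split them across several strings. as.character() is avoided because it breaks a formula into its operator and operands, which is the multi-piece result described in the question.

```r
set.seed(1)
df <- data.frame(response = runif(50, min = 50, max = 100),
                 var1 = sample(1:20, 50, replace = TRUE),
                 var2 = sample(40:60, 50, replace = TRUE))

formulas <- list(response ~ NULL,
                 response ~ var1,
                 response ~ var2,
                 response ~ var1 + var2,
                 response ~ var1 * var2)

# as.character(response ~ var1) returns three pieces: "~" "response" "var1".
# deparse() keeps the formula as written; paste() joins it back together
# in case deparse() wrapped a long formula over several strings.

# Hypothetical helper: fit one model, return a one-row summary.
fit_summary <- function(f, data) {
  mod <- lm(f, data = data)
  data.frame(formula = paste(deparse(f), collapse = " "),
             AIC = round(AIC(mod), 2),
             adj_r_square = round(summary(mod)$adj.r.squared, 2),
             stringsAsFactors = FALSE)  # keep labels as character in R < 4.0
}

results <- do.call(rbind, lapply(formulas, fit_summary, data = df))

# Sort ascending by AIC so the preferred model comes first.
ranked <- results[order(results$AIC), ]
ranked
```

Lower AIC is better, so the first row of ranked is the preferred model by that criterion.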

Or with purrr:

purrr::map_df(formulas, ~{
  mod <- lm(.x, data = df)
  data.frame(formula = format(.x),
             AIC = round(AIC(mod), 2),
             r_square = round(summary(mod)$adj.r.squared, 2))
})
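
In purrr 1.0.0 and later, map_df() is marked as superseded in favor of map() followed by list_rbind(). The sketch below shows the same computation in that style; it assumes purrr >= 1.0.0 is installed, and the name results is illustrative. map_df() still works, so this is an alternative rather than a required change.

```r
library(purrr)  # assumption: purrr >= 1.0.0, which provides list_rbind()

set.seed(1)
df <- data.frame(response = runif(50, min = 50, max = 100),
                 var1 = sample(1:20, 50, replace = TRUE),
                 var2 = sample(40:60, 50, replace = TRUE))

formulas <- list(response ~ NULL,
                 response ~ var1,
                 response ~ var2,
                 response ~ var1 + var2,
                 response ~ var1 * var2)

# map() returns a list of one-row data frames; list_rbind() stacks them.
results <- list_rbind(map(formulas, function(f) {
  mod <- lm(f, data = df)
  data.frame(formula = format(f),
             AIC = round(AIC(mod), 2),
             r_square = round(summary(mod)$adj.r.squared, 2))
}))

results
```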
Ronak Shah