Creating a for loop to calculate AIC scores for different models using lm

Question

I'm trying to calculate AIC scores for several different models in a for loop. I have written a for loop that computes the log-likelihood for each model. However, I am stuck on how to write the lm call so that it fits a separate model of my column LOGABUNDANCE against each of columns 4 to 11 of my data frame. This is the code I have so far, but it gives me the same AIC score for every model.

# AIC score for every model 
LL <- rep(NA, 10)
AIC <- rep(NA, 10)

for(i in 1:10){
  mod <- lm(LOGABUNDANCE ~ . , data = butterfly)
  sigma = as.numeric(summary(mod)[6])
  LL[i] <- sum(log(dnorm(butterfly$LOGABUNDANCE, predict(mod), sigma)))
  AIC[i] <- -2*LL[i] + 2*(2)
}

How about you subset your columns of interest for each regression, inside the loop, before `mod <- lm...` ? — Yacine Hajji, Mar 18 '22 at 10:24

score 0 · Answer 1 · answered Mar 18 '22 at 10:48

You get the same AIC for every model because the loop fits the same model 10 times: the formula `LOGABUNDANCE ~ .` does not depend on `i`.

To make the code work, you need some way of changing the model in each iteration.

I can see two options:

1. Either subset the data at the start of each iteration so that it only contains LOGABUNDANCE and one other variable (as suggested by @yacine-hajji in the comments), or
2. Create a vector of the variables you want to build models with, and use as.formula() together with paste0() to create a new formula in each iteration.
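For completeness, option 1 could look roughly like the sketch below. It uses mtcars as stand-in data, and the names `predictor_cols`, `sub` and `aic_subset` are made up for illustration. With the `butterfly` data frame from the question, the column positions would presumably be 4:11.

```r
# Sketch of option 1: subset the data inside the loop (mtcars as stand-in data)
predictor_cols <- 2:ncol(mtcars)  # column positions of the candidate predictors

# One slot per model, named after the predictor it uses
aic_subset <- setNames(rep(NA_real_, length(predictor_cols)),
                       names(mtcars)[predictor_cols])

for (j in seq_along(predictor_cols)) {
  # Keep only the response and the j-th predictor, so `~ .` means one predictor
  sub <- mtcars[, c("mpg", names(mtcars)[predictor_cols[j]])]
  mod <- lm(mpg ~ ., data = sub)

  # Same hand-rolled score as in the question: Gaussian log-likelihood
  # evaluated at the residual standard error, with 2 parameters
  sigma <- summary(mod)$sigma
  ll <- sum(dnorm(sub$mpg, mean = fitted(mod), sd = sigma, log = TRUE))
  aic_subset[j] <- -2 * ll + 2 * 2
}

aic_subset
```

Because each subset contains exactly one predictor, `mpg ~ .` expands to a different single-predictor model in every iteration.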

I think solution 2 is easier. Here is a working example of solution 2, using mtcars:

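# AIC score for every model

# Say I want to model all variables against `mpg`:

# Create a vector of all variable names except mpg
variables <- names(mtcars)[-1]

# One slot per candidate model
LL <- rep(NA_real_, length(variables))
AIC <- rep(NA_real_, length(variables))

for (i in seq_along(variables)) {

  # Note how the formula is different in each iteration
  mod <- lm(
    as.formula(paste0("mpg ~ ", variables[i])),
    data = mtcars
  )

  # Residual standard error of the fitted model
  sigma <- summary(mod)$sigma
  # Gaussian log-likelihood of the observations around the fitted values
  LL[i] <- sum(dnorm(mtcars$mpg, mean = predict(mod), sd = sigma, log = TRUE))
  # 2 parameters per model: intercept and slope
  AIC[i] <- -2 * LL[i] + 2 * 2
}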

Output:

AIC
#>  [1] 167.3716 168.2746 179.3039 188.8652 164.0947 202.6534 190.2124 194.5496
#>  [9] 200.4291 197.2459
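As a possible cross-check, the same set of single-predictor models can be scored with R's built-in `stats::AIC()`, building each formula with `reformulate()`. This is only a sketch, and the names `predictors` and `aic_builtin` are chosen here for illustration. The built-in function uses the maximum-likelihood log-likelihood and counts the residual variance as an estimated parameter, so the absolute values will differ from the hand-computed ones above. The ranking of these models stays the same, since every model has the same number of parameters.

```r
# Sketch: score each single-predictor model with the built-in AIC()
predictors <- setdiff(names(mtcars), "mpg")

aic_builtin <- sapply(predictors, function(v) {
  # reformulate("wt", response = "mpg") builds the formula mpg ~ wt
  mod <- lm(reformulate(v, response = "mpg"), data = mtcars)
  stats::AIC(mod)
})

# Sort so that the lowest (best) AIC comes first
sort(aic_builtin)
```

The result is a named vector, which makes it easier to see which predictor each score belongs to.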
