I am trying to fit multiple linear regression models from a list of variable combinations (I also have them as a separate data frame, if that is more useful).

The list of variables looks like this:

Vars
x1+x2+x3
x1+x2+x4
x1+x2+x5
x1+x2+x6
x1+x2+x7

The loop I'm using looks like this:

for (i in 1:length(var_list)){
  lm(independent_variable ~ var_list[i],data = training_data)
  i+1
}

However, the string in var_list[i] (x1+x2+x3, etc.) is not recognized as a model input.

Does anyone know how to fix it?

Thanks for your help.

C L

2 Answers

You don't even need a loop; lapply works nicely.

training_data <- as.data.frame(matrix(sample(1:64), nrow = 8))
colnames(training_data) <- c("independent_variable", paste0("x", 1:7))

Vars <- as.list(c("x1+x2+x3",
                "x1+x2+x4",
                "x1+x2+x5",
                "x1+x2+x6",
                "x1+x2+x7"))

allModelsList <- lapply(paste("independent_variable ~", Vars), as.formula)
allModelsResults <- lapply(allModelsList, function(x) lm(x, data = training_data))  

If you need the model summaries, you can add:

allModelsSummaries <- lapply(allModelsResults, summary)

For example, you can access the R² (coefficient of determination) of the model lm(independent_variable ~ x1+x2+x3) like this:

allModelsSummaries[[1]]$r.squared
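
To compare all the fits at once, the same idea extends to the whole list. The following is only a minimal sketch: it recreates the toy data from above (with a seed added so the numbers repeat), and the names `formulas`, `models`, `r2` and `best` are illustrative, not part of the original answer.

```r
# Recreate the toy data; set.seed() is an addition so results are repeatable.
set.seed(1)
training_data <- as.data.frame(matrix(sample(1:64), nrow = 8))
colnames(training_data) <- c("independent_variable", paste0("x", 1:7))

Vars <- c("x1+x2+x3", "x1+x2+x4", "x1+x2+x5", "x1+x2+x6", "x1+x2+x7")

# Fit one model per right-hand side and name each list element after it.
formulas <- lapply(paste("independent_variable ~", Vars), as.formula)
models   <- lapply(formulas, function(f) lm(f, data = training_data))
names(models) <- Vars

# Pull R-squared out of every summary into a named numeric vector.
r2 <- vapply(models, function(m) summary(m)$r.squared, numeric(1))

# Right-hand side of the model with the highest R-squared.
best <- names(r2)[which.max(r2)]
print(r2)
print(best)
```

Naming the list elements makes it easier to see which variable combination each value belongs to.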

I hope it helps.

DjibSA

We can create the formula with paste and store each fitted model in a list:

out <- vector('list', length(var_list))

for (i in seq_along(var_list)){
  out[[i]] <- lm(paste('independent_variable',  '~', var_list[i]),
               data = training_data)
}

Alternatively, the formula can be built with reformulate inside the same loop:

lm(reformulate(var_list[i], 'independent_variable'), data = training_data)
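
Written out as a complete loop, the `reformulate` version might look like the sketch below. It assumes `var_list` could be a list rather than a character vector, so each element is extracted with `[[` and passed through `as.character()`; with `[`, a list yields a length-one list, which `reformulate` rejects because it requires a character vector. The toy `training_data` and the helper names `rhs` and `f` are made up for illustration.

```r
# Hypothetical toy data standing in for the asker's training_data.
set.seed(42)
training_data <- data.frame(independent_variable = rnorm(20),
                            x1 = rnorm(20), x2 = rnorm(20),
                            x3 = rnorm(20), x4 = rnorm(20))

# Assumed to be a list here; a character vector works the same way.
var_list <- list("x1+x2+x3", "x1+x2+x4")

out <- vector("list", length(var_list))
for (i in seq_along(var_list)) {
  rhs <- as.character(var_list[[i]])   # plain string such as "x1+x2+x3"
  f   <- reformulate(rhs, response = "independent_variable")
  out[[i]] <- lm(f, data = training_data)
}

# Each element is a fitted model; check the terms that were estimated.
print(names(coef(out[[2]])))
```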
akrun
  • Thank you for the quick answer. For the first option it says: object independent_variable not found. And for the second option it says: 'termlabels' must be a character vector of length at least one. – C L Dec 03 '19 at 16:22
  • @CL It is inside the `for` loop. Also, I didn't get the last part of your message – akrun Dec 03 '19 at 16:22
  • I'm running the code in the same for loop as previously: ` for (i in 1:length(var_list)){ lm(paste(independent_variable, "~", var_list[i]),data = training_data) i+1 } ` – C L Dec 03 '19 at 16:23
  • @CL I thought it was an object name. I guess it should be quoted if it is a column name. Updated the post – akrun Dec 03 '19 at 16:24
  • Perfect, it runs now but doesn't seem to produce any visible output. Is there something I'm missing? Thanks again for your help with this. – C L Dec 03 '19 at 16:43
  • @CL You are not storing the output. Please check the update – akrun Dec 03 '19 at 16:47
  • Brilliant, it works really well. Thanks for the quick reply and helpful updates. – C L Dec 03 '19 at 16:51
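
The last part of this exchange comes down to how R treats loop bodies: a value computed inside a `for` loop is not printed automatically, so each fitted model has to be stored or printed explicitly. The loop also advances `i` by itself, so a bare `i+1` has no effect. A small sketch with made-up data (the names `fits` and the toy `training_data` are illustrative assumptions):

```r
# Made-up data for illustration only.
set.seed(7)
training_data <- data.frame(independent_variable = rnorm(15),
                            x1 = rnorm(15), x2 = rnorm(15), x3 = rnorm(15))
var_list <- c("x1+x2", "x1+x3")

# Nothing is shown or kept here: the result of lm() is discarded on each pass.
for (i in seq_along(var_list)) {
  lm(as.formula(paste("independent_variable ~", var_list[i])),
     data = training_data)
}

# Storing each fit keeps it available after the loop ends.
fits <- vector("list", length(var_list))
for (i in seq_along(var_list)) {
  fits[[i]] <- lm(as.formula(paste("independent_variable ~", var_list[i])),
                  data = training_data)
}

# An explicit print() (or summary()) is needed to see a result.
print(coef(fits[[1]]))
```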