2

Suppose I have a dataset like (forget about the distribution):

modData <- data.frame("A" = rnorm(20, 15, 3),
                      "B" = rnorm(20, 20, 3),
                      "C" = rnorm(20, 25, 3),
                      "X" = rnorm(20, 5, 1)
                      )

If I use X as a predictor, A, Band C as responses, respectively:

md1 <- lm(A ~ X, data = modData)
md2 <- lm(B ~ X, data = modData)
md3 <- lm(C ~ X, data = modData)

Then do a Shapiro test and a boxcox test to every model, e.g. :

shapiro.test(residuals(md1))
boxcox(md1, plotit = T)

Is there a convenient way to build and test multiple models without manually typing each of them?

Tianjian Qin
  • 525
  • 4
  • 14
  • Thank you for providing a reproducible example and clear statement of what you're after. Novice visitors could learn a lot from questions like this one. – Roman Luštrik May 19 '18 at 11:24

4 Answers4

5

Here's an alternative approach using tidyverse:

modData <- data.frame("A" = rnorm(20, 15, 3),
                      "B" = rnorm(20, 20, 3),
                      "C" = rnorm(20, 25, 3),
                      "X" = rnorm(20, 5, 1))
library(tidyverse)
library(broom)

# specify predictor and target variables
x = "X"
y = names(modData)[names(modData)!= x]

expand.grid(y,x) %>%                                    # create combinations
  mutate(model_id = row_number(),                       # create model id
         frml = paste0(Var1, "~", Var2)) %>%            # create model formula
  group_by(model_id, Var1, Var2) %>%                    # group by the above
  nest() %>%                                            # nest data
  mutate(m = map(data, ~lm(.$frml, data = modData)),    # create models
         m_table = map(m, ~tidy(.)),                    # tidy model output
         st = map(m, ~shapiro.test(residuals(.)))) -> dt_model_info  # shapiro test

# access model info
dt_model_info
dt_model_info$m
dt_model_info$m_table
dt_model_info$st

# another way to access info
dt_model_info %>% unnest(m_table)
AntoniosK
  • 15,991
  • 2
  • 19
  • 32
2

Here is an approach by using simple "lapply":

# 1. Data set
df <- data.frame(
  a = rnorm(20, 15, 3),
  b = rnorm(20, 20, 3),
  c = rnorm(20, 25, 3),
  x = rnorm(20, 5, 1))

# 2. Models
fit_lm_a <- lm(a ~ x, df)
fit_lm_b <- lm(b ~ x, df)
fit_lm_c <- lm(c ~ x, df)

# 3. List of models
list_fit_lm <- list(fit_lm_a, fit_lm_b,fit_lm_c)

# 3. Shapiro test
lapply(
  list_fit_lm, function(x) {
    shapiro.test(residuals(x)) 
  })

 # 4. Box-Cox Transformations
 lapply(
  list_fit_lm, function(x) {
    boxcox(x, plotit = TRUE, data = df)
  }
 )
Andrii
  • 2,843
  • 27
  • 33
2

In case you wouldn't want to introduce dozens of dependencies, you can do it with a simple sapply. Notice that I don't provide a boxcox part because I don't know where it comes from (car, MASS?).

modData <- data.frame("A" = rnorm(20, 15, 3),
                      "B" = rnorm(20, 20, 3),
                      "C" = rnorm(20, 25, 3),
                      "X" = rnorm(20, 5, 1))

deps <- c("A", "B", "C")
indeps <- c("X")

result <- sapply(deps, FUN = function(x, indeps, mydata) {
  myformula <- formula(sprintf("%s ~ %s", x, indeps))

  model <- lm(myformula, data = mydata)
  out.shapiro <- shapiro.test(residuals(model))

  return(list(model = model, shapiro = out.shapiro))
}, indeps = indeps, mydata = modData, simplify = FALSE)
Roman Luštrik
  • 69,533
  • 24
  • 154
  • 197
1
data("mtcars")

formulas <- list(
  mpg ~ disp,
  mpg ~ disp + wt
)

res <- vector("list", length = length(formulas))

for(i in seq_along(formulas)){
  res[[i]] <- lm(formulas[[i]], data = mtcars)}
res

lapply(formulas, lm, data = mtcars)
Seyma Kalay
  • 2,037
  • 10
  • 22
  • While this code may provide a solution to the question, it's better to add context as to why/how it works. This can help future users learn, and apply that knowledge to their own code. You are also likely to have positive feedback from users in the form of upvotes, when the code is explained. – Amit Verma Jan 26 '21 at 12:17
  • You should provide an explanation and a rationale with an answer, otherwise you risk that this answer would be deleted. – Bas Jansen Jan 26 '21 at 15:09