Looping through columns to analyse different dependent variables

Question

Here is my data frame (reproducible example):

set.seed(42)
n <- 6
dat <- data.frame(id=rep(1:n, 2), 
                  group= as.factor(rep(LETTERS[1:2], n/2)),
                  VD1 = rnorm(n),
                  VD2 = runif(n*2, min=0, max=100), 
                  VD3 = runif(n*2, min=0, max=100),
                  VD4 = runif(n*2, min=0, max=100),
                  VD5 = runif(n*2, min=0, max=100))

I am fitting the following mixed linear model (mlm) for one dependent variable, VD1:

library(lme4)
mlm_VD1 <- lmer(formula = VD1 ~ group + (1|id), data = dat)
summary(mlm_VD1)

I would like to automate the analysis of all the other dependent variables (VD2, VD3, VD4, VD5) by looping through the columns of my data frame (dat[, 4:ncol(dat)]).

I would then like to save the summaries of all the models (mlm_VD1, mlm_VD2, mlm_VD3, mlm_VD4, mlm_VD5) in a PDF file that I can read outside the R environment.

Thanks!

jsv · Accepted Answer · 2021-03-19T17:19:38.207

Adding to the solution provided by akrun:

library(broom.mixed)
library(lme4)
library(purrr)

Index the columns by position (3:7 covers VD1 to VD5):

var_names <- names(dat)[3:7]

# Fit one model per column and stack the tidied estimates into one data frame
output <- map_dfr(var_names, function(x) {
  formula_mlm <- as.formula(paste0(x, " ~ group + (1|id)"))
  lmer(formula_mlm, data = dat) %>%   # %>% is re-exported by purrr
    tidy() %>%
    dplyr::mutate(variable = x)       # record which column was modelled
})
head(output)


# A tibble: 6 x 7
  effect   group    term            estimate std.error statistic variable
  <chr>    <chr>    <chr>              <dbl>     <dbl>     <dbl> <chr>   
1 fixed    NA       (Intercept)      7.80e-1     0.223      3.50 VD1     
2 fixed    NA       groupB          -7.59e-1     0.315     -2.41 VD1     
3 ran_pars id       sd__(Intercept)  3.74e-1    NA         NA    VD1     
4 ran_pars Residual sd__Observation  2.10e-8    NA         NA    VD1     
5 fixed    NA       (Intercept)      7.91e+1    13.2        5.98 VD2     
6 fixed    NA       groupB          -2.97e+1    18.7       -1.59 VD2

Thank you once again! What if, instead of manually defining c("VD1","VD2","VD3","VD4","VD5"), I would like to give just the column positions and have the header strings of those columns appear in the "variable" column of the output? – Gianluca, Mar 19 '21 at 17:28

akrun · Answer 2 · 2021-03-19T16:56:24.133

We can use a loop. Subset the column names, i.e. those that start with 'VD' followed by one or more digits, then loop over them (nm1): build a formula with paste, fit it with lmer, and take the summary.

library(lme4)
nm1 <- grep('^VD\\d+', names(dat), value = TRUE)
out <- lapply(nm1, function(nm)
     summary(lmer(as.formula(paste(nm, '~ group + (1|id)')), data = dat)))
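
As a possible variation on the paste approach above, the formulas can also be built with base R's reformulate and kept in a named list, so that each result stays labelled with its column name. This is only a sketch: dv_names is a hypothetical stand-in for nm1, and no model is fitted here.

```r
# Assumed column names, matching the question's data frame
dv_names <- c("VD1", "VD2", "VD3", "VD4", "VD5")

# Build one formula per dependent variable; setNames() makes lapply()
# return a list whose elements are named after the columns
formulas <- lapply(setNames(dv_names, dv_names), function(nm) {
  reformulate(c("group", "(1 | id)"), response = nm)
})

# Each element can then be passed to lmer(), e.g. lmer(formulas$VD2, data = dat)
print(formulas$VD2)
```

Looping over formulas with lapply then gives a list of summaries named VD1 to VD5 rather than a list indexed only by position.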

If the columns should be selected by position, use:

i1 <- 3:7
out <- lapply(i1, function(i) 
        summary(lmer(as.formula(paste(names(dat)[i],
           '~ group + (1|id)')), data = dat)))
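
For the follow-up about saving the summaries as printed in a PDF, one possible approach that needs only base R is to capture the printed text and draw it onto PDF pages. The helper save_printed_pdf below is hypothetical (it is not part of lme4 or any package), and the demo uses summary(cars) as a stand-in so the sketch runs without fitted models.

```r
# Hypothetical helper: write the printed form of each object in a list
# to its own page of a PDF, using base graphics only
save_printed_pdf <- function(objs, file, cex = 0.7) {
  nms <- names(objs)
  if (is.null(nms)) nms <- paste0("model_", seq_along(objs))
  pdf(file, width = 8.5, height = 11)
  on.exit(dev.off())                       # close the device even on error
  for (i in seq_along(objs)) {
    txt <- capture.output(print(objs[[i]]))  # text exactly as printed
    par(mar = c(1, 1, 2, 1))
    plot.new()                             # starts a new page
    title(main = nms[i])
    text(0, 1, paste(txt, collapse = "\n"),
         adj = c(0, 1), family = "mono", cex = cex)
  }
  invisible(file)
}

# Stand-in demo; with the models above one would instead run something like
#   names(out) <- names(dat)[i1]
#   save_printed_pdf(out, "mlm_summaries.pdf")
demo_file <- save_printed_pdf(list(example = summary(cars)),
                              tempfile(fileext = ".pdf"))
```

Very long summaries may run off the page; lowering cex or splitting the captured lines across pages would be needed in that case.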

edited Mar 19 '21 at 16:56

answered Mar 19 '21 at 16:51


Thanks, how can I index the columns for nm1 by position in the data frame rather than by name? – Gianluca Mar 19 '21 at 16:55
Thank you so much, it works. How can I save the summary tables of the different mlm in a PDF? – Gianluca Mar 19 '21 at 17:00
@Gianluca Did you mean the summary table as printed or as a data.frame? – akrun Mar 19 '21 at 17:04
The summary table as printed. – Gianluca Mar 19 '21 at 17:17
@Gianluca It looks like you have accepted another answer, so it is probably solved. – akrun Mar 19 '21 at 17:17
