2

I am writing loops or functions in R, and I still haven't really understood how to do that. Currently, I need to write a loop/function (not sure which one would be better) to create several results of Mixed models within the same data frame.

sample dataset look like:

dataset <- read.table(text = 
"ID A_2 B_2 C_2 A_1 B_1 C_1 chkgp
M1  10  20  60  30  54  33  Treatment
M1  20  50  40  33  31  44  Placebo
M2  40  80  40  23  15  66  Placebo
M2  30  90  40  67  67  66  Treatment
M3  30  10  20  22  89  77  Treatment
M3  40  50  30  44  50  88  Placebo
M4  40  30  40  42  34  99  Treatment
M4  30  40  50  33  60  80  Placebo",header = TRUE, stringsAsFactors = FALSE)

I build a model to takeVariable _2 as the dependent variable and variable _1 as independent variable please have "lme4" package to run the mixed model

dep_vars<-grep("_2$",colnames(dataset),value = T) #This selects all variables ending in `_2` which should all be dependent variables.


#This removes the `_2` from the dependent variables which should give you the common stem which can be used to select both dependent and independent variables from your data frame.

reg_vars<-gsub("_2$","",dep_vars)

## To check that we have exact the correct variable which _2
dep_vars


[1] "A_2" "B_2" "C_2"

create a loop get all the result

full_results <- lapply(reg_vars, function(i) summary(lmer(paste0("log(",i,"_2)~",i,"_1+chkgp+(1|ID)"),data=dataset)))

to check the summary of the first model result

full_results[1]
[[1]]
Linear mixed model fit by REML ['lmerMod']
Formula: log(A_2) ~ A_1 + chkgp + (1 | ID)
   Data: dataset

REML criterion at convergence: 16.9

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-1.16981 -0.24161  0.04418  0.37744  0.95925 

Random effects:
 Groups   Name        Variance Std.Dev.
 ID       (Intercept) 0.1314   0.3625  
 Residual             0.1188   0.3446  
Number of obs: 8, groups:  ID, 4

Fixed effects:
                Estimate Std. Error t value
(Intercept)     3.293643   0.411924   7.996
A_1             0.004512   0.009844   0.458
chkgpTreatment -0.276792   0.253242  -1.093

Correlation of Fixed Effects:
            (Intr) A_1   
A_1         -0.795       
chkgpTrtmnt -0.068 -0.272

Question: I want to get the result of the chkgpTreatment std.error t value, P-value, upper CI and lower CI of each model and stored it in data frame like this


Depend_var  Indep_var mean difference
                                       p.value Upper ci Lower_ci
A_2 A_1 chkgpTreatment          
B_2 B_1 chkgpTreatment          
C_2 C_1 chkgpTreatment
vasili111
  • 6,032
  • 10
  • 50
  • 80
vka
  • 316
  • 1
  • 13
  • This question has a large amount of information in common with [your previous question](https://stackoverflow.com/questions/56120159/writing-loop-function-to-generate-various-linear-regressions-on-same-dataframe). If you are using parts of a previous question, you should refer to the previous question, with a link, such that others may have an easier time helping you out. – Oliver May 14 '19 at 18:14

1 Answers1

2

I think this produces the output you're looking for. To get p-values, I had to install the lmerTest package. This solution also uses the purrr and dplyr packages, so you'll need to install those if you want to use this.

library(lmerTest)
library(purrr)
library(dplyr)

dataset <- read.table(
   text =
      "ID A_2 B_2 C_2 A_1 B_1 C_1 chkgp
M1  10  20  60  30  54  33  Treatment
M1  20  50  40  33  31  44  Placebo
M2  40  80  40  23  15  66  Placebo
M2  30  90  40  67  67  66  Treatment
M3  30  10  20  22  89  77  Treatment
M3  40  50  30  44  50  88  Placebo
M4  40  30  40  42  34  99  Treatment
M4  30  40  50  33  60  80  Placebo",
   header = TRUE,
   stringsAsFactors = FALSE
)
#This selects all variables ending in `_2` which should all be dependent
#variables.
dep_vars <-
   grep("_2$", colnames(dataset), value = T) 


#This removes the `_2` from the dependent variables which should give you the
#common stem which can be used to select both dependent and independent
#variables from your data frame.

reg_vars <- gsub("_2$", "", dep_vars)

## To check that we have exact the correct variable which _2
dep_vars

full_results <-
   map(reg_vars, function(i) {
      summary(lmerTest::lmer(paste0("log(", i, "_2)~", i, "_1+chkgp+(1|ID)"),
                   data = dataset))
   })



results <- map_dfr(full_results, function(.x) {
   data.frame( 
      # extract indep_var name and replace "1" with "2"
      depend_var = paste0(substr(row.names(.x$coefficients)[2], 1, 2), "2"),
      # extract depend_var name
      indep_var = row.names(.x$coefficients)[2],
      # Get the coefficient associated with chkgpTrtmnt
      mean_difference = .x$coefficients[3, 1],
      # Get std. error
      se = .x$coefficients[3, 2],
      # Get p-value
      p.value = .x$coefficients[3, 5],
      # Calculate the CI by +/- 1.96 * the standard error
      lower_ci = (.x$coefficients[3, 1] - (1.96 * .x$coefficients[3, 4])),
      upper_ci = (.x$coefficients[3, 1] + (1.96 * .x$coefficients[3, 4]))
   )
})
Jake
  • 196
  • 1
  • 18
  • 1
    thanks for your answer @Jake but i am not getting the correct upper_ci and lower_ci and also i need the std error – vka May 14 '19 at 18:47
  • I edited my answer to add std error and fixed the CI calculation. – Jake May 14 '19 at 18:50
  • Thank you so much I have question why we need to use LmerTest instead of lme4 because I am getting a slight difference in the p-value and confidence interval while using LmerTest @Jake – vka May 14 '19 at 19:11
  • The version of lme4 I use (1.1-20) does not provide p-values or confidence intervals so I had to use the lmerTest package, which does. Maybe different versions of lme4 gave p-values? Otherwise, there is some inherent randomness in these methods since they are solved numerically so slight differences are to be expected. – Jake May 14 '19 at 19:12
  • thanks for the clarification could you please confidence interval once again i am getting a really different value – vka May 14 '19 at 19:36
  • What do you mean you are getting very different values? Considering the size of the standard error and the low number of observations, the confidence intervals will be rather large. It's possible your lmer function is calculating confidence intervals differently from mine. – Jake May 15 '19 at 14:39
  • I have questioned if I want to select an independent variable as A_1,A_2 and run the model with another dependent variable how can I do it – vka Jun 26 '19 at 01:26