Extracting the p-value from an lmerTest model inside a loop function

Question

I have a number of linear mixed models, which I have fitted with the lmerTest library, so that the summary() of the function would provide me with p-values of fixed effects.

I have written a loop function that extract the fixed effects of gender:time and gender:time:explanatory variable of interest.

Trying to now also extract the p-value of gender:time fixed effect (step 1) and also gender:time:explanatory variable (step 2).

Normally I can extract the p-value with this code:

coef(summary(model))[,5]["genderfemale:time"]

But inside the loop function it doesn't work and gives the error: "Error in coef(summary(model))[, 5] : subscript out of bounds"

See code

library(lmerTest)

# Create a list of models with interaction terms to loop over
models <- list(
  mixed_age_interaction,
  mixed_tnfi_year_interaction,
  mixed_crp_interaction
)

# Create a list of explanatory variables to loop over
explanatoryVariables <- list(
  "age_at_diagnosis",
  "bio_drug_start_year",
  "crp"
)

loop_function <- function(models, explanatoryVariables) {
  # Create an empty data frame to store the results
  coef_df <- data.frame(adj_coef_gender_sex = numeric(), coef_interaction_term = numeric(), explanatory_variable = character(), adj_coef_pvalue = numeric())
  
  # Loop over the models and explanatory variables
  for (i in seq_along(models)) {
    model <- models[[i]]
    explanatoryVariable <- explanatoryVariables[[i]]
    
    # Extract the adjusted coefficients for the gender*time interaction
    adj_coef <- fixef(model)["genderfemale:time"]
    
    # Extract the fixed effect of the interaction term
    interaction_coef <- fixef(model)[paste0("genderfemale:time:", explanatoryVariable)]
    
    # Extract the p-value for the adjusted coefficient for gender*time
    adj_coef_pvalue <- coef(summary(model))[,5]["genderfemale:time"]
    
    # Add a row to the data frame with the results for this model
    coef_df <- bind_rows(coef_df, data.frame(adj_coef_gender_sex = adj_coef, coef_interaction_term = interaction_coef, explanatory_variable = explanatoryVariable, adj_coef_pvalue = adj_coef_pvalue))
  }
  return(coef_df)
}

# Loop over the models and extract the fixed effects
coef_df <- loop_function(models, explanatoryVariables)
coef_df

My question is how can I extract the p-values from the models for gender:time and gender:time:explanatory variable and add them to the final data.frame coef_df?

Also adding a summary of one of the models for reference

Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: basdai ~ 1 + gender + time + age_at_diagnosis + gender * time +  
    time * age_at_diagnosis + gender * age_at_diagnosis + gender *  
    time * age_at_diagnosis + (1 | ID) + (1 | country)
   Data: dat

      AIC       BIC    logLik  deviance  df.resid 
 254340.9  254431.8 -127159.5  254318.9     28557 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.3170 -0.6463 -0.0233  0.6092  4.3180 

Random effects:
 Groups   Name        Variance Std.Dev.
 ID       (Intercept) 154.62   12.434  
 country  (Intercept)  32.44    5.695  
 Residual             316.74   17.797  
Number of obs: 28568, groups:  ID, 11207; country, 13

Fixed effects:
                                     Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)                         4.669e+01  1.792e+00  2.082e+01  26.048  < 2e-16 ***
genderfemale                        2.368e+00  1.308e+00  1.999e+04   1.810   0.0703 .  
time                               -1.451e+01  4.220e-01  2.164e+04 -34.382  < 2e-16 ***
age_at_diagnosis                    9.907e-02  2.220e-02  1.963e+04   4.463 8.12e-06 ***
genderfemale:time                   1.431e-01  7.391e-01  2.262e+04   0.194   0.8464    
time:age_at_diagnosis               8.188e-02  1.172e-02  2.185e+04   6.986 2.90e-12 ***
genderfemale:age_at_diagnosis       8.547e-02  3.453e-02  2.006e+04   2.476   0.0133 *  
genderfemale:time:age_at_diagnosis  4.852e-03  1.967e-02  2.274e+04   0.247   0.8052    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) gndrfm time   ag_t_d gndrf: tm:g__ gnd:__
genderfemal -0.280                                          
time        -0.241  0.331                                   
age_t_dgnss -0.434  0.587  0.511                            
gendrfml:tm  0.139 -0.519 -0.570 -0.293                     
tm:g_t_dgns  0.228 -0.313 -0.951 -0.533  0.543              
gndrfml:g__  0.276 -0.953 -0.329 -0.639  0.495  0.343       
gndrfml::__ -0.137  0.491  0.567  0.319 -0.954 -0.596 -0.516

If the code works outside the loop and you have coded your loop correctly, then that suggests there is an issue with one of your models. probably some missing input data or aliasing of model terms. But your example is not reproducible, so we can't be sure. Insert a `print` statement at the start of the `for` loop to identify which of the models has the problem. And then investigate further. As an aside, I believe that good maxim when coding in R is "If I'm thinking of using a `for` loop, there's probably a better way to do it". I suspect this may be a case in point. — Limey, Dec 19 '22 at 15:40
If you are calling the fifth coefficient with [,5] then "Subscript out of bounds" means that at least one of your models does not have five coefficients. If you are expecting five coefficients in every model, then one of your models probably did not run correctly, for whatever reason. — markyoung, Dec 19 '22 at 15:52
Thank you both for this very simple but working solution. The problem was in the third model, which hadn't run correctly, therefore, was out of bounds. Otherwise, the script works well. — Pashtun, Dec 19 '22 at 17:47

I_O · Answer 1 · 2022-12-19T16:55:02.017

The internal function get_coefmat of {lmerTest} might be handy:

if fm is an example {lmer} model ...

library("lmerTest")

fm <- lmer(Informed.liking ~ Gender + Information * Product +
             (1 | Consumer) + (1 | Consumer:Product),
           data=ham
           )

... you can obtain the coefficients including p-values as a dataframe like so (note the triple colon to expose the internal function):

df_coeff <- lmerTest:::get_coefmat(fm) |>
  as.data.frame()

output:

## > df_coeff
##                         Estimate Std. Error       df    t value     Pr(>|t|)
## (Intercept)            5.8490289  0.2842897 322.3364 20.5741844 1.173089e-60
## Gender2               -0.2442835  0.2605644  79.0000 -0.9375169 3.513501e-01
## Information2           0.1604938  0.2029095 320.0000  0.7909626 4.295517e-01
## Product2              -0.8271605  0.3453291 339.5123 -2.3952818 1.714885e-02
## Product3               0.1481481  0.3453291 339.5123  0.4290057 6.681912e-01
## ...

edit

Here's a snippet which will return you the extracted coefficents for, e.g., models m1 and m2 as a combined dataframe:

library(dplyr)
library(tidyr)
library(purrr)
library(tibble)

list('m1', 'm2') |> ## observe the quotes
  map_dfr( ~ list(
            model = .x,
            coeff = lmerTest:::get_coefmat(get(.x)) |>
              as.data.frame() |>
              rownames_to_column()
          )
          ) |> 
  as_tibble() |>
  unnest_wider(coeff)

output:


## + # A tibble: 18 x 7
##    model rowname               Estimate `Std. Error`    df `t value` `Pr(>|t|)`
##    <chr> <chr>                    <dbl>        <dbl> <dbl>     <dbl>      <dbl>
##  1 m1    (Intercept)              5.85         0.284 322.     20.6     1.17e-60
##  2 m1    Gender2                 -0.244        0.261  79.0    -0.938   3.51e- 1
##  ...
##  4 m1    Product2                -0.827        0.345 340.     -2.40    1.71e- 2
##  ...
##  8 m1    Information2:Product3    0.272        0.287 320.      0.946   3.45e- 1
##  ...
## 10 m2    (Intercept)              5.85         0.284 322.     20.6     1.17e-60
## 11 m2    Gender2                 -0.244        0.261  79.0    -0.938   3.51e- 1
## ...

Extracting the p-value from an lmerTest model inside a loop function

1 Answers1