0

I use a mixed model to separate the variance of some measurement into between- and within group variance. The model looks like this:

library(lme4)

k <- 100 
i <- 5

group_mean <- rnorm(k, 0, 1)
y <- rnorm(k*i, rep(group_mean, each = i), 0.5)

df <- data.frame(y, group = rep(1:k, each = i))
    
m <- lmer(y ~ (1|group), data = df)

I can extract between- and within group SD and confidence intervals:

VarCorr(m)
#>  Groups   Name        Std.Dev.
#>  group    (Intercept) 0.99958 
#>  Residual             0.50327
confint(m, oldNames = FALSE)
#> Computing profile confidence intervals ...
#>                           2.5 %     97.5 %
#> sd_(Intercept)|group  0.8640826 1.15724604
#> sigma                 0.4703110 0.54026452
#> (Intercept)          -0.3782316 0.02526364

How can I get confidence intervals for the sum of the variances (sd_(Intercept)|group^2 + sigma^2)? I.e. the estimated variance in y with only one observation per group.

JohannesNE
  • 1,343
  • 9
  • 14
  • 1
    You could bootstrap it? – Roland Jun 03 '21 at 09:13
  • That could be the solution. I can see that confint() has the option to supply a custom boot function. – JohannesNE Jun 03 '21 at 09:25
  • 1
    I would look into using `bootMer`. – Roland Jun 03 '21 at 09:37
  • After reading a bit of documentation, I can see that `confint()` can pass a model and custom function to `bootMer()`, and then calculate CI from the resulting bootstrap estimates (`confint(m, method = 'boot', FUN = get_total_sd, nsim = 10000)`). Works quite nicely, though it takes a few minutes for 10000 resamples of this simple dataset. Not unreasonable for my usecase though. – JohannesNE Jun 03 '21 at 09:43
  • You are encouraged to self-answer your question based on this. – Roland Jun 03 '21 at 10:04

1 Answers1

1

As commented by @Roland, it is possible to bootstrap this. lme4 provides a nice framework for this:

library(lme4)
#> Loading required package: Matrix

k <- 100 
i <- 5

group_mean <- rnorm(k, 0, 1)
y <- rnorm(k*i, rep(group_mean, each = i), 0.5)

df <- data.frame(y, group = rep(1:k, each = i))

m <- lmer(y ~ (1|group), data = df)

get_total_sd <- function(mod) {
    varCorr_df <- as.data.frame(VarCorr(mod))
    sd_components <- varCorr_df[['sdcor']]
    names(sd_components) <- varCorr_df[['grp']]

    sd_total <- sqrt(sum(sd_components^2))

    c(sd_components, "Total" = sd_total)
}

confint(m, method = 'boot', FUN = get_total_sd, nsim = 10000)
#>              2.5 %    97.5 %
#> group    0.7655064 1.0329894
#> Residual 0.4921279 0.5657032
#> Total    0.9290245 1.1600250

confint(m, oldNames = FALSE)
#> Computing profile confidence intervals ...
#>                            2.5 %    97.5 %
#> sd_(Intercept)|group  0.77542074 1.0439971
#> sigma                 0.49426283 0.5677789
#> (Intercept)          -0.01914245 0.3472213

Created on 2021-06-03 by the reprex package (v2.0.0)

The bootstrapped (percentile) CIs for the individual SDs are similar to the profiled, but not equal. I'm not sure which are better. This does take a few minutes to run, so I'm still interested in other approaches.

JohannesNE
  • 1,343
  • 9
  • 14