How do I simulate data for a power analysis of a repeated measure linear mixed effects regression using simr?

Question

I would like to conduct a simulation-based power analysis for a linear mixed model in lmer with repeated measures from scratch. I understand that simr might be the package to go with. However, I do not understand what I have to do and am not able to make sense of anything I read online.

I want to test a random intercept model where

y (a z-standardized outcome variable) ~
- a + b (z-standardized covariates)
- + c + d (binary covariates)
- + time1 + time2 (time dummies indicating measurement at t1 or t2 as opposed to t0)
- + group (a group with two expressions)
- + group:time1 + group:time2
- + (1 | participant)

i.e., three measurements are nested in participants using a random intercept.

How do I simulate data so that I am able to find out what would be an appropriate sample size for an interaction effect (e.g. group:time1) of beta = .3 to be found with a power of .80? And which additional info do I need to do that?

Robert Long · Answer 1 · 2022-01-26T19:53:58.043

There are packages such as simr which will do all of this, and more, for you (and will handle unbalanced designs too), but here is a simple approach to simulating data for a mixed model, which you can then use in a power analysis, from scratch:

There are several important parameters to consider:

the overall data size
whether the data are balanced or not
the number of participants
the fixed effects
the random effects variance
the residual variance

Here we use a balanced design, which makes the setup easier, but it would not be too hard to extend it to unbalanced designs:

library(lme4)
set.seed(15)

N <- 10    # number of unique values of a and b for each participant
N.participant <- 20   # number of participants
betas <- c(10, 1, 2, 3, 4, 5, 6, 7)   # fixed effects
s <- 0.5   # variance of random effects
e <- 1  # residual variance

# form the balanced data frame
dt <- expand.grid(a = rnorm(N), b = rnorm(N), c = c(0,1), d = c(0,1), time = c(0,1,2), group = c(0,1), participant = LETTERS[1:N.participant])

# form the model matrix for fixed effects
X <- model.matrix(~ a + b + c + d + time + group + time:group, data = dt)

# set a vector of 1s for the outcome. This is needed so we can easily compute the
# model matrix for random effects (in this case just the random intercetps)
dt$y <- 1

# The final model formula
myFormula <- "y ~ a + b + c + d + time + group + time:group + (1 | participant)"

# obtain the model matrix for random effects
foo <- lFormula(eval(myFormula), dt)
Z <- t(as.matrix(foo$reTrms$Zt))

# simulate the random effects
u <- rnorm(N.participant , 0, s)

# simulate y using the general mixed model equation and residual variance e
dt$y <- X %*% betas + Z %*% u + rnorm(nrow(dt), 0, e)

# fit the model
m0 <- lmer(myFormula, dt)
summary(m0)

You would run this a number of times for each set of parameters, with different seeds of course, and compute the relevant Monte Carlo estimates, and the achieved power for each set.

Huge thanks for your answer and your willingness to help! I am still not able to understand everything you write, I guess it might help me if you`d either comment the code a bit more. Or could you help me how I would do it with simr? All the best to you! — Sebastian N, Jan 26 '22 at 17:24
You're welcome. I have already included several comments in the code already but if there is anything in particular that you don't understand, I can happily add some more. — Robert Long, Jan 26 '22 at 19:17

How do I simulate data for a power analysis of a repeated measure linear mixed effects regression using simr?

1 Answers1