last year around August I wrote the following code to simulate power for multilevel models. It worked fine for a long time, but now it seems that broom::tidy() has been deprecated for multilevel models. I don't understand how this impacts my code exactly but here's what's happening:
#Power simulation for mixed effects models
library(nlme)
library(dplyr)
library(tidyr)
library(purrr)
# 1. First, determine your parameters. The below numbers will create 4 dfs with different Ns
#(eg N=5, 10, 50, & 200) and cross each level with the other level of the other parameters specified.
params <- expand.grid(
N = c(50, 135, 1000), #this is the N for each sample df
B = c(.1, .3, .5), #the effect size B, a linear coefficient of x predicting y
items = c(5, 10, 20)) #the number of repeating within-subject time points, items, etc
n_samples <- 100 #how many simulations you want to run
#2. Then, run this entire chunk with cmd + shift + enter
#Don't modify anything below
###############################################################
#create a sample
sim_sample <- function(N, B, items){
tid <- rep(1:items, N) #time id, not really needed but maybe useful
subj <- rep(1:N, each=items) #subject id
x = sample(1:6, items*N, replace = T) #a fixed effect predictor
y <- B*x + rnorm(n = items*N) #y is defined as being predicted by x + random error
smaller.df <- data.frame(tid, subj, x, y) #combine them into a df
lmeout <- summary(lme(y ~ x, random=~1|subj, data=smaller.df))$tTable %>%
data.frame()
broom::tidy(lmeout) #build a model, extract coefs, put coefs into a table
}
#another function to simulate all the parameters specified above
sim_all_params <- function(params) {
sim_results <- purrr::pmap(params, sim_sample)
params %>%
mutate(sim_results = sim_results) %>%
tidyr::unnest(sim_results)
}
#execute the simulation and re-run it the pre-specified amount of times
#sim_all_params(params)
full_results <- purrr::rerun(.n = n_samples, sim_all_params(params)) %>%
bind_rows(.id = 'sample_num') %>%
as.data.frame(full_results)
#save the full results in a table
table <- full_results %>%
#dplyr::filter(rownames == 'x') %>% #this select only the rows for the x predictor in the case that you have multiple predictors
dplyr::group_by(N, B, items) %>%
dplyr::summarise(power = sum(p.value < .05)/n()) #compute power for each row
#print the table
table
the simulation code no longer works for lme() power estimates are always around ~50%; if i increase sample sizes, power does not increase I've tried to use tibble() and data.frame() instead of broom::tidy() but I get weird results like power = 1.5 or 2.5.
I really don't understand what the deal is and why this change of broom::tidy() would affect this code so much.
On the other hand, the power simulations work fine when using the following code for non-multilevevel design (eg t-tests):
#Power simulation for t test
library(nlme)
library(dplyr)
library(tidyr)
library(purrr)
# 1. First, determine your parameters. The below numbers will create 4 dfs with different Ns
#(eg N=5, 10, 50, & 200) and cross each level with the other level of the other parameters specified.
params <- expand.grid(
N = c(20, 75, 150, 300), #this is the n in *each* group
d = c(.2, .3, .4, .5), #standardised mean difference (ie Cohen's d)
int = c(1, 1, 1, 1) #this is the intercept (ie mean)
)
n_samples <- 100 #how many simulations you want to run
#2. Then, run this entire chunk with cmd + shift + enter
#Don't modify anything below
###############################################################
#create a sample
sim_sample <- function(N, d, int){
d1 <- data.frame(x = rnorm(N, int, 1),
condition = 0)
d2 <- data.frame(x = rnorm(N, (int+d), 1),
condition = 1)
smaller.df <- rbind(d1, d2)
tout <- t <- broom::tidy(t.test(x ~ condition, smaller.df)) #build a model, extract coefs
#broom::tidy(tout) #put coefs into a table
}
#another function to simulate all the parameters specified above
sim_all_params <- function(params) {
sim_results <- purrr::pmap(params, sim_sample)
params %>%
mutate(sim_results = sim_results) %>%
tidyr::unnest(sim_results)
}
#execute the simulation and re-run it the pre-specified amount of times
#sim_all_params(params)
full_results <- purrr::rerun(.n = n_samples, sim_all_params(params)) %>%
bind_rows(.id = 'sample_num')
#save the full results in a table
table <- full_results %>%
group_by(N, d) %>%
summarise(power = sum(p.value < .05)/n()) #compute power for each row
#print the table
table
Can anyone help me
- fix the first code
- explain why it no longer works?
Thanks!