I am analyzing a study which contains 40 individuals, each rating 10 vignettes.
indiv vign score score2 gender
1 1 5 3 1
1 2 2 4 1
1 3 8 1 1
. . . . .
. . . . .
. . . . .
39 10 9 1 1
40 8 1 5 0
40 9 3 8 0
I wanted to bootstrap the data, but soon realized that it does not make sense to resample individual vignette ratings; we should resample persons instead (so each sampled person brings along all of their roughly 10 rows).
The following function works, but it is kind of the bottleneck for the next function. The question is then, how can this be done more efficiently?
ResampleMultilevel <- function(data, groupvar) {
  ids <- unique(data[, groupvar])
  n <- length(ids)
  # draw n group ids with replacement; sampling from the unique ids keeps
  # every group equally likely even if group sizes differ
  index <- ids[sample.int(n, n, replace = TRUE)]
  resampled <- NULL  # one of the issues is that we do not know
                     # the size of the matrix yet, since it may vary.
  for (i in seq_len(n)) {
    resampled <- rbind(resampled, data[data[, groupvar] == index[i], ])
  }
  return(resampled)
}
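For comparison, here is a minimal sketch of one way the repeated rbind might be avoided: look up the row numbers of each group once, then index the data a single time. The name ResampleMultilevelFast and the toy data frame d are hypothetical and only for illustration; I have not benchmarked it on the real data.

```r
# Hypothetical helper; a sketch, not a tested drop-in replacement.
ResampleMultilevelFast <- function(data, groupvar) {
  # Row numbers belonging to each group, computed once
  rows_by_group <- split(seq_len(nrow(data)), data[, groupvar])
  # Draw group labels with replacement; a label may appear several times
  drawn <- sample(names(rows_by_group), length(rows_by_group), replace = TRUE)
  # Concatenate the row numbers of the drawn groups and index once
  data[unlist(rows_by_group[drawn], use.names = FALSE), , drop = FALSE]
}

# Toy data with the same shape as the study (40 persons x 10 vignettes)
set.seed(1)
d <- data.frame(indiv = rep(1:40, each = 10),
                vign  = rep(1:10, times = 40),
                score = rnorm(400))
boot <- ResampleMultilevelFast(d, "indiv")
```

Because the result is built from one vector of row numbers, its size does not need to be known in advance, and a person drawn twice simply contributes their row numbers twice.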
The issue with subset is that I couldn't find a way to keep duplicates: a person who is drawn twice still appears only once in the result.
a <- cbind(rep(1:40, each = 10), rep(1:10, 40), rnorm(400), rnorm(400))
index <- c(1,1)
subset(a, a[,1] == index)
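As far as I can tell, the duplicates disappear because a logical condition can only mark each row as kept or dropped, whereas integer row numbers can repeat. A small sketch of the difference, using a hypothetical toy matrix (toy) rather than the real data:

```r
toy <- cbind(rep(1:3, each = 2), 1:6)  # 3 persons, 2 rows each
index <- c(1, 1, 3)                    # person 1 drawn twice

# Logical mask: each row is selected at most once, so the duplicate is lost
masked <- toy[toy[, 1] %in% index, , drop = FALSE]

# Integer row numbers, one lookup per drawn id, so the duplicate survives
rows <- unlist(lapply(index, function(id) which(toy[, 1] == id)))
repeated <- toy[rows, , drop = FALSE]
```

Here masked has 4 rows (persons 1 and 3 once each), while repeated has 6 rows (person 1 twice, then person 3).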