I am analyzing a study which contains 40 individuals, each rating 10 vignettes.

indiv     vign      score    score2    gender    
  1         1         5         3        1
  1         2         2         4        1   
  1         3         8         1        1
  .         .         .         .        .
  .         .         .         .        .
  .         .         .         .        .
  39       10         9         1        1 
  40        8         1         5        0 
  40        9         3         8        0 

I wanted to bootstrap, but soon realized that it does not make sense to resample individual vignette rows; we should resample persons instead (so each sampled person contributes all of their rows, around 10 per person).

The following function works, but it is the bottleneck for the function that calls it. How can this be done more efficiently?

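ResampleMultilevel <- function(data, groupvar) {
  ids <- unique(data[, groupvar])   # one entry per person
  n <- length(ids)

  # Draw n persons with replacement; sampling from the unique ids keeps
  # persons with more rows from being drawn more often.
  index <- sample(ids, n, replace = TRUE)

  resampled <- NULL      # one of the issues is that we do not know 
                         # the size of the matrix yet, since it may vary. 
  for (i in 1:n) {
    # Append all rows of the i-th drawn person (duplicates are kept)
    resampled <- rbind(resampled, data[data[, groupvar] == index[i], ])
  }
  return(resampled)
}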
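
One way to avoid growing `resampled` with `rbind` inside the loop is to look up each person's row numbers once and then subset the data a single time. The following is only a sketch under assumptions: `ResampleMultilevelFast` and the toy matrix `dat` are hypothetical names, and it assumes `data` is a matrix or data frame whose `groupvar` column identifies persons.

```r
# Hypothetical variant: collect row numbers per person, then subset once.
ResampleMultilevelFast <- function(data, groupvar) {
  groups <- data[, groupvar]
  # Row numbers of each person, as a list with one element per person
  rows_by_id <- split(seq_len(nrow(data)), groups)
  n <- length(rows_by_id)
  # Draw n list positions with replacement, so repeated persons are kept
  picked <- sample.int(n, n, replace = TRUE)
  # Concatenate the row numbers of every drawn person and subset in one step
  data[unlist(rows_by_id[picked], use.names = FALSE), , drop = FALSE]
}

# Toy data (an assumption for illustration): 40 persons, 10 rows each
set.seed(1)
dat <- cbind(indiv = rep(1:40, each = 10), vign = rep(1:10, 40), score = rnorm(400))
boot <- ResampleMultilevelFast(dat, "indiv")
dim(boot)  # 400 rows here, because every person has exactly 10 rows
```

With unequal numbers of rows per person, the number of rows in the result would vary from draw to draw, just as in the loop version.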

The issue with subset is that I couldn't find a way to keep duplicates.

a <- cbind(rep(1:40, each = 10), rep(1:10, 4), rnorm(40), rnorm(40))

index <- c(1,1)

subset(a, a[,1] == index)
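
A possible explanation for why duplicates get lost: `==` recycles `index` along the column, and the resulting logical mask can only keep or drop each row, never repeat it. The small demonstration below uses its own toy matrix (an assumption, with 10 rows for each of 40 ids) to show the difference between a logical mask and repeated integer row numbers.

```r
set.seed(1)
a <- cbind(rep(1:40, each = 10), rep(1:10, 40), rnorm(400), rnorm(400))
index <- c(1, 1)

# `==` recycles index along the column, so this mask equals a[, 1] == 1
mask <- a[, 1] == index
sum(mask)                      # 10 rows are TRUE, not 20

# A logical mask can only keep or drop a row; integer row numbers can repeat
rows <- which(a[, 1] == 1)
nrow(a[c(rows, rows), ])       # 20: id 1 appears twice
```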
Siguza
PascalVKooten

2 Answers

Based on the comments, I am amending my answer.

a <- cbind(rep(1:40, each = 10), rep(1:10, 4), rnorm(40), rnorm(40))
index <- c(1, 1, 3, 4, 2)
a[a[, 1] %in% index, ]
##       [,1] [,2]        [,3]        [,4]
##  [1,]    1    1  0.28135473  0.47970116
##  [2,]    1    2 -0.12628982  0.34862899
##  [3,]    1    3 -0.41140740  1.30204100
##  [4,]    1    4 -0.61163593 -1.13354157
##  [5,]    1    5 -0.31538238  1.42701315
##  [6,]    1    6 -0.20403098  2.13989392
##  [7,]    1    7  0.37681973  0.65843232
##  [8,]    1    8 -0.94062165  0.97246212
##  [9,]    1    9  0.63377352 -0.48948273
## [10,]    1   10 -0.39817929 -1.03607028
## [11,]    2    1  0.54866153 -0.55127459
## [12,]    2    2  0.08410140  0.01457366
## [13,]    2    3 -1.19006851  1.33213116
## [14,]    2    4 -0.47210092  0.83369309
## [15,]    2    5  0.75968678 -0.48212390
## [16,]    2    6 -1.00205770  0.56376027
## [17,]    2    7  0.67251644  0.07234657
## [18,]    2    8  0.73165780 -0.51483172
## [19,]    2    9 -0.26022238  2.33181762
## [20,]    2   10  0.03370091 -0.71427295
## [21,]    3    1  0.60810461  0.15054307
## [22,]    3    2 -1.29363706  1.30510127
## [23,]    3    3 -0.20479713 -2.39797975
## [24,]    3    4 -0.86927664 -0.10845738
## [25,]    3    5  0.89040130 -0.08459249
## [26,]    3    6 -0.21511823  1.33960644
## [27,]    3    7 -0.32413278 -0.31691484
## [28,]    3    8 -0.61545941 -0.10457591
## [29,]    3    9 -1.85072358  0.93267270
## [30,]    3   10  0.38456423  0.76231047
## [31,]    4    1  0.76016236  1.63854054
## [32,]    4    2 -0.94463491  1.87271085
## [33,]    4    3  1.62451250  1.63298961
## [34,]    4    4 -1.96908559  0.89058201
## [35,]    4    5  1.66755533  0.10288947
## [36,]    4    6 -0.02182803 -0.91358891
## [37,]    4    7 -0.09382921 -0.54950093
## [38,]    4    8  0.74597002  2.31924468
## [39,]    4    9  0.64732694  0.29681494
## [40,]    4   10 -0.66535049  1.81285111
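
Note that the output above has 40 rows even though `index` contains the id 1 twice: `%in%` only checks whether a row's id occurs anywhere in `index`, so each row is returned once. If the repeats matter, as they do for a bootstrap, one possible workaround is to count how often each id was drawn and repeat the rows that many times. This is a sketch on a toy matrix; the names `ids`, `draws`, `times` and `resampled` are made up for the illustration.

```r
set.seed(42)
a <- cbind(rep(1:40, each = 10), rep(1:10, 40), rnorm(400), rnorm(400))
index <- c(1, 1, 3, 4, 2)

# %in% asks "is this row's id anywhere in index?", so each row appears once
nrow(a[a[, 1] %in% index, ])   # 40, although index names 5 draws of 10 rows

# Count how often each id was drawn (0 for ids that were not drawn)
ids <- unique(a[, 1])
draws <- table(factor(index, levels = ids))

# Look up the draw count for every row and repeat each row that many times
times <- as.integer(draws[as.character(a[, 1])])
resampled <- a[rep(seq_len(nrow(a)), times = times), ]
nrow(resampled)                # 50: id 1 contributes 20 rows
```

The rows of a repeated person come out interleaved (each row followed by its copy) rather than as two separate blocks, which should not matter for statistics that ignore row order.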
CHP

index <- 5:10

This almost works, except that the result is a list of matrices rather than the single matrix I would like it to be.

lapply(index, function(x) a[which(a[,1] == x),])
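
The list returned by `lapply` can be stacked into one matrix with `do.call(rbind, ...)`, which binds all pieces in a single call. A minimal sketch on a toy matrix (the names `pieces` and `stacked` are made up here); `drop = FALSE` is added so that an id with a single row would still come back as a matrix.

```r
set.seed(1)
a <- cbind(rep(1:40, each = 10), rep(1:10, 40), rnorm(400), rnorm(400))
index <- c(5, 5, 7)

# One matrix per drawn id; repeated ids produce repeated pieces
pieces <- lapply(index, function(x) a[a[, 1] == x, , drop = FALSE])

# Bind all pieces at once rather than growing a matrix inside a loop
stacked <- do.call(rbind, pieces)
dim(stacked)   # 30 rows, 4 columns
```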

This also almost gets there, but it only works for a single value such as 2. A non-loop way to do the same for a whole vector of values would be great:

a[which(a[,1] == 2),]       # works
a[which(a[,1] == index), ]  # does not work
PascalVKooten