My dataset (call it data_xy) looks like this:
id X Y
1 5 10
1 6 11
1 4 8
2 3 9
2 3 12
3 4 10
...
The data contain observations from a total of N ids. Each id has several rows of measurements.
I want to bootstrap the ids with replacement, so the bootstrap sample of ids will very likely contain duplicates:
b_idx <- sample.int(N, N, replace = TRUE)
A typical draw looks like
b_idx=c(1,1,3,4,4,4....)
How do I then create the bootstrap sample from b_idx? If I do
data_xy[data_xy$id==b_idx,]
each id (with its repeated measures) will occur only once in my bootstrap dataset. What I really want is to replicate the observations for id=k as many times as this id occurs in b_idx. How can I achieve this?
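
To make the desired result concrete, here is a minimal sketch of one way it could be produced in base R. It assumes a toy data_xy containing only the six rows shown above (ids 1 to 3). The names ids, b_ids, pieces, boot_xy and the extra column boot_id are hypothetical and introduced only for illustration. The seed is arbitrary.

```r
# Toy data: only the rows shown in the question (ids 1 to 3).
data_xy <- data.frame(
  id = c(1, 1, 1, 2, 2, 3),
  X  = c(5, 6, 4, 3, 3, 4),
  Y  = c(10, 11, 8, 9, 12, 10)
)

ids <- unique(data_xy$id)   # the distinct ids
N   <- length(ids)          # number of ids

set.seed(1)                 # arbitrary seed, for reproducibility only
# Draw N ids with replacement; duplicates are expected.
b_ids <- ids[sample.int(N, N, replace = TRUE)]

# For every draw (duplicates included) take all rows of that id.
pieces <- lapply(seq_along(b_ids), function(j) {
  block <- data_xy[data_xy$id == b_ids[j], , drop = FALSE]
  block$boot_id <- j        # hypothetical column: tells repeated draws apart
  block
})

# Stack the blocks into one bootstrap data set.
boot_xy <- do.call(rbind, pieces)
rownames(boot_xy) <- NULL
```

Each id then appears in boot_xy once per occurrence in b_ids, with all of its repeated measures. The boot_id column is optional; it may be useful if a later step (for example a mixed model) has to treat two draws of the same id as separate clusters.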