Bootstrap samples by row of a data frame in R

Question

I am trying to run a simple bootstrap on the rows of a data frame in R. Here is what I have so far, but I'm hitting a dead end.

x1 <- c(1:5)
x2 <- c(6:10)
y <- runif(5)
z <- as.data.frame(rbind(x1, x2, y))

trial <- 10
avg <- rep(0, trial)
for(i in 1:trial){
  ind <- sample(ncol(z), size = ncol(z), replace = TRUE)
  z.boot <- z[ind, ]
  mean[i] <- mean(z.boot)
}
mean

Ideally, I would like to get a bootstrap weighted mean for the first and second rows, using the weights in the third row, but I can't even get my loop to work. There has to be a better way to do this. Any help is appreciated.

You sample from `ncol(z)`, but then you subset by row. You also try to find the `mean` of a `data.frame` (which is not defined). What are you trying to do? — nicola, Oct 25 '16 at 07:54
I suppose another option would be to sample using the weights as probabilities and looking for the median. Really, I'm just trying to learn this technique as applied to rows of a data frame. — mike, Oct 25 '16 at 17:01

score 0 · Answer 1 · answered Oct 25 '16 at 07:50

Try this. I don't quite follow your point about the weighted mean, but you may be able to work it out from here:

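# `data` stands for your own data frame (e.g. z from the question)
n <- seq(100, 500, 50)                  # size of each bootstrap sample
bootdata <- vector("list", length(n))   # one list element per sample
for (i in seq_along(n)) {
  # draw n[i] row indices with replacement, then keep those rows
  bootdata[[i]] <- data[sample(nrow(data), n[i], replace = TRUE), ]
}
bootdata
str(bootdata[[1]])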
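
The loop above can be tried end to end with a small stand-in data frame. This is only a sketch: `dat`, its columns, and the fixed seed are assumptions made for illustration, and `lapply()` replaces the explicit loop. Note that it draws samples larger than the data, as the answer does; a conventional bootstrap would use `nrow(dat)` as the sample size.

```r
set.seed(42)  # assumption: seed fixed only so the sketch is reproducible

# Hypothetical stand-in for `data`
dat <- data.frame(id = 1:20, value = rnorm(20))

# Requested size of each resample, as in the answer
n <- seq(100, 500, 50)

# Resample rows with replacement, once per requested size
bootdata <- lapply(n, function(size) {
  dat[sample(nrow(dat), size, replace = TRUE), ]
})

# Each list element has as many rows as requested
sapply(bootdata, nrow)

# One statistic per resample, here the mean of `value`
bootMeans <- sapply(bootdata, function(d) mean(d$value))
bootMeans
```

Once the resamples are in a list, any per-sample statistic can be computed the same way with `sapply()`.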

score 0 · Answer 2 · answered Jun 19 '19 at 07:28

Here is how a non-parametric bootstrap could be done, which seems to be the type you are attempting, judging by your code. Note that nrow(), not ncol(), is the right function when resampling rows. Each bootstrap sample is stored as an element of the list "bootResult" and can be retrieved by its index, e.g. bootResult[[2]], for use in later steps:

  nBoots <- 10  # number of bootstrap samples
  bootResult <- list()
  for (i in seq_len(nBoots)) {
    bootResult[[i]] <- z[sample(seq_len(nrow(z)), nrow(z), replace = TRUE), ]
  }
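
For the weighted mean described in the question, the observations appear to be the five columns of z (rows x1 and x2 hold values, row y holds weights), so one possible reading is to resample column indices instead of rows. The following is a minimal sketch under that assumption; the matrix conversion, the `bootMeans` name, and the fixed seed are illustrative choices, not part of the original code.

```r
set.seed(1)  # assumption: seed fixed only so the sketch is reproducible

# Data from the question: rows x1 and x2 are values, row y holds the weights
x1 <- 1:5
x2 <- 6:10
y  <- runif(5)
z  <- as.data.frame(rbind(x1, x2, y))

m <- as.matrix(z)  # numeric matrix with row names "x1", "x2", "y"

nBoots <- 10
# One row per bootstrap replicate, one column per variable
bootMeans <- matrix(NA_real_, nrow = nBoots, ncol = 2,
                    dimnames = list(NULL, c("x1", "x2")))

for (i in seq_len(nBoots)) {
  # Resample the observations (columns) with replacement
  ind <- sample(ncol(m), size = ncol(m), replace = TRUE)
  w <- m["y", ind]  # weights travel with their observations
  bootMeans[i, "x1"] <- weighted.mean(m["x1", ind], w)
  bootMeans[i, "x2"] <- weighted.mean(m["x2", ind], w)
}

bootMeans            # weighted mean of x1 and x2 in each replicate
colMeans(bootMeans)  # average of the bootstrap weighted means
```

Keeping the same `ind` for the values and the weights ensures each resampled observation keeps its own weight. If the observations are actually stored as rows, the same pattern applies with `nrow()` and row indexing.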
