I'm experimenting with the boot package in R, and it doesn't do what I want: it just returns the same value. I have read the documentation and checked the first example, which looks the same as my code, but I don't get any bootstrapped results; only the original value is returned. My example:

library(boot)

dat      <- data.frame(A = rnorm(100), B = runif(100))

booty    <- function(x, ind) sum(x$A)/sum(x$B)

boot_out <- boot(dat, booty, R  = 50, stype = "w")
Michael Petch
MLEN

2 Answers

As I said in the comment, you must use the index (here, ind) to subset the dataset passed to the bootstrap statistic function: sum(x$A[ind])/sum(x$B[ind]).
The full function then becomes

booty    <- function(x, ind) sum(x$A[ind])/sum(x$B[ind])

Two notes.
First, stype = "i" tells boot that the second argument to booty is a vector of indices, which I believe is what you want, rather than a vector of weights (stype = "w").
Second, to get reproducible results, set the RNG seed before calling boot or any other RNG function, for example:

set.seed(4237)

dat      <- data.frame(A = rnorm(100), B = runif(100))

boot_out <- boot(dat, booty, R  = 50, stype = "i")
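
One way to confirm the fix is to check that the replicates stored in the t component of the result actually vary. The following is a minimal sketch, assuming the boot package is installed; booty_bad, out_bad and out_good are names introduced here for illustration, with booty_bad standing in for the version from the question.

```r
library(boot)

set.seed(4237)
dat <- data.frame(A = rnorm(100), B = runif(100))

# Version from the question: ignores the second argument, so every
# replicate is computed on the full, unresampled data.
booty_bad <- function(x, ind) sum(x$A) / sum(x$B)

# Corrected version: subsets both columns by the resampled indices.
booty <- function(x, ind) sum(x$A[ind]) / sum(x$B[ind])

out_bad  <- boot(dat, booty_bad, R = 50, stype = "i")
out_good <- boot(dat, booty,     R = 50, stype = "i")

length(unique(out_bad$t))   # number of distinct replicate values (uncorrected)
length(unique(out_good$t))  # number of distinct replicate values (corrected)
sd(out_good$t)              # bootstrap estimate of the standard error
```

If the first count is 1, the statistic function is not using the resampled indices.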
Rui Barradas

Just to add on to Rui's answer, it is possible to use stype = "w" to perform a standard bootstrap. It may be easier to first look at what stype = "f" does.

Setting stype = "f" first resamples the indices of the original dataset (just like the usual bootstrap with stype = "i"), then tallies up how many times each unit is selected, and turns that tally into a frequency weight. For example, if a unit was sampled twice in the bootstrap sample, that unit would get a weight of 2. A unit not sampled would get a weight of 0. You can use these weights to calculate weighted statistics, which are then the output of the bootstrap procedure. These give equivalent results to subsetting the data based on the sampled indices.
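
The tallying step described above can be reproduced by hand in base R. This is only an illustrative sketch of a single resample, not what boot does internally; the names ind, f, stat_index and stat_freq are introduced here.

```r
set.seed(4237)
dat <- data.frame(A = rnorm(100), B = runif(100))
n <- nrow(dat)

# One bootstrap resample of row indices, drawn with replacement.
ind <- sample(n, replace = TRUE)

# Frequency weights: how many times each row was drawn (0 if never drawn).
f <- tabulate(ind, nbins = n)

# The same ratio statistic, computed by subsetting and by weighting.
stat_index <- sum(dat$A[ind]) / sum(dat$B[ind])
stat_freq  <- sum(dat$A * f)  / sum(dat$B * f)

# Compare the two values up to floating-point error.
all.equal(stat_index, stat_freq)
```

The weights always sum to the sample size, because every draw lands on exactly one row.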

So, adapting your code, using stype = "f" would look like the following:

dat      <- data.frame(A = rnorm(100), B = runif(100))

booty    <- function(x, w) weighted.mean(x$A, w)/weighted.mean(x$B, w)

boot_out <- boot(dat, booty, R  = 50, stype = "f")

We have to adjust the statistic function (here called booty) to accept weights and then use those weights in computing the statistic of interest. If you were running a regression in each sample, you would supply the weights to the weights argument of the regression function.
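
For the regression case, a rough sketch might look like the following. The model A ~ B and the names coef_fun and boot_lm are hypothetical, chosen only to reuse the example data. Note that lm() treats zero-weight rows as dropped, so the coefficient estimates should match a fit on the resampled rows, although lm()'s own standard errors are not meant to be read as frequency-weighted.

```r
library(boot)

set.seed(4237)
dat <- data.frame(A = rnorm(100), B = runif(100))

# Hypothetical statistic: coefficients of A regressed on B, with the
# frequency weights passed to the weights argument of lm().
coef_fun <- function(x, f) {
  x$f <- f  # store the weights in the data frame so lm() can find them
  coef(lm(A ~ B, data = x, weights = f))
}

boot_lm <- boot(dat, coef_fun, R = 50, stype = "f")

boot_lm$t0               # coefficients on the original data (all weights 1)
apply(boot_lm$t, 2, sd)  # bootstrap standard errors, one per coefficient
```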

Setting stype = "w" is identical to using stype = "f" except that the weights are divided by the sample size. This is just a convenience, so that, e.g., weighted means can be represented as weighted sums instead. This is exactly what is done in the documentation example for boot::boot().
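
To see the three options side by side, here is a sketch that runs the same ratio statistic under each stype. It assumes that boot draws the same resamples for a given seed regardless of stype, which is how it appears to behave; the helper run and the names booty_i, booty_f and booty_w are introduced here for illustration.

```r
library(boot)

set.seed(4237)
dat <- data.frame(A = rnorm(100), B = runif(100))

# The same ratio statistic written for indices, frequencies, and weights.
booty_i <- function(x, ind) sum(x$A[ind]) / sum(x$B[ind])
booty_f <- function(x, f) weighted.mean(x$A, f) / weighted.mean(x$B, f)
booty_w <- function(x, w) sum(x$A * w) / sum(x$B * w)

# Hypothetical helper: reset the seed so each call starts from the same
# RNG state (assumption: the resamples then coincide across stype values).
run <- function(stat, stype) {
  set.seed(1)
  boot(dat, stat, R = 50, stype = stype)
}

out_i <- run(booty_i, "i")
out_f <- run(booty_f, "f")
out_w <- run(booty_w, "w")

# Compare the replicates up to floating-point error.
all.equal(out_i$t, out_f$t)
all.equal(out_i$t, out_w$t)
```

Because the statistic is a ratio, dividing the weights by the sample size cancels out, which is why the "f" and "w" versions should agree here.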

Noah