Standard error and bias zero with bootstrapping

Question

I want to take my dataset bodyfat_trimmed and use bootstrapping to retrieve the mean and the standard errors. However, I seem to be using the same data all the time and therefore get zero standard error and bias. How can I solve this?

bsfunc <- function(data) {
    set.seed(1)
    x <- model.matrix(reduced_BIC_fit)[, -1]
    y <- data$density
    bootdata <- sample(1:nrow(x), nrow(x)/2)
    x.train <- x[bootdata, ]
    y.train <- y[bootdata]
    bootframe <- data.frame(bodyfat_trimmed[train, ])
    fit <- lm(density ~ age + abdomen + wrist, data = bootframe)
    stats <- coef(summary(fit))[, "Estimate"]
    return(stats)}
strap <- boot(data = bodyfat_trimmed, sim = "parametric", statistic =    bsfunc, R=1000)
strap

Output:

PARAMETRIC BOOTSTRAP


Call:
boot(data = bodyfat_trimmed, statistic = bsfunc, R = 1000, sim =  "parametric")


Bootstrap Statistics :
         original  bias    std. error
t1*  1.1360858253       0           0
t2* -0.0000889957       0           0
t3* -0.0018446625       0           0
t4*  0.0050609837       0           0

Your function has `set.seed(1)` in the first line. What happens if you take that out? — Stephen Henderson, Mar 01 '18 at 07:07

Stephen Henderson · Answer 1 · 2018-03-01T07:28:20.837


If the seed is set inside the function, every call resets the random number generator to the same state, so `sample` returns the same draw each time:

bsfunc<-function(){set.seed(1); sample(1:10,1)}
bsfunc()
[1] 3
bsfunc()
[1] 3
bsfunc()
[1] 3

PS: Your bsfunc is also misconceived. As written, train (in bootframe <- data.frame(bodyfat_trimmed[train, ])) is not defined inside the function, so the model never uses the bootdata indices you sample. And normally the whole point of boot is to do the bootstrap resampling for you, whilst bsfunc should just compute a straight statistic.
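
To make the last point concrete, here is a minimal sketch of letting boot do the resampling. It assumes the default sim = "ordinary", where boot calls the statistic with the data and a vector of resampled row indices. bodyfat_sim is a simulated, hypothetical stand-in for bodyfat_trimmed (I don't have the real data), and the coefficients used to simulate it are made up.

```r
library(boot)

set.seed(1)  # seed once, outside the statistic function

# Hypothetical stand-in for bodyfat_trimmed (simulated, not the real data)
n <- 100
bodyfat_sim <- data.frame(
  age     = rnorm(n, 45, 12),
  abdomen = rnorm(n, 92, 10),
  wrist   = rnorm(n, 18, 1)
)
bodyfat_sim$density <- 1.13 - 0.0001 * bodyfat_sim$age -
  0.0018 * bodyfat_sim$abdomen + 0.005 * bodyfat_sim$wrist +
  rnorm(n, sd = 0.01)

# With sim = "ordinary", boot() supplies the resampled row indices;
# the statistic only refits the model on those rows.
bsfunc <- function(data, indices) {
  fit <- lm(density ~ age + abdomen + wrist, data = data[indices, ])
  coef(fit)
}

strap <- boot(data = bodyfat_sim, statistic = bsfunc, R = 1000)
strap
```

Each replicate now sees a different resample, so the bias and std. error columns should no longer be zero.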

edited Mar 01 '18 at 07:28

answered Mar 01 '18 at 07:16

Stephen Henderson



Thanks a lot, this brings clarity! So, putting the seed outside and using the bsfunc just for the statistic and fit, I still produce the same result over and over. `set.seed(1) bsfunc <- function(data) { fit <- lm(density ~ age + abdomen + wrist, data = data) stats <- coef(summary(fit))[, "Estimate"] return(stats)} strap <- boot(data = bodyfat_trimmed, sim = "parametric", statistic = bsfunc, R=1000) strap` I don't quite understand how the boot() function samples. – Johannes Persson Mar 01 '18 at 18:56
I'm having the same problem... Did you ever get a fix on this issue? – Elliot Oct 26 '20 at 04:22
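
On the identical results with sim = "parametric": in that mode boot does not resample rows. It calls ran.gen(data, mle) to generate each replicate, and the default ran.gen returns the data unchanged, so every replicate refits the same model. A minimal sketch of supplying your own generator follows. It assumes normal errors around the fitted values, and bodyfat_sim, gen and stat are hypothetical names on simulated data, not the real bodyfat_trimmed.

```r
library(boot)

set.seed(1)

# Hypothetical stand-in for bodyfat_trimmed (simulated, not the real data)
n <- 100
bodyfat_sim <- data.frame(
  age     = rnorm(n, 45, 12),
  abdomen = rnorm(n, 92, 10),
  wrist   = rnorm(n, 18, 1)
)
bodyfat_sim$density <- 1.13 - 0.0001 * bodyfat_sim$age -
  0.0018 * bodyfat_sim$abdomen + 0.005 * bodyfat_sim$wrist +
  rnorm(n, sd = 0.01)

fit0 <- lm(density ~ age + abdomen + wrist, data = bodyfat_sim)

# ran.gen gets the observed data and the `mle` object passed to boot();
# here it simulates a new response from the fitted model (assumed normal errors).
gen <- function(data, mle) {
  data$density <- mle$fitted + rnorm(nrow(data), sd = mle$sigma)
  data
}

# In parametric mode the statistic takes only the (generated) data.
stat <- function(data) {
  coef(lm(density ~ age + abdomen + wrist, data = data))
}

strap_par <- boot(data = bodyfat_sim, statistic = stat, R = 1000,
                  sim = "parametric", ran.gen = gen,
                  mle = list(fitted = fitted(fit0), sigma = sigma(fit0)))
strap_par
```

If you just want the usual case-resampling bootstrap, dropping sim = "parametric" and giving the statistic a second indices argument is probably the simpler route.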
