
I have two datasets of different sizes, so I am attempting to use the bootstrap (`boot`) function. I have run the code below, but I am not sure how to interpret the results. Any help would be appreciated.

    # bootstrapping with 1000 replications
    results <- boot(data=Timedata, statistic=rsq,
                    R=1000, formula=Max_Height~Age*Time_period)

    # view results
    results
    plot(results)

    ORDINARY NONPARAMETRIC BOOTSTRAP

    Call:
    boot(data = Timedata, statistic = rsq, R = 1000, formula = Max_Height ~ 
        Age * Time_period)

    Bootstrap Statistics :
         original        bias    std. error
    t1* 0.7439122 -0.0003189452  0.06257858

    # get 95% confidence interval
    boot.ci(results, type="bca")
    ?boot.ci

    BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
    Based on 1000 bootstrap replicates

    CALL : 
    boot.ci(boot.out = results, type = "bca")

    Intervals : 
    Level       BCa          
    95%   ( 0.5794,  0.8343 )  
    Calculations and Intervals on Original Scale

Michael Petch
JaneEarland

1 Answer


It's difficult to interpret your example without context or a reproducible example, so I'll describe in detail how to read the output using a simplified `boot` example.

Using the standard `mtcars` data, suppose we want to bootstrap the mean of the `mpg` column. That is, we already have the sample mean, but we want to see how it varies across bootstrap resamples.


    library(boot)

    set.seed(231241)
    mean_df <- function(dataset, i) mean(dataset[i, "mpg"])
    res <- boot(mtcars, mean_df, 999)
    res
    #> 
    #> ORDINARY NONPARAMETRIC BOOTSTRAP
    #> 
    #> 
    #> Call:
    #> boot(data = mtcars, statistic = mean_df, R = 999)
    #> 
    #> 
    #> Bootstrap Statistics :
    #>     original      bias    std. error
    #> t1* 20.09062 -0.01391391    1.060334

In the output of `boot`, the `original` column (20.09062) is the statistic computed on the original data, that is, the mean of `mpg` without any resampling. We can check that with:


    mean(mtcars$mpg)
    #> [1] 20.09062

The `bias` column (-0.01391391) is the difference between the mean of the bootstrap replicates and the sample mean from above. We can check that with:


    # Bias is the mean of the bootstrap replicates
    # minus the statistic computed on the original data
    mean(res$t) - res$t0
    #> [1] -0.01391391

Since the bootstrap draws R resamples (with replacement) from the original data and recomputes the statistic on each one, `res$t` holds R bootstrapped means. The standard error reported here is just the standard deviation of those bootstrapped means. We can check that with:


    # And the standard error is the standard deviation
    # of the bootstrapped sample
    sd(res$t)
    #> [1] 1.060334
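
To make the three columns less of a black box, here is a minimal hand-rolled sketch of the same idea without `boot`. The seed and the variable names (`boot_means`, `manual_bias`, `manual_se`) are my own choices for illustration, and because the resampling indices are generated differently from `boot`, the numbers will be close to, but not identical to, the output above.

```r
# Hand-rolled nonparametric bootstrap of the mean of mpg (illustrative sketch)
set.seed(1)                      # arbitrary seed, only for reproducibility
n <- nrow(mtcars)
R <- 999                         # number of resamples, as in the example above

# For each replicate: draw n row indices with replacement, recompute the mean
boot_means <- replicate(R, {
  idx <- sample.int(n, n, replace = TRUE)
  mean(mtcars$mpg[idx])
})

t0 <- mean(mtcars$mpg)                 # "original": statistic on the full data
manual_bias <- mean(boot_means) - t0   # "bias"
manual_se <- sd(boot_means)            # "std. error"

c(original = t0, bias = manual_bias, std.error = manual_se)
```

The bias should come out small relative to the standard error, which is what you would expect for a sample mean.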

You can probably extrapolate this to your example. Your `original` value of 0.7439122 is presumably the R-squared of your model fitted on the full data (I can't see your `rsq` function, so this is a guess). The `bias` of -0.0003189452 is the mean of your bootstrapped R-squared values minus that original R-squared. The standard error, 0.06257858, is just the standard deviation of all your bootstrapped R-squared values.
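
Since your `rsq` function isn't shown, here is a sketch of what such a statistic might look like, run on `mtcars` instead of your `Timedata`. The function body and the formula `mpg ~ wt + disp` are assumptions for illustration only; the extra `formula` argument is passed through `boot`'s `...` to the statistic.

```r
library(boot)

# Hypothetical statistic: R-squared of a linear model refitted on each resample.
# `formula` reaches this function through boot()'s `...` argument.
rsq <- function(data, indices, formula) {
  d <- data[indices, ]                 # rows selected for this bootstrap replicate
  fit <- lm(formula, data = d)
  summary(fit)$r.squared
}

set.seed(42)                           # arbitrary seed
res_rsq <- boot(data = mtcars, statistic = rsq, R = 1000,
                formula = mpg ~ wt + disp)

res_rsq$t0                        # "original": R-squared on the full data
mean(res_rsq$t) - res_rsq$t0      # "bias"
sd(res_rsq$t)                     # "std. error"
```

The last three lines reproduce the three columns of the printed `boot` output, just as in the `mpg` example.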

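Finally, `boot.ci` computes confidence intervals for your statistic from the bootstrap replicates. The normal interval uses the bootstrap bias and standard error described above, while the basic, percentile, and BCa intervals are built from quantiles of the replicates (BCa additionally adjusts for bias and skewness).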


    res_ci <- boot.ci(res)
    #> Warning in boot.ci(res): bootstrap variances needed for studentized intervals

    res_ci
    #> BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
    #> Based on 999 bootstrap replicates
    #> 
    #> CALL : 
    #> boot.ci(boot.out = res)
    #> 
    #> Intervals : 
    #> Level      Normal              Basic         
    #> 95%   (18.03, 22.18 )   (17.95, 22.16 )  
    #> 
    #> Level     Percentile            BCa          
    #> 95%   (18.02, 22.23 )   (18.08, 22.31 )  
    #> Calculations and Intervals on Original Scale
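
To see where these intervals come from, the sketch below rebuilds the normal, percentile, and basic intervals by hand from `res$t`. It repeats the `boot` call so that it runs on its own; the variable names are mine. Treat the percentile and basic results as approximations: as far as I know, `boot.ci` interpolates between order statistics in its own way, so `quantile()` may differ slightly in the last digits.

```r
library(boot)

set.seed(231241)
mean_df <- function(dataset, i) mean(dataset[i, "mpg"])
res <- boot(mtcars, mean_df, 999)

bias <- mean(res$t) - res$t0
se <- sd(res$t)
z <- qnorm(0.975)                 # two-sided 95% normal quantile

# Normal interval: bias-corrected estimate plus/minus z * bootstrap std. error
normal_ci <- c(res$t0 - bias - z * se, res$t0 - bias + z * se)

# Percentile interval: 2.5% and 97.5% quantiles of the bootstrap replicates
percentile_ci <- unname(quantile(res$t, c(0.025, 0.975)))

# Basic interval: the percentile interval reflected around the original estimate
basic_ci <- 2 * res$t0 - rev(percentile_ci)

rbind(normal = normal_ci, percentile = percentile_ci, basic = basic_ci)
```

The BCa interval is left out here because it needs an extra acceleration estimate that is more involved to compute by hand.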

cimentadaj