5

If I want to use the boot() function from R's boot package to calculate the significance of the Pearson correlation coefficient between two vectors, should I do it like this:

boot(re1, cor, R = 1000)

where re1 is a two column matrix for these two observation vectors? I can't seem to get this right because cor of these vectors is 0.8, but the above function returns -0.2 as t0.

Siguza
Fedja Blagojevic
    [R FAQ: How can I generate bootstrap statistics in R?](http://www.ats.ucla.edu/stat/r/faq/boot.htm) + remember that a null hypothesis test is significant iff the corresponding CI does not contain the value of the test statistic under the null. – caracal Oct 20 '11 at 10:24

1 Answer

7

@caracal already answered your question in his comment, but to emphasize the general idea behind bootstrapping in R: when using boot(), you need a data structure (usually a matrix or data frame) that can be resampled by row. The statistic is computed by a function that receives this data as its first argument and a vector of resampled row indices as its second, and returns the statistic of interest for that resample. You then call boot(), which applies this function to R bootstrap replicates and collects the results in a structured object. Those results can in turn be summarized with boot.ci().
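To see why passing cor directly gives an unexpected t0, here is a minimal sketch. It assumes the default index-based resampling, where boot() calls the statistic as statistic(data, indices). The matrix re1 below is simulated stand-in data, not the asker's, and cor_stat is a hypothetical name for the wrapper.

```r
library(boot)

set.seed(1)                                   # arbitrary seed, for reproducibility only
x   <- rnorm(50)
re1 <- cbind(x = x, y = x + rnorm(50, sd = 0.5))   # made-up two-column matrix

# With statistic = cor, the observed statistic is effectively cor(re1, 1:n):
# each column correlated with the row index, not the columns with each other.
t0_wrong <- cor(re1, seq_len(nrow(re1)))

# Wrapper that resamples rows first, then correlates the two columns.
cor_stat <- function(data, idx) cor(data[idx, 1], data[idx, 2])
good <- boot(re1, cor_stat, R = 200)

good$t0    # the plain sample correlation of the two columns
t0_wrong   # a 2 x 1 matrix unrelated to that correlation
```

The wrapper is the only change needed: once the function subsets the rows with the supplied indices, t0 matches the correlation computed on the original data.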

Here are two working examples using the low birth weight study (the birthwt data) in the MASS package.

library(boot)
require(MASS)
data(birthwt)
# compute CIs for correlation between mother's weight and birth weight
cor.boot <- function(data, k) cor(data[k,])[1,2]
cor.res <- boot(data=with(birthwt, cbind(lwt, bwt)), 
                statistic=cor.boot, R=500)
cor.res
boot.ci(cor.res, type="bca")
# compute CI for a particular regression coefficient, e.g. bwt ~ smoke + ht
fm <- bwt ~ smoke + ht
reg.boot <- function(formula, data, k) coef(lm(formula, data[k,]))
reg.res <- boot(data=birthwt, statistic=reg.boot, 
                R=500, formula=fm)
boot.ci(reg.res, type="bca", index=2) # smoke
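
As a rough sketch of how the results can be read, the snippet below recomputes the bias and standard error that boot prints, and checks whether the BCa interval excludes 0, following the rule in @caracal's comment. The seed, R = 2000, and the variable names bias, se, lims and significant are choices made for this illustration.

```r
library(boot)
library(MASS)

set.seed(42)   # arbitrary seed, for reproducibility only
cor.boot <- function(data, k) cor(data[k, ])[1, 2]
cor.res  <- boot(data = with(birthwt, cbind(lwt, bwt)),
                 statistic = cor.boot, R = 2000)

# "bias" in the printed output: mean of the replicates minus the observed statistic
bias <- mean(cor.res$t) - cor.res$t0
# "std. error": standard deviation of the bootstrap replicates
se <- sd(cor.res$t)

# BCa interval; columns 4 and 5 of the bca component hold the two limits
ci   <- boot.ci(cor.res, type = "bca")
lims <- ci$bca[4:5]

# The correlation differs significantly from 0 (at the 5% level here)
# if the 95% interval does not contain 0.
significant <- lims[1] > 0 || lims[2] < 0
```

A bias close to zero suggests the bootstrap distribution is centered near the observed correlation; a large one is a hint that the estimator or the resampling scheme deserves a closer look.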
chl
    I tried this and it worked with `cor(vec[i,])[1,2]` type function, but now I am not sure how to interpret results, and what bias in boot output means. Thank you. – Fedja Blagojevic Oct 30 '11 at 18:42