I would like to use the boot() and boot.ci() functions from library("boot") on a large data set (~20,000) with type="bca".
If R (the number of bootstraps) is too small (I have tried 1,000 to 10,000), then I get the following error:
Error in bca.ci(boot.out, conf, index[1L], L = L, t = t.o, t0 = t0.o, :
estimated adjustment 'a' is NA
However, if I run 15,000 to 20,000 or more bootstraps, then I get:
Cannot allocate vector size # GB
(usually ranging from 1.7 to 6.4 GB, depending on the dataset and the number of bootstraps).
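For scale, here is a rough back-of-the-envelope check of those sizes. It assumes (this is a guess, not something the error message states) that the failing allocation is a single dense matrix of doubles with one row per bootstrap replicate and one column per observation. The names R_reps, n_obs and gib are made up for the illustration.

```r
# Assumption: the allocation is one dense R-by-n matrix of 8-byte doubles.
R_reps <- 15000            # number of bootstrap replicates
n_obs <- 20000             # approximate number of observations
bytes_per_double <- 8

# Size of such a matrix in GiB
gib <- R_reps * n_obs * bytes_per_double / 2^30
round(gib, 2)
```

Under that assumption a single such matrix is a little over 2 GiB, which is in the range of the sizes reported, and any temporary copy made during the computation would need the same amount again.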
I read that I need more RAM, but I have a Windows desktop with 16 GB of RAM and I'm using 64-bit R, so my computer should be able to handle this.
How can I use bootstrapping methods on larger datasets if too few bootstraps cannot produce estimates and enough bootstraps run out of memory?
My code:
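multRegress <- function(mydata) {
  # Column 1 is the outcome; the remaining columns are the predictors.
  numVar <- NCOL(mydata)
  Variables <- names(mydata)[2:numVar]
  # Work from the correlation matrix of outcome and predictors.
  corMat <- cor(mydata, use = "pairwise.complete.obs")
  RXX <- corMat[2:numVar, 2:numVar]  # predictor intercorrelations
  RXY <- corMat[2:numVar, 1]         # predictor-outcome correlations
  # Symmetric square root of RXX from its eigendecomposition.
  RXX.eigen <- eigen(RXX)
  D <- diag(RXX.eigen$values)
  delta <- sqrt(D)
  lambda <- RXX.eigen$vectors %*% delta %*% t(RXX.eigen$vectors)
  lambdasq <- lambda^2
  beta <- solve(lambda) %*% RXY
  rsquare <- sum(beta^2)
  RawWgt <- lambdasq %*% beta^2
  import <- (RawWgt / rsquare) * 100
  # Return the weights directly instead of assigning globals with <<-.
  data.frame(Variables, Raw.RelWeight = RawWgt, Rescaled.RelWeight = import)
}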
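library(boot)

# function passed to boot: recompute the raw relative weights on a resample
multBootstrap <- function(mydata, indices) {
  mydata <- mydata[indices, ]
  multWeights <- multRegress(mydata)
  return(multWeights$Raw.RelWeight)
}

# call boot with 15,000 replicates, then request a BCa interval
multBoot <- boot(thedata, multBootstrap, R = 15000)
multci <- boot.ci(multBoot, conf = 0.95, type = "bca")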
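
To see how memory could scale with both R and the number of rows, here is a scaled-down sketch on synthetic data. It uses boot.array() from the boot package to pull out the resampling frequency array of a small bootstrap. As far as I understand, the BCa interval relies on an array like this when no influence values are supplied through the L argument, but that link is my assumption rather than something I have confirmed. The data x, the statistic meanStat and the sizes (50 observations, 200 replicates) are made up for the illustration.

```r
library(boot)

set.seed(1)
x <- rnorm(50)                                 # synthetic data, n = 50

# Hypothetical statistic for the illustration: the resample mean
meanStat <- function(d, i) mean(d[i])

smallBoot <- boot(x, meanStat, R = 200)

# Frequency array: one row per replicate, one column per observation
freq <- boot.array(smallBoot)
dim(freq)

# Memory used by this array; it grows with R times n
print(object.size(freq), units = "Kb")
```

Each row of freq counts how often each observation was drawn in that replicate, so the array has R rows and n columns. With R = 15000 and n near 20,000 the same array would hold about 300 million entries.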