I know there are many posts about the memory consumption of mclapply, but I'm still hoping there is something that can help in my case.

I'm fitting a random forest model to a response vector y of length ~600 and a ~600 by 60,000 matrix of variables X:

library(randomForest)
fit <- randomForest(x=X,y=y)

I then want to compare that fit to a random fit. To do that, I predict from the row-permuted X 1,000 times in parallel:

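library(parallel)
library(permute)  # assumed source of shuffle(); sample(nrow(X)) is a base-R alternative
ncores <- 4       # placeholder value; set to the number of worker processes to use
set.seed(1)
random.list <- mclapply(1:1000, function(f) {
  idx <- shuffle(nrow(X))  # random permutation of the row indices
  # return the predictions for the row-permuted matrix
  predict(object = fit, newdata = X[idx, ], type = "response")
}, mc.cores = ncores)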

Unfortunately, this is too memory intensive (it requires more than 100 GB), which makes it impractical.

BTW, I'm running on Linux.

Any suggestions?

svick
dan

1 Answer

It seems that mclapply2 from the snpEnrichment package is a reasonable, low-effort solution.
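
For comparison, here is a minimal base-R sketch of some general tactics that may lower peak memory with plain mclapply: use fewer workers, run the permutations in batches, and return one small summary per permutation instead of the full prediction vector. Everything in it is an assumption made for illustration: the toy X and y, the lm fit standing in for the randomForest fit, the batch size, the core count, and the correlation used as the summary statistic. It is not a tested fix for the 100 GB case.

```r
library(parallel)

# Toy stand-ins (assumptions): a small X and y, and an lm fit in place of
# the 600 x 60,000 matrix and the randomForest fit from the question.
set.seed(1)
X <- as.data.frame(matrix(rnorm(50 * 5), nrow = 50))
y <- rnorm(50)
fit <- lm(y ~ ., data = cbind(y = y, X))

n.perm <- 100   # 1000 in the question
batch  <- 10    # hypothetical number of permutations per batch
ncores <- if (.Platform$OS.type == "unix") 2L else 1L  # forking is Unix-only

RNGkind("L'Ecuyer-CMRG")  # gives mclapply workers reproducible RNG streams
set.seed(1)

# Hypothetical summary: correlation between y and the permuted predictions.
one.perm <- function(i) {
  idx <- sample(nrow(X))  # random permutation of the row indices
  random.y <- predict(fit, newdata = X[idx, , drop = FALSE])
  cor(y, random.y)        # return a single number, not the whole vector
}

# Run the permutations batch by batch so only one batch is in flight at a time.
starts <- seq(1, n.perm, by = batch)
random.stat <- unlist(lapply(starts, function(s) {
  ids <- s:min(s + batch - 1, n.perm)
  res <- unlist(mclapply(ids, one.perm, mc.cores = ncores))
  gc()  # ask R to free the batch's temporaries in the parent process
  res
}))
```

Each worker still builds its own copy of X[idx, ], so the number of workers bounds how many permuted copies of X exist at once; lowering ncores trades run time for memory.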

dan