Here is what I am trying to do using the foreach package. I have a data set with 600 rows and 58,000 columns, with lots of missing values.
We need to impute the missing values using a package called "missForest", which does not run in parallel, so it takes too much time to process this data set all at once.
So I am thinking of dividing the data into 7 data sets (I have 7 cores), each with the same number of rows (my lines) but a different subset of the columns (markers),
and then using %dopar% to pass the data sets to missForest in parallel.
I do not see how to divide the data into smaller data sets, pass those data sets to missForest, and then recombine the outputs.
I would really appreciate it if you could show me how.
Here is a small example, using data from the BLR package, that demonstrates my problem:
library(BLR)
library(missForest)
data(wheat)
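X2 <- prodNA(X, 0.1)   ## introduce 10% missing values into the marker matrix X
dim(X2)                ## I need to divide X2 into 7 data frames (ii), split by columns
X3 <- missForest(X2)
X3$ximp                ## imputed data; here I need to combine the ii data frames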
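
To make the idea concrete, here is a minimal sketch of the split / impute / recombine steps, assuming the foreach and doParallel packages are installed. The helper name impute_in_blocks and its arguments are placeholders of mine, not part of missForest. Note that each block is imputed using only its own columns as predictors, so the result will generally differ from a single missForest run on the full data.

```r
library(foreach)
library(doParallel)
library(missForest)
library(BLR)

# Hypothetical helper (not part of missForest): impute column blocks in parallel.
impute_in_blocks <- function(x, n_blocks, ...) {
  # Split the column indices into n_blocks roughly equal, contiguous groups.
  col_blocks <- parallel::splitIndices(ncol(x), n_blocks)

  # Start one worker per block and register it as the foreach backend.
  cl <- makeCluster(n_blocks)
  registerDoParallel(cl)
  on.exit(stopCluster(cl))

  # Impute each block on its own worker; cbind puts the blocks back
  # together in their original column order (foreach combines in order by default).
  foreach(cols = col_blocks, .combine = cbind, .packages = "missForest") %dopar% {
    missForest(x[, cols, drop = FALSE], ...)$ximp
  }
}

data(wheat)                  # loads the marker matrix X
set.seed(1)
X2 <- prodNA(X, 0.1)         # introduce 10% missing values

X_imp <- impute_in_blocks(X2, n_blocks = 7)
dim(X_imp)                   # same dimensions as X2
```

Extra arguments such as ntree or maxiter are passed through to missForest, which may help keep each block's run time down. Depending on the installed version, missForest may also offer a parallelize argument that uses the registered foreach backend directly; I have not assumed that here.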