I am experimenting with parallel computing in R. I am trying to train a logistic ridge model, and my computer has 4 cores. I would like to split my data set equally into 4 pieces, use each core to train a model on its own piece of the training data, and collect the results from all cores into a single vector. The problem is that I have no clue how to do it. So far I have tried to parallelize with the foreach package, but each core sees the same training data. Here is the code with the foreach package (which doesn't split the data):
library(ridge)
library(parallel)
library(foreach)
library(iterators)  # provides icount()
num_of_cores <- detectCores()
mydata <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")
data_per_core <- floor(nrow(mydata) / num_of_cores)
result <- data.frame()
# Every iteration fits the model on the full data set, so nothing is split
r <- foreach(icount(4), .combine = cbind) %dopar% {
  result <- logisticRidge(admit ~ gre + gpa + rank, data = mydata)
  coefficients(result)
}
Any idea how to split the data into x chunks and train the models on those chunks in parallel?
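For illustration, one possible way to do the split is sketched below. It is only a sketch under stated assumptions, not a definitive solution: it uses base R's `parallel::parLapply` rather than foreach, swaps in `glm(family = binomial)` as a stand-in for `logisticRidge` so that it runs without extra packages, and generates hypothetical synthetic data with the same column names instead of downloading binary.csv. The names `chunk_id`, `chunks`, `fit_chunk`, `coef_list`, and `coef_matrix` are made up for this example.

```r
library(parallel)

# Hypothetical stand-in data with the same columns as binary.csv
set.seed(1)
n <- 400
mydata <- data.frame(
  gre  = round(rnorm(n, mean = 580, sd = 115)),
  gpa  = round(runif(n, min = 2.3, max = 4), 2),
  rank = sample(1:4, n, replace = TRUE)
)
lin_pred <- -3.5 + 0.002 * mydata$gre + 0.8 * mydata$gpa - 0.5 * mydata$rank
mydata$admit <- rbinom(n, size = 1, prob = plogis(lin_pred))

num_of_cores <- 4  # the question mentions 4 cores; detectCores() also works

# Randomly assign each row to one of num_of_cores groups of (nearly) equal size
chunk_id <- sample(rep(seq_len(num_of_cores), length.out = nrow(mydata)))
chunks <- split(mydata, chunk_id)  # list with one data frame per worker

# Fit the model on a single chunk and return its coefficients.
# glm() is a stand-in here; it is not the ridge model from the question.
fit_chunk <- function(chunk) {
  fit <- glm(admit ~ gre + gpa + rank, data = chunk, family = binomial)
  coef(fit)
}

# Each worker receives a different element of chunks
cl <- makeCluster(num_of_cores)
coef_list <- parLapply(cl, chunks, fit_chunk)
stopCluster(cl)

# One column of coefficients per chunk
coef_matrix <- do.call(cbind, coef_list)
print(coef_matrix)
```

The key step is `split()`, which turns the data frame into a list of chunks so that each parallel task gets a different piece. To stay with foreach, the same list could presumably be iterated with `foreach(chunk = chunks, .combine = cbind, .packages = "ridge") %dopar% { ... }` after registering a backend (for example with `doParallel::registerDoParallel`), fitting `logisticRidge(admit ~ gre + gpa + rank, data = chunk)` inside the loop body.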