This is the first time I am using parallel processing, and the question is mainly about my (probably poor) syntax.
I would like some help capturing the output of a large number of cv.glmnet runs, as I believe I have written cv_loop_run very inefficiently. This, combined with 10,000 lambdas per run, produces a massive object that eats all of my memory and causes a crash. What I actually need from each run is only its lambda.min and lambda.1se (1,000 of them each, not all 10,000 lambdas). So instead of capturing a 1,000 x 10,000 structure in cv_loop_run, I would end up with something only 1,000 rows long.
library(doParallel)
library(glmnet)

registerDoParallel(cl = 8, cores = 4)   # register the parallel backend for %dopar%

cv_loop_run <- foreach(r = 1:1000,
                       .packages = "glmnet",
                       .combine  = rbind,
                       .inorder  = FALSE) %dopar% {
  cv_run <- cv.glmnet(X_predictors, Y_dependent, nfolds = fld,
                      nlambda  = 10000,
                      alpha    = 1,      # for lasso
                      grouped  = FALSE,
                      parallel = TRUE)
  cv_run   # each iteration returns the full cv.glmnet object
}
l_min <- as.matrix(unlist(cv_loop_run[, 9,  drop = FALSE]))   # column 9 is lambda.min
l_1se <- as.matrix(unlist(cv_loop_run[, 10, drop = FALSE]))   # column 10 is lambda.1se
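
What I imagine I need is something along these lines (just a sketch, not tested at this scale, using the same X_predictors, Y_dependent and fld as above): each iteration hands back only its lambda.min and lambda.1se, so the combined result is a 1,000 x 2 matrix instead of 1,000 full cv.glmnet objects.

lambda_runs <- foreach(r = 1:1000,
                       .packages = "glmnet",
                       .combine  = rbind,
                       .inorder  = FALSE) %dopar% {
  cv_run <- cv.glmnet(X_predictors, Y_dependent, nfolds = fld,
                      nlambda  = 10000,
                      alpha    = 1,      # for lasso
                      grouped  = FALSE,
                      parallel = TRUE)
  c(lambda_min = cv_run$lambda.min, lambda_1se = cv_run$lambda.1se)   # keep only the two lambdas
}

l_min <- lambda_runs[, "lambda_min"]
l_1se <- lambda_runs[, "lambda_1se"]

Is something like this roughly the right way to do it, or is there a better pattern for collecting just these two values per run?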