Optimizing nested foreach dopar in R

Question

I'd like input on how the code below is structured, and whether it should be organized differently to run faster. Specifically, should I be using foreach and %dopar% differently in the nested loops? Currently the inner loop does the bulk of the work (ddply with 1-8 breakdown variables, each of which has 10-200 levels), and that is the part I run in parallel. I left out the code details for simplicity.

Any ideas? The code, as organized below, does work, but it takes a few hours on a 6-core, 41 GB machine. The dataset isn't that large (< 20k records).

for(m in 1:length(Predictors)){  # has up to three elements in the vector

  # construct the dataframe based on the specified predictor
  # subset the original dataframe based on the breakdown variables, outcome, predictor and covariates

  for(l in 1:nrow(pairwisematrixReduced)){  # this has 1-6 rows;subset based on correct comparison groups

    # some code here

    cl <- makeCluster(detectCores())  
    registerDoParallel(cl) 

    for (i in 1:nrow(subsetting_table)){  # this table has about 50 rows

      # this uses the columns specified by k in the glm; the prior columns will be used as breakdown variables
      # up to 10 covariates
      result[[length(result) + 1]] <- foreach(k = 11:17, .packages=c('plyr','reshape2', 'fastmatch')) %dopar% {   

        ddply( 
          df,
          b,   # vector of breakdown variables
          function(x) { 

           # run a GLM and manipulate the output

          }, .parallel = TRUE) # close function(x) and ddply
      } # close k loop -- set of covariates
    } # close i loop -- subsetting table
  } #close l -- group combinations
} # close m loop - this is the pairwise predictor matrix 

stopCluster(cl)
result <- unlist(result, recursive = FALSE)
tmp2<-do.call(rbind.fill, result)

I recommend running the following commands and reading about the difference between `%dopar%` and `%:%`: `vignette("foreach")` and `vignette("nested")`. — manotheshark, Dec 26 '16 at 20:49
I missed that `foreach`, but I'd still recommend reading the vignette I mentioned above. It will be easier to write both the `foreach` and the `ddply` as `foreach` statements, with one using `%dopar%` and the other `%:%`. The vignette discusses the benefits of parallelizing the inner versus the outer loop. You'll need to test your own code, as the result is data dependent, but it becomes as easy as swapping the `%dopar%` and `%:%` between the two `foreach` calls. — manotheshark, Dec 26 '16 at 20:58
You'll also likely see a speed boost by moving `makeCluster` and `registerDoParallel` outside of the `for` loops, as you are currently creating the cluster `m` * `l` times when once should be sufficient. — manotheshark, Dec 26 '16 at 21:12
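
The last comment's suggestion can be sketched roughly as follows. This is a minimal illustration, not the asker's actual code: the loop bounds, the worker count of 2, and the `m * l * k` placeholder are all assumptions standing in for the real predictor, group-comparison, and GLM steps.

```r
library(doParallel)  # also attaches foreach and parallel

# Create and register the cluster once, before any loop.
# Two workers keep the sketch light; the question uses detectCores().
cl <- makeCluster(2)
registerDoParallel(cl)

result <- list()
for (m in 1:2) {      # stand-in for the predictor loop
  for (l in 1:3) {    # stand-in for the group-comparison loop
    # Each pass reuses the workers that are already running.
    result[[length(result) + 1]] <- foreach(k = 1:4, .combine = c) %dopar% {
      m * l * k       # placeholder for the real per-task work
    }
  }
}

# Shut the workers down once, after all loops have finished.
stopCluster(cl)
```

With this layout the cost of starting worker processes is paid a single time rather than on every pass through the outer loops.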

score 5 · Answer 1 · edited Jun 20 '20 at 09:12

Copied out of vignette("nested")

3 Using %:% with %dopar%

When parallelizing nested for loops, there is always a question of which loop to parallelize. The standard advice is...

You are also using foreach with %dopar% together with ddply and .parallel=TRUE. On a six-core processor (presumably with hyper-threading), the foreach block would start 12 environments, and ddply would then start 12 environments within each of those, for 144 simultaneous environments. The foreach should be changed to %do% to be consistent with your question's text, which says the inner loop is the part running in parallel. Or, to make it cleaner, change both to foreach and use %dopar% for one loop and %:% for the other.
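
A minimal sketch of that last option, assuming two small index ranges and a placeholder `data.frame` in place of the real GLM step (the ranges, the worker count, and the `value` column are illustrative, not taken from the question):

```r
library(doParallel)  # also attaches foreach and parallel

cl <- makeCluster(2)  # assumed worker count for the sketch
registerDoParallel(cl)

# %:% merges the two loops into a single stream of (i, k) tasks,
# and %dopar% hands those tasks to the workers.
res <- foreach(i = 1:3, .combine = rbind) %:%
  foreach(k = 11:13, .combine = rbind) %dopar% {
    # Placeholder for fitting one model on one subset/covariate pair.
    data.frame(i = i, k = k, value = i * k)
  }

stopCluster(cl)
```

Because only one level of parallelism is in play here, the work inside the body would run serially on each worker, so the number of simultaneous R processes stays at the size of the cluster.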
