
I created my own R package with functions that use parallel functions like makeCluster, parLapply, etc. However, they are much slower inside the package than when used outside of it: cluster initialization and exporting objects take longer. Do you have any tips on how to use parallel functions properly inside my own R package?

Example of using parallel:

# create a cluster with one worker per core, leaving one core free
cl <- parallel::makeCluster(parallel::detectCores() - 1)
# export the whole `data` object to every worker
parallel::clusterExport(cl, varlist = c("data"), envir = environment())
# load the package on every worker
parallel::clusterCall(cl, function() library(myPackage))
# iterate over the row indices in parallel
data_res <- parallel::parLapply(cl, 1:nrow(data), function(i) {
    tryCatch(myFun(data[i, ]),
             error = function(err) data.table::data.table(row = i))
})
if (!is.null(cl)) {
  parallel::stopCluster(cl)
  cl <- NULL
}
gc()

Thanks

  • That's bad parallelization. The whole `data` has to be copied to the workers. You should use `split` and iterate over the resulting list. That way, only the subset of the data has to be transferred. – Roland Oct 27 '20 at 14:45
  • @Roland please, do you have an example? – Peter Laurinec Oct 27 '20 at 15:11
  • What @Roland is saying is that rather than iterating over each row, you could use `splits <- split(data, sort(1:nrow(data) %% (detectCores() - 1)))` to split your data. Doing this, and altering your function such that `data` is an input to the function, you could use `parLapply(cl, splits, myFun, other_args)`, iterating over the rows of each split within `myFun`. That way only a small subset of your data will be exported (see the sketch after these comments). This does not explain **why** it is slower within your package, however. Without more information it is not possible to say. – Oliver Oct 27 '20 at 15:18
    No, because your example is nonsense. Subsetting rows with an integer can't return an error. And the whole loop can be replaced by `split(data, seq_len(nrow(data)))`, which would be more efficient. You could do something like `parLapply(cl, split(data, seq_len(nrow(data))), function(subset) <...>)` if you want to do something with the subsets where performance profits from parallelization. Chunking as recommended by @Oliver can be a sensible option. – Roland Oct 27 '20 at 15:18
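
A minimal sketch of the chunked approach suggested in the comments, written as a package-style function. It assumes `myFun()` from the question can be rewritten to accept a whole subset of rows (a chunk) rather than a single row index; the wrapper name `my_parallel_fun()` is hypothetical, and `myPackage` stands for the questioner's own package.

# Sketch only, not tested against the real myPackage/myFun.
# One chunk per worker: only that chunk is serialized to the worker,
# instead of exporting the whole `data` object to every worker.
my_parallel_fun <- function(data, n_workers = max(1L, parallel::detectCores() - 1L)) {
  cl <- parallel::makeCluster(n_workers)
  on.exit(parallel::stopCluster(cl), add = TRUE)  # workers are always released

  # load the package once per worker instead of exporting individual objects
  parallel::clusterEvalQ(cl, library(myPackage))

  # split the rows into roughly equal, contiguous chunks (one per worker)
  chunks <- split(data, sort(seq_len(nrow(data)) %% n_workers))

  # myFun() is assumed to loop over (or vectorise across) the rows of its chunk
  res_list <- parallel::parLapply(cl, chunks, myFun)

  do.call(rbind, res_list)  # or data.table::rbindlist(res_list) for data.tables
}

With one chunk per worker, the serialization and scheduling overhead is paid roughly n_workers times rather than nrow(data) times, which is where most of the gain over the row-by-row version comes from.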
