
I have an R script that uses the doParallel package and the parallelized foreach function. I currently register my cluster with a variation of detectCores(), which works well because the machine I am using has 32 cores.

My question is: if I have access to HPC resources with multiple Linux machines, is it possible to detect the cores of multiple machines and use all of them in a single foreach call?

For example, if I submit my HPC job so that it uses two nodes, is it possible to get the detectCores() function to produce a value of 64 rather than 32?
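
For reference, the single-node setup described above might look something like this (a sketch only; the exact registration code is not shown in the question):

library("doParallel")

## Current approach: one worker per core on the local machine
cl <- makeCluster(detectCores())
registerDoParallel(cl)

res <- foreach(i = 1:64, .combine = c) %dopar% sqrt(i)

stopCluster(cl)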

amelcher
  • I'm not sure I understand exactly what you're looking for. Are you asking how to set up a cluster of R workers such that they are spread across multiple machines and some machines will have multiple workers running? That can be done using `parallel::makeCluster()`. Or are you asking how you can use `detectCores()` to query your different machines for how many cores they have? – HenrikB Oct 20 '17 at 03:48
  • Yes, I want to use multiple machines and have multiple workers on each machine. How can I do that with the `parallel::makeCluster()` function without doing some version of MPI (`Rmpi`, `pbdMPI`, etc.)? In my own experimenting, I have found that if I make a cluster with more cores specified than physically available on my machine, then the simulations slow down drastically. How do I do a `makeCluster()` for multiple machines and how do I make sure that I am using the correct number of cores on each machine? – amelcher Oct 21 '17 at 04:39
  • For example, `parallel::makeCluster(c("n1", "n1", "n1", "n2", "n3"))` will set up a (PSOCK) cluster with 3 workers on machine `n1`, 1 worker on `n2` and 1 worker on `n3`. – HenrikB Oct 21 '17 at 20:04
  • That makes sense. So theoretically I could use something like a `parallel::makeCluster(c(rep("n1", detectCores()), rep("n2", detectCores()), rep("n3", detectCores())), type = "PSOCK")` call? Assuming, of course, that each node has the same number of cores. Thanks a lot for the help. – amelcher Oct 23 '17 at 02:49
  • @amelcher Is there a way to use `makeCluster()` and then test how many cores are available in total on the cluster with `detectCores()`? I have a script that was originally written for one server with multiple cores (using `mclapply()`) and would like to use it on a High Performance Computing cluster with as few changes as necessary. – Ju Ko Apr 11 '19 at 14:30
  • @JuKo I believe that's exactly what the accepted answer does. You'll see in the author's `find_workers` function the line `ns <- clusterCall(cl, fun = detectCores)`. This line gets the number of cores on each node in the cluster. Sum those and you should get the total number of cores available. Keep in mind that I believe there is a maximum number of workers that you can use with the `makeCluster()` command. I don't remember exactly what it is. I think it's 128. It was a while ago when I was doing this work! – amelcher Apr 12 '19 at 15:17
  • I see, thanks for the clarification! If I understand correctly, it's not possible though to use cores from multiple nodes with `mclapply()` and I have to use `parLapply()` instead? – Ju Ko Apr 13 '19 at 00:09
  • I believe that's true: `mclapply()` uses only the cores of a single machine (see the sketch after these comments). – amelcher Apr 16 '19 at 16:51
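
Following up on the last two comments, here is a minimal sketch of the difference, with placeholder node names `n1` and `n2` and placeholder core counts. `mclapply()` forks processes locally, while `parLapply()` dispatches work to an explicit cluster that may span machines:

library("parallel")

## mclapply() forks and is therefore limited to the local machine
res_local <- mclapply(1:100, sqrt, mc.cores = detectCores())

## parLapply() uses an explicit (possibly multi-node) PSOCK cluster
cl <- makeCluster(c(rep("n1", 32), rep("n2", 32)))
res_multi <- parLapply(cl, 1:100, sqrt)
stopCluster(cl)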

1 Answer


An example summarizing the solution from the comments on the question:

library("parallel")

## Ask each unique node for its core count, then return a character
## vector with each node name repeated once per core on that node.
find_workers <- function(nodes) {
  nodes <- unique(nodes)
  cl <- makeCluster(nodes)   # one temporary worker per node
  on.exit(stopCluster(cl))

  ## clusterCall() returns a list with one core count per node
  ns <- unlist(clusterCall(cl, fun = detectCores))
  rep(nodes, times = ns)
}

workers <- find_workers(c("n1", "n2", "n3"))
cl <- makeCluster(workers)
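
To cover the original question's foreach use case, the resulting multi-node cluster can then be registered as the foreach backend. A minimal usage sketch, assuming the `doParallel` package is installed (this step is implied by the question rather than shown in the answer above):

library("doParallel")

registerDoParallel(cl)   # use the multi-node cluster as the foreach backend

## Iterations are now spread across all cores of n1, n2 and n3
res <- foreach(i = 1:100, .combine = c) %dopar% sqrt(i)

stopCluster(cl)
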
HenrikB