I am using an HPC Slurm cluster with Open MPI. The administrators would like everyone to use srun instead of mpirun. I have been using mpirun for years, and almost all online discussions of using R with MPI employ mpirun rather than srun. Are there any drawbacks to using srun instead of mpirun with R? Do I have to adjust my code? The administrators of the cluster I am working with unfortunately cannot tell me, as they have no experience with R.
Here is an mpirun example where I set up the parallel processes outside R and then attach to them using the Rmpi package (via doMPI).
Job script:
#!/bin/bash
#SBATCH --job-name=Example
#SBATCH --partition=Something
#SBATCH --nodes=2
#SBATCH --tasks-per-node=30
#SBATCH --time=24:00:00
#SBATCH --mail-user=example@example.com
#SBATCH --mail-type=END,FAIL
#SBATCH --export=NONE
module load openmpi
module load r/4.2.2
mpirun Rscript --no-save --no-restore $HOME/Example.R
R script:
if(!is.loaded("mpi_initialize")) {
library("doMPI")
}
cl <- startMPIcluster(comm = 0)
registerDoMPI(cl)
.Last <- function() {
if(is.loaded("mpi_initialize")) {
if(mpi.comm.size(1) > 0) {
mpi.close.Rslaves()
}
.Call("mpi_finalize")
}
}
foreach(x = something) %dopar% {
}
closeCluster(cl)
mpi.quit()
Instead of using foreach loops with doMPI, you could also employ the apply functions in Rmpi or snow.
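For illustration, a minimal sketch of the snow variant; it assumes the workers are spawned from within R (rather than attached to, as above), and the input something and the squaring function are only placeholders:

library(Rmpi)
library(snow)
# Spawn one worker per available MPI slot, keeping one slot for the master
cl <- makeMPIcluster(mpi.universe.size() - 1)
res <- clusterApply(cl, something, function(x) x^2)
stopCluster(cl)
mpi.quit()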
Some people prefer to have mpirun generate only a single process and then spawn slaves from within R. In that case, you call mpirun with the -np 1 option and call mpi.spawn.Rslaves in the R script.
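That variant would look roughly like this; the worker count and the squaring function are again only placeholders.

Job script line:
mpirun -np 1 Rscript --no-save --no-restore $HOME/Example.R

R script:
library(Rmpi)
# Spawn the workers from within R instead of attaching to pre-launched ones
mpi.spawn.Rslaves(nslaves = mpi.universe.size() - 1)
res <- mpi.parSapply(something, function(x) x^2)
mpi.close.Rslaves()
mpi.quit()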
Back when I started working with R on HPC clusters, srun did not work as intended with Rmpi, which is why I eventually settled on mpirun. So, what do I have to consider when replacing mpirun with srun? How does the code have to be changed? Feel free to comment not only on the Rmpi case but also on the pbdMPI use case.
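To make the question concrete: my naive plan would be to keep the R script unchanged and only swap the launcher line in the job script, along the lines of

srun --mpi=pmix Rscript --no-save --no-restore $HOME/Example.R

where the --mpi=pmix part is a guess on my side and presumably depends on how Slurm and Open MPI were built on our cluster. Is that all there is to it, or does startMPIcluster(comm = 0) also need to change?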