
I am using an HPC Slurm cluster with Open MPI. The administrators would like everyone to use srun instead of mpirun. I have been using mpirun for years, and nearly all online discussions of using R with MPI employ mpirun rather than srun. Are there any drawbacks to using srun instead of mpirun with R? Do I have to adjust my code? Unfortunately, the administrators of the cluster cannot tell me, as they have no experience with R.

Here is an mpirun example in which I set up the parallel processes outside R and then attach to them using doMPI on top of the Rmpi package.

Job script:

#!/bin/bash
#SBATCH --job-name=Example
#SBATCH --partition=Something
#SBATCH --nodes=2
#SBATCH --tasks-per-node=30
#SBATCH --time=24:00:00
#SBATCH --mail-user=example@example.com
#SBATCH --mail-type=END,FAIL
#SBATCH --export=NONE
 
module load openmpi
module load r/4.2.2
 
mpirun Rscript --no-save --no-restore $HOME/Example.R

R script:

if(!is.loaded("mpi_initialize")) {
  library("doMPI")
}

cl <- startMPIcluster(comm = 0)
registerDoMPI(cl)
 
.Last <- function() {
  if(is.loaded("mpi_initialize")) {
    if(mpi.comm.size(1) > 0) {
      mpi.close.Rslaves()
    }
    .Call("mpi_finalize")
  }
}

foreach(x = something) %dopar% {
 
}

closeCluster(cl)
mpi.quit()

Instead of using foreach loops with doMPI, you could also employ the apply functions in Rmpi or snow.
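For comparison, here is a minimal snow sketch of the same idea. It assumes the workers are spawned from within R rather than attached to launcher-started processes; the worker count and the toy computation are placeholders:

library(Rmpi)
library(snow)

# Spawn MPI workers from within R (the count of 8 is a placeholder)
cl <- makeMPIcluster(8)

# The same kind of work as the foreach loop above, as a parallel apply
res <- clusterApply(cl, 1:100, function(i) i^2)

stopCluster(cl)
mpi.quit()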

Some people prefer to have mpirun generate only a single process and then spawn slaves from within R. In that case, you call mpirun with the -np 1 option and call mpi.spawn.Rslaves in the R script, roughly as sketched below.
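A rough sketch of that spawning variant; the job script stays the same apart from the launch line, and the toy computation is a placeholder:

Job script launch line:

mpirun -np 1 Rscript --no-save --no-restore $HOME/Example.R

R script:

library(Rmpi)

# One worker per remaining slot in the MPI universe
mpi.spawn.Rslaves(nslaves = mpi.universe.size() - 1)

# Load-balanced parallel apply over the spawned workers
res <- mpi.applyLB(1:100, function(i) i^2)

mpi.close.Rslaves()
mpi.quit()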

Back when I started working with R on HPC clusters, srun did not work as intended with Rmpi, which is why I eventually settled on mpirun. So, what do I have to consider when replacing mpirun with srun? How does the code have to be changed? Feel free to comment not only on the Rmpi use case but also on pbdMPI.
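For concreteness, the naive replacement I have in mind is the job script above with only the launch line swapped out. Whether the --mpi flag is needed, and which PMI plugin to request, depends on how Open MPI and Slurm were built on the cluster, so treat this as a guess:

srun Rscript --no-save --no-restore $HOME/Example.R
# or, if the PMIx plugin has to be requested explicitly:
srun --mpi=pmix Rscript --no-save --no-restore $HOME/Example.R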

Chr
  • What happens when you replace `mpirun` with `srun` in the script? Do you see an error or problem? – AndyT Feb 01 '23 at 20:27
  • If I remember correctly, back then R did not correctly attach to the parallel structure set up by `srun`. I have not tested it on the current infrastructure, but I was hoping to learn about it from answers to this question. That would be more convenient than flooding the cluster with tests and annoying administrators and other users. – Chr Feb 02 '23 at 15:13
  • I do not think you need to flood the cluster to test. Just run a single Rmpi job and find out what happens. It is difficult to help without concrete information on the errors you see. – AndyT Feb 02 '23 at 19:57
  • When replacing `mpirun` with `srun`, the job fails with the error message `slurmstepd: error: execve(): Rscript: No such file or directory`. It appears that, unlike `mpirun`, `srun` does not export the modules loaded in the job script. This post suggests omitting `srun`: https://stackoverflow.com/a/67316453. That is obviously not the desired solution, as it simply avoids the MPI-based parallelization that is the whole reason `srun` or `mpirun` is called in the first place. – Chr Feb 06 '23 at 20:09
  • I think this is a problem specific to your HPC system. When I use `Rscript` with `srun` on the HPC system I have access to, I get the expected result from a simple test script. I simply load the R module and then run `srun Rscript hello.R`, and it all works fine. I think you need to speak to the local support team about this, as it is not a general problem with Slurm + R; rather, it is a local site configuration issue. – AndyT Feb 08 '23 at 09:57
  • Are you using it with Open MPI? – Chr Feb 08 '23 at 14:31
  • `srun`, aka Slurm direct run, can be preferred by sysadmins since all MPI tasks are under the control of Slurm (instead of having MPI daemons sitting in between). A major drawback is that `MPI_Comm_spawn()` does not work (to the best of my knowledge), and that looks like a deal breaker here. I suggest you present these facts to your sysadmin; they will likely advise you to stick to `mpirun`. – Gilles Gouaillardet Feb 09 '23 at 10:01
  • @GillesGouaillardet MPI_Comm_spawn is actually a limitation of Open MPI, not Slurm. See https://slurm.schedmd.com/mpi_guide.html. You could use something like MPICH or, depending on the cluster, a vendor-specific implementation. But as Gilles said, report these findings to your sysadmin and they'll likely want you to stick with mpirun. – tomgalpin Feb 13 '23 at 17:00

0 Answers