
I got access to an HPC cluster with an MPI partition.

My problem is that, no matter what I try, my code (which works fine on my PC) doesn't run on the HPC cluster. The code looks like this:

library(tm)
library(qdap)
library(snow)
library(doSNOW)
library(foreach)

cl <- makeCluster(30, type="MPI")
registerDoSNOW(cl)
np <- getDoParWorkers()
np
Base <- "./Files1a/"
files <- list.files(path=Base, pattern="\\.txt")

for (i in 1:length(files)) {

  ...some definitions and variable generation...

  text <- foreach(k = 1:10, .combine='c') %do% {
    if (file.exists(paste("./Files", k, "a/", files[i], sep="")))
      paste(tolower(readLines(paste("./Files", k, "a/", files[i], sep=""))), collapse=" ")
    else ""
  }

  docs <- Corpus(VectorSource(text))

  for (k in 1:10) {
    ID[k] <- paste(files[i], k, sep="_")
  }
  data <- as.data.frame(docs)
  data[["docs"]] <- ID
  rm(docs)
  data <- sentSplit(data, "text")

  frequency <- NULL
  cs <- ceiling(length(POLKEY$x) / getDoParWorkers())
  opt <- list(chunkSize=cs)
  frequency <- foreach(j = 2:length(POLKEY$x), .options.mpi=opt, .combine='cbind') %dopar% ...
  write.csv(frequency, file=paste("./Result/output", i, ".csv", sep=""))
  rm(data, frequency)
}

When I run the batch job, the session is killed when it hits the time limit, and I receive the following messages after the MPI cluster initialization:

Loading required namespace: Rmpi
--------------------------------------------------------------------------
PMI2 initialized but returned bad values for size and rank.
This is symptomatic of either a failure to use the
"--mpi=pmi2" flag in SLURM, or a borked PMI2 installation.
If running under SLURM, try adding "-mpi=pmi2" to your
srun command line. If that doesn't work, or if you are
not running under SLURM, try removing or renaming the
pmi2.h header file so PMI2 support will not automatically
be built, reconfigure and build OMPI, and then try again
with only PMI1 support enabled.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
An MPI process has executed an operation involving a call to the
"fork()" system call to create a child process.  Open MPI is currently
operating in a condition that could result in memory corruption or
other system errors; your MPI job may hang, crash, or produce silent
data corruption.  The use of fork() (or system() or other calls that
create child processes) is strongly discouraged.  

The process that invoked fork was:

  Local host:         ...
  MPI_COMM_WORLD rank: 0

If you are *absolutely sure* that your application will successfully
and correctly survive a call to fork(), you may disable this warning
by setting the mpi_warn_on_fork MCA parameter to 0.
--------------------------------------------------------------------------
    30 slaves are spawned successfully. 0 failed.

Unfortunately, it seems that the loop doesn't complete even a single iteration, as no output is written.

For the sake of completeness, here is my batch file:

#!/bin/bash -l
#SBATCH --job-name MyR
#SBATCH --output MyR-%j.out
#SBATCH --nodes=5
#SBATCH --ntasks-per-node=6
#SBATCH --mem=24gb
#SBATCH --time=00:30:00

MyRProgram="$HOME/R/hpc_test2.R"

cd $HOME/R

export R_LIBS_USER=$HOME/R/Libs2

# start R with my R program
module load R

time R --vanilla -f $MyRProgram

Does anybody have a suggestion for how to solve this problem? What am I doing wrong?

Thanks in advance for your help!

C. G.

1 Answer


Your script is an MPI application, so you need to execute it appropriately via Slurm. The Open MPI FAQ has a special section on how to do that:

https://www.open-mpi.org/faq/?category=slurm

The most important point is that your script shouldn't execute R directly, but should execute it via the mpirun command, using something like:

mpirun -np 1 R --vanilla -f $MyRProgram

My guess is that the "PMI2" error is caused by not executing R via mpirun. I don't think the "fork" message indicates a real problem; it happens to me at times as well. I believe R calls "fork" when initializing, but that has never caused a problem for me. I'm not sure why I only get this message occasionally.

Note that it is very important to tell mpirun to launch only one process, since the remaining processes will be spawned by the R script itself; that is what the -np 1 option does. If Open MPI was properly built with Slurm support, it should know where to launch those processes when they are spawned, but if you don't use -np 1, then each of the 30 processes launched via mpirun will spawn 30 more, causing a huge mess.
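Putting that together with the batch file from the question, the only line that needs to change is the one that starts R (a sketch; the module name, paths, and resource requests are copied from the question and may need adjusting for your cluster):

#!/bin/bash -l
#SBATCH --job-name MyR
#SBATCH --output MyR-%j.out
#SBATCH --nodes=5
#SBATCH --ntasks-per-node=6
#SBATCH --mem=24gb
#SBATCH --time=00:30:00

MyRProgram="$HOME/R/hpc_test2.R"

cd $HOME/R
export R_LIBS_USER=$HOME/R/Libs2

module load R

# launch a single MPI process; the R script spawns the worker processes itself
time mpirun -np 1 R --vanilla -f $MyRProgram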

Finally, I think you should tell makeCluster to spawn only 29 processes to avoid running a total of 31 MPI processes. Depending on your network configuration, even that much oversubscription can cause problems.

I would create the cluster object as follows:

library(snow)
library(Rmpi)
cl <- makeCluster(mpi.universe.size() - 1, type="MPI")

That's safer and makes it easier to keep your R script and job script in sync with each other.
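For completeness, here is a minimal sketch of how the surrounding R code could look with that change. The toy %dopar% loop and the explicit stopCluster()/mpi.quit() shutdown at the end are my additions for illustration, not part of the original script:

library(snow)
library(doSNOW)
library(foreach)
library(Rmpi)

# one worker per available MPI slot, leaving a slot for the master process
cl <- makeCluster(mpi.universe.size() - 1, type="MPI")
registerDoSNOW(cl)

# toy parallel loop just to check that the workers respond
pids <- foreach(k = 1:getDoParWorkers(), .combine='c') %dopar% Sys.getpid()
print(pids)

# shut the workers down and exit MPI cleanly at the end of the script
stopCluster(cl)
mpi.quit()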

Steve Weston
  • Thanks for the help! I tried the code again with srun -n 1 and the fork() warning is gone. However, a new problem came up: the system decreased the number of nodes to 1; the error message at the beginning of the output says: "srun.bin: Warning: can't run 1 processes on 5 nodes, setting nnodes to 1". Consequently, an allocation of all resources is not possible. – C. G. Dec 08 '15 at 09:33
  • @C.G. I didn't suggest "srun -n 1": I suggested "mpirun -n 1". Request as many nodes and cores as you want using "srun", but launch your R script using 'mpirun -n 1'. The remaining processes will be spawned when you execute `makeCluster` in the R script. – Steve Weston Dec 08 '15 at 13:26
  • @C.G. I edited my answer to use the mpirun -np option for consistency with the Slurm documentation and avoid confusion with srun options. Hopefully it is more clear now. – Steve Weston Dec 08 '15 at 13:48
  • Sorry, I didn't get that. I changed to mpirun -np 1 ... now and the first error message disappeared. Unfortunately, the error message "An MPI process has executed an operation involving a call to the "fork()" system call" came up again. Moreover, I found out that the code works if I use foreach %do% instead of %dopar%. – C. G. Dec 08 '15 at 15:05
  • Thanks so much. Problem solved. I had to change makeCluster to makeMPIcluster, but your reply provided the solution. – C. G. Dec 08 '15 at 19:39
  • `fork()`-ing processes is unsafe when using network interconnects that use registered memory, e.g. InfiniBand. As long as `fork()` is immediately followed by a variant of `execve()` in the child, its use is safe. – Hristo Iliev Dec 09 '15 at 12:58