I try to use mpi to run a C application on databricks clusters.
I have downloaded Open MPI from https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-4.0.3.tar.gz
and installed on databricks cluster.
It was built on databricks cluster with Ubuntu.
Operating system/version: Linux 4.4.0 Ubuntu
Computer hardware: x86_64
Network type: databricks
I am trying to run from python notebook on databricks:
%sh
mpirun --allow-run-as-root -np 20 MY_c_Application
The MY_c_Application was written by C and compiled on databricks Linux.
My databricks cluster has 21 nodes with one as driver. Each node has 32 cores.
When I run the above command, I got the error as follows.
Could you please let me know how this could be caused ? Or, do I miss something ?
thanks
There are not enough slots available in the system to satisfy the 20
slots that were requested by the application:
MY_c_application
Either request fewer slots for your application, or make more slots available for use.
A "slot" is the Open MPI term for an allocatable unit where we can launch a process.
The number of slots available are defined by the environment in which Open MPI processes are run:
Hostfile, via "slots=N" clauses (N defaults to number of processor cores if not provided)
The --host command line parameter, via a ":N" suffix on the hostname
(N defaults to 1 if not provided)
Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
If none of a hostfile, the --host command line parameter, or an RM
is present, Open MPI defaults to the number of processor cores In
all the above cases, if you want Open MPI to default to the number
of hardware threads instead of the number of processor cores, use
the --use-hwthread-cpus option.
Alternatively, you can use the --oversubscribe option to ignore the
number of available slots when deciding the number of processes to launch.
UPDATE
After adding a hostfile , this problem is gone.
sudo mpirun --allow-run-as-root -np 25 --hostfile my_hostfile ./MY_C_APP
thanks