A cluster is a group of nodes. Each node is an independent computer (a bunch of CPUs and possibly some GPUs or other accelerators), and the nodes are connected by a network (it is worth noting that on some supercomputers the memory addresses are global). There are two types of supercomputer: shared memory and distributed memory.
It is worth reading a bit about supercomputer architecture. Wikipedia is a good starting point!
A process is an independent unit of work. Processes do not share memory, so they need a way to access each other's data; for that you use a library such as MPI.
In Slurm, a process is called a task.
To set the number of tasks (processes, in fact) you use
--ntasks or simply -n
Then you can set the number of tasks per node or the number of nodes. These are 2 different things!
--ntasks-per-node gives you the number of tasks per node.
--nodes gives you the number of nodes you want. Written as a range (--nodes=min-max) it sets a minimum and a maximum; a single value is used as both.
The node count has to be consistent with the task count: if your nodes have 18 cores and you ask for 40 tasks, then you need at least 3 nodes, so a request with --nodes=2 cannot hold them. That is why one should avoid using --nodes (unless you know what you are doing!) and let Slurm work out the node count from the tasks.
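The arithmetic behind the 40-tasks-on-18-core-nodes example can be sketched as a small shell function. This is only an illustration: `min_nodes` is a hypothetical helper name, and it assumes one core per task, no hyperthreading, and nodes that are not shared with other jobs.

```shell
#!/bin/bash
# min_nodes TASKS CORES_PER_NODE   (hypothetical helper, not a Slurm command)
# Prints the smallest number of nodes that can hold TASKS single-core tasks.
min_nodes() {
    local tasks=$1 cores_per_node=$2
    # Ceiling division: round up so that every task gets a core.
    echo $(( (tasks + cores_per_node - 1) / cores_per_node ))
}

min_nodes 40 18   # prints 3, matching the example in the text
```

Slurm does this calculation for you when you give only --ntasks, which is one more reason to leave --nodes out unless you need it.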
Then a given number of CPUs (cores of your processor) can be allocated to a single task. This is set using --cpus-per-task.
One MPI rank is one task. A task can then launch multiple threads. If you set --cpus-per-task to one, all those threads will run on the same core and therefore compete for it. Usually you want one thread per core (or 2 if you use hyperthreading).
The value of --cpus-per-task HAS TO be no larger than the number of cores per node, as a task can run only on a single node (on a distributed memory system)!
To summarize:
Say you want to run M MPI processes which will each launch N threads. First, N must not exceed the number of cores per node, and it is better if N is an integer divisor of the number of cores per node (otherwise you will waste some cores).
You will set:
--ntasks="M"
--cpus-per-task="N"
Then you will run using:
srun ./your_hybrid_app
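The sizing rule from the summary (N must fit on a node and should ideally divide the core count) can be checked before submitting with a sketch like the one below. `check_layout` is a hypothetical helper, and the 18 cores per node used in the call is an assumed value; look up the real figure for your cluster.

```shell
#!/bin/bash
# check_layout M N CORES_PER_NODE   (hypothetical helper, not a Slurm command)
# M = MPI tasks, N = threads (cores) per task.
check_layout() {
    local m=$1 n=$2 cores=$3
    if (( n > cores )); then
        echo "N=$n does not fit: a task cannot span nodes ($cores cores per node)" >&2
        return 1
    fi
    local per_node=$(( cores / n ))   # whole tasks that fit on one node
    local idle=$(( cores % n ))       # cores left unused on a full node
    echo "tasks_per_node=$per_node idle_cores_per_node=$idle"
    # The two directives that follow from M and N:
    echo "#SBATCH --ntasks=$m"
    echo "#SBATCH --cpus-per-task=$n"
}

check_layout 16 4 18   # first line: tasks_per_node=4 idle_cores_per_node=2
```

With N=6 on the same assumed nodes the idle count drops to 0, which is the "integer divisor" case.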
Then do not forget 2 things:
If you use OpenMP, set the number of threads:
export OMP_NUM_THREADS="N"
and don't forget to initialize MPI properly for multithreading (MPI_Init_thread rather than plain MPI_Init).
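To avoid keeping the thread count in sync by hand, one option is to derive it from the environment Slurm provides inside the job. The sketch below wraps this in a function purely for illustration (`set_omp_threads` is a hypothetical name). SLURM_CPUS_PER_TASK is only set when --cpus-per-task was requested, so the fallback to 1 thread is an assumption about what you want in that case.

```shell
#!/bin/bash
# set_omp_threads   (hypothetical helper)
# Use the per-task core count from Slurm as the OpenMP thread count,
# falling back to 1 when --cpus-per-task was not given.
set_omp_threads() {
    export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
}

set_omp_threads
echo "OpenMP threads per task: $OMP_NUM_THREADS"
```

On recent Slurm versions srun may not inherit --cpus-per-task from the batch allocation, so passing it explicitly (srun --cpus-per-task="$SLURM_CPUS_PER_TASK" ./your_hybrid_app) is a reasonable precaution; check your site's documentation.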
#!/bin/bash -l
#
#SBATCH --account=myAccount
#SBATCH --job-name="a job"
#SBATCH --time=24:00:00
#SBATCH --ntasks=16
#SBATCH --cpus-per-task=4
#SBATCH --output=%j.o
#SBATCH --error=%j.e
export OMP_NUM_THREADS=4
srun ./your_hybrid_app
This will launch 16 tasks, with 4 cores per task (and 4 OMP threads per task, so one per core).