
I have a heterogeneous cluster whose nodes contain either 14-core or 16-core CPUs (28 or 32 hardware threads with hyperthreading). I manage job submissions using Slurm. My requirements:

  • It doesn't matter which CPU is used for a calculation.
  • I don't want to specify which CPU a job should go to.
  • A job should consume all available cores on the CPU (14 or 16).
  • I want mpirun to handle threading.

To illustrate the peculiarities of the problem, I show a job script that works on the 16-core CPUs:

#!/bin/bash

#SBATCH -J test
#SBATCH -o job.%j.out
#SBATCH -N 1
#SBATCH -n 32

mpirun -np 16 vasp

An example job script that works on the 14-core CPUs is:

#!/bin/bash

#SBATCH -J test
#SBATCH -o job.%j.out
#SBATCH -N 1
#SBATCH -n 28

mpirun -np 14 vasp

The second job script also runs on the 16-core CPUs, but unfortunately the job is then about 35% slower than when I request 32 threads as in the first script. That's an unacceptable performance loss for my application.

I haven't figured out if there is a good way around this challenge. To me, a solution would be to request a variable number of resources, such as

#SBATCH -n [28-32]

and to tailor the `mpirun -np x vasp` line accordingly. I haven't found a way to do this, however. Are there any suggestions on how to achieve this directly in Slurm, or is there a good workaround?

I tried to use the environment variable `$SLURM_CPUS_ON_NODE`, but it is only set after a node has been selected, so it cannot be used in a `#SBATCH` line.

I also looked at the `--constraint` flag, but it does not seem to give sufficiently granular control over threading requests.

Scott
  • Can you just remove the `#SBATCH -n 14/16` option from your job script, so you are just asking for 1 node of any type. You can then do something like `mpirun -n $SLURM_NTASKS` to pick up the correct number of cores. – AndyT Jan 30 '23 at 20:43
  • are you sure no other job is running on the same node? – Gilles Gouaillardet Jan 30 '23 at 23:42
  • 1
    In addition to what the other commenters said, use `#SBATCH --exclusive` to ensure you're the only user on a node. – ciaron Jan 31 '23 at 11:43

1 Answer


Actually, it should work as you want simply by specifying that you want a full node:

#!/bin/bash

#SBATCH -J test
#SBATCH -o job.%j.out
#SBATCH -N 1
#SBATCH --exclusive

mpirun vasp

`mpirun` will start the number of processes defined in `SLURM_TASKS_PER_NODE`, which Slurm sets to the number of tasks that can be created on the node. If you do not request more than one CPU per task, that is simply the number of CPUs on the node.
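As a quick illustration (the Slurm variable is simulated here with a fallback, since it is only set inside a real allocation), the job script can inspect what Slurm granted before launching:

```shell
# Under Slurm with --exclusive, SLURM_TASKS_PER_NODE is set automatically.
# The fallback below only simulates a 16-core (32-thread) node for illustration.
SLURM_TASKS_PER_NODE=${SLURM_TASKS_PER_NODE:-32}
echo "mpirun will launch $SLURM_TASKS_PER_NODE ranks on this node"
```

On a real allocation, `mpirun` reads this value itself, so the `echo` is purely for checking what was allocated.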

damienfrancois
  • Aha! This did it! In the absence of the `--exclusive` flag, the node only gave me 1 core, but with `--exclusive` I got all 28 or 32! Thanks! – Scott Feb 02 '23 at 04:24
  • For my specific application (vasp), I needed to specify the number of MPI processes (14/16) dynamically. So my complete `submit.sh` script looked like this:

        #!/bin/bash
        #SBATCH -J test
        #SBATCH -o job.%j.out
        #SBATCH -N 1
        #SBATCH --exclusive

        let cores=$SLURM_CPUS_ON_NODE/2
        mpirun -np $cores vasp

    – Scott Feb 02 '23 at 04:59
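The halving in that script exists because each physical core shows up as two hardware threads, so `SLURM_CPUS_ON_NODE` reports 28 or 32 while VASP wants one MPI rank per physical core. A minimal sketch of the arithmetic, with the Slurm variable simulated by a fallback since it only exists inside an allocation:

```shell
# Simulated: on a real node, Slurm sets SLURM_CPUS_ON_NODE to 28 or 32.
SLURM_CPUS_ON_NODE=${SLURM_CPUS_ON_NODE:-32}

# One MPI rank per physical core, i.e. half the hardware-thread count.
cores=$((SLURM_CPUS_ON_NODE / 2))
echo "$cores"
```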