
How can I set the maximum number of CPUs each job can ask for in Slurm?

We're running a GPU cluster and want a sensible number of CPUs to always be available for GPU jobs. This works reasonably well as long as the job asks for GPUs, because there's a GPU <-> CPU mapping in gres.conf. But that doesn't stop a job that doesn't request any GPUs from acquiring all the CPUs in the system.
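
For context, the gres.conf mapping looks something like the following (a rough sketch; the device files, GPU type, and core ranges are illustrative, not our actual layout):

# Each GPU is bound to a range of CPU cores via Cores=
Name=gpu Type=a100 File=/dev/nvidia0 Cores=0-7
Name=gpu Type=a100 File=/dev/nvidia1 Cores=8-15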

Milad

1 Answer


To set the maximum number of CPUs a single job can use at the cluster level, run the following command:

sacctmgr modify cluster <cluster_name> set maxtresperjob=cpu=<nb of CPUs>
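
For example, to cap every job at 16 CPUs on a hypothetical cluster named mycluster (the name and the limit are placeholders, substitute your own values):

sacctmgr modify cluster mycluster set maxtresperjob=cpu=16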

Note that you must have SelectType=select/cons_tres in slurm.conf for this to work.

Alternatively, the same restriction can be applied per partition, per QOS, per account, and so on.
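
As a rough sketch, the QOS-level and account-level variants use the same MaxTRESPerJob limit (the QOS and account names below are placeholders):

sacctmgr modify qos normal set maxtresperjob=cpu=16
sacctmgr modify account myaccount set maxtresperjob=cpu=16

A partition-level cap is typically achieved by attaching such a QOS to the partition (QOS=<qos_name> on the PartitionName line in slurm.conf).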

damienfrancois