
Good day. Is there a way to configure Slurm so that, when the number of submitted jobs exceeds the maximum a user may run concurrently, the remaining jobs are queued and run automatically as slots free up?

For example: I have a maximum of 50 jobs that I can run at the same time, and I want to submit 200 jobs, so they would run 50 at a time, in 4 batches.

Please note that each job will be submitted separately using srun (I am using a simulation program that only supports this mode of interaction with Slurm).

Thank you in advance.

I want something similar to task spooler's ts -S N option, but for the cluster.

1 Answer


Assuming you have access to configure the Slurm cluster (i.e. you are an administrator, not simply a user), you can accomplish what you want by configuring resource limits for Slurm (reference link).

Check out the "MaxJobs" limit.

MaxJobs

The total number of jobs able to run at any given time for the given association. If this limit is reached, new jobs will be queued but only allowed to run after existing jobs in the association complete.

You'll need to follow the configuration requirements described in the link. For example, from the Slurm accounting reference:

To enable any limit enforcement you must at least have AccountingStorageEnforce=limits in your slurm.conf. Otherwise, even if you have limits set, they will not be enforced.
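As a rough sketch of the admin-side setup, assuming accounting is already configured, the limit could be applied to a user's association with sacctmgr (the user and account names below are placeholders):

```shell
# Prerequisite in slurm.conf on the controller, then restart slurmctld:
#   AccountingStorageEnforce=limits

# Cap the user's association at 50 concurrently running jobs.
# "myuser" and "myaccount" are hypothetical names.
sacctmgr modify user where name=myuser account=myaccount set MaxJobs=50

# Verify the limit was recorded:
sacctmgr show assoc where user=myuser format=User,Account,MaxJobs
```

With this in place, the user can submit all 200 jobs at once; Slurm runs 50 at a time and starts queued jobs as running ones complete.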

Assuming that you do not have access to configure the Slurm cluster (i.e. you are simply a user), you could look into job arrays, though that might not be compatible with your simulation program. From the sbatch man page:

-a, --array=<indexes>

... A maximum number of simultaneously running tasks from the job array may be specified using a "%" separator. For example "--array=0-15%4" will limit the number of simultaneously running tasks from this job array to 4.
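For the user-side route, a batch script for your 200-job case might look roughly like this (the script body and the simulation command are placeholders; each array task still invokes srun, which may or may not suit your program):

```shell
#!/bin/bash
#SBATCH --job-name=sim-batch
#SBATCH --array=0-199%50   # 200 array tasks, at most 50 running at once

# Slurm sets SLURM_ARRAY_TASK_ID to this task's index (0..199).
# "my_simulation" and its input naming scheme are hypothetical.
srun my_simulation --input "case_${SLURM_ARRAY_TASK_ID}.dat"
```

Submitting this once with sbatch replaces the 200 separate submissions, and the %50 suffix enforces the 50-at-a-time behaviour without any administrator involvement.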