Questions tagged [slurm]

Slurm (formerly spelled SLURM) is an open-source resource manager designed for Linux HPC clusters of all sizes.

Slurm: A Highly Scalable Resource Manager

Slurm is an open-source resource manager designed for Linux clusters of all sizes. It provides three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (typically a parallel job) on a set of allocated nodes. Finally, it arbitrates contention for resources by managing a queue of pending work.
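
For example, a minimal batch script exercising the first two functions might look like the following sketch (the partition name is a hypothetical placeholder):

    #!/bin/bash
    #SBATCH --job-name=demo        # name shown in the queue
    #SBATCH --nodes=1              # allocate one compute node
    #SBATCH --ntasks=4             # run four tasks on that node
    #SBATCH --time=00:10:00        # wall-clock limit for the allocation
    #SBATCH --partition=debug      # hypothetical partition name

    srun hostname                  # launch the work on the allocated resources

Submitted with sbatch, the script waits in the queue (the third function) until the requested resources become available.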

Slurm's design is very modular with dozens of optional plugins. In its simplest configuration, it can be installed and configured in a couple of minutes (see Caos NSA and Perceus: All-in-one Cluster Software Stack by Jeffrey B. Layton) and was used by Intel on their 48-core "cluster on a chip". More complex configurations can satisfy the job scheduling needs of world-class computer centers and rely upon a MySQL database for archiving accounting records, managing resource limits by user or bank account, or supporting sophisticated job prioritization algorithms.
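
As an illustration of the accounting side, resource limits per user or bank account are typically managed with the sacctmgr command once slurmdbd and its MySQL database are running; a hedged sketch with hypothetical account and user names:

    # create a bank account and attach a user to it (names are hypothetical)
    sacctmgr add account physics Description="physics group"
    sacctmgr add user alice account=physics
    # cap how many jobs the account may run at once
    sacctmgr modify account name=physics set MaxJobs=100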

While other resource managers do exist, Slurm is unique in several respects:

  • It is designed to operate in a heterogeneous cluster containing over 100,000 nodes and millions of processors.
  • It can sustain a throughput rate of hundreds of thousands of jobs per hour, with bursts of job submissions at several times that rate.
  • Its source code is freely available under the GNU General Public License.
  • It is portable: written in C, it uses the GNU autoconf configuration engine. While initially written for Linux, other UNIX-like operating systems should be easy porting targets.
  • It is highly tolerant of system failures, including failure of the node executing its control functions.
  • A plugin mechanism exists to support various interconnects, authentication mechanisms, schedulers, etc. These plugins are documented and simple enough for the motivated end user to understand the source and add functionality.
  • Configurable node power control functions allow putting idle nodes into a power-save/power-down mode. This is especially useful for "elastic burst" clusters which expand dynamically to a cloud virtual machine (VM) provider to accommodate workload bursts.
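
A minimal slurm.conf sketch of the power-save mechanism described in the last bullet (script paths and timings are hypothetical; the suspend/resume scripts themselves must be provided by the site):

    # programs Slurm calls to power nodes down and back up (hypothetical paths)
    SuspendProgram=/usr/local/sbin/node_suspend.sh
    ResumeProgram=/usr/local/sbin/node_resume.sh
    # seconds a node must sit idle before it is suspended
    SuspendTime=600
    # seconds allowed for a resumed node to return to service
    ResumeTimeout=300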

Name Spelling

As of v18.08, the name spelling “SLURM” has been changed to “Slurm” (commit 3d7ada78e).

Other Uses of the Name

Slurm is also a fictional soft drink in the Futurama multiverse, where it is popular and highly addictive.

1738 questions
0
votes
1 answer

Unexpected EOF looking for matching `"'... in line 1. What gives?

I am running the following slurm script on a cluster computing system. #!/bin/bash # #SBATCH --job-name=R1.5-CG-nvtrun # create a short name for your job #SBATCH --qos=short # _quality of service_ #SBATCH --nodes=1 …
megamence
  • 335
  • 2
  • 10
0
votes
1 answer

Get available memory inside SLURM step

I'm trying to generate a script that automatically adapts its requirements to whatever environment it is running in. I already got the number of available CPUs by accessing the SLURM_CPUS_PER_TASK environment variable. If it does not…
Poshi
  • 5,332
  • 3
  • 15
  • 32
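
For reference, a small sketch of what such a script can check inside a step: Slurm exports the granted resources through environment variables, though which memory variable is set depends on whether --mem or --mem-per-cpu was requested (the fallbacks below are only illustrative):

    #!/bin/bash
    cpus=${SLURM_CPUS_PER_TASK:-1}              # CPUs granted per task, default 1 if unset
    if [[ -n "$SLURM_MEM_PER_NODE" ]]; then     # set when --mem was used (MB)
        mem_mb=$SLURM_MEM_PER_NODE
    elif [[ -n "$SLURM_MEM_PER_CPU" ]]; then    # set when --mem-per-cpu was used (MB)
        mem_mb=$(( SLURM_MEM_PER_CPU * cpus ))
    else
        mem_mb="unlimited"                      # no explicit memory request
    fi
    echo "cpus=$cpus mem_mb=$mem_mb"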
0
votes
1 answer

separating values of CUDA_VISIBLE_DEVICES variable

I am running a job in a cluster that uses SLURM as a scheduler. I specify the type of GPU card with the option --gres=gpu:k80. However, because the cluster has nodes with different numbers of cards, it sometimes happens that one gets 2 or 4. I can…
armando
  • 1,360
  • 2
  • 13
  • 30
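
A common way to split the comma-separated variable in a batch script, shown as a sketch independent of the question's cluster:

    #!/bin/bash
    # CUDA_VISIBLE_DEVICES looks like "0,1,2,3"; split it into a bash array
    IFS=',' read -ra gpu_ids <<< "$CUDA_VISIBLE_DEVICES"
    echo "got ${#gpu_ids[@]} visible GPUs"
    for id in "${gpu_ids[@]}"; do
        echo "GPU index: $id"
    done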
0
votes
0 answers

How to add more processes and nodes to a running MPI job through SLURM?

In my university cluster, resources are scarce. Sometimes I run the program on fewer nodes. I want to know if it is possible to add more nodes as they become available. My program is a simple master-slave program and I run it…
Harsh M
  • 625
  • 2
  • 11
  • 25
0
votes
1 answer

Can I see if finished SLURM jobs were on a dependency?

Is there a way to find out if jobs that I already ran with the SLURM workload manager were on a dependency? The sacct command has an option "pending" which should show if a job was on hold, but in my case it just prints out all jobs: sacct -M cm2…
zweiHuehner
  • 83
  • 1
  • 6
0
votes
0 answers

How do I catch and get useful information from mpi4py programs

I'm just learning MPI by using mpi4py and got the following error after an overnight run. The primary job didn't get to the final problem size I had specified so something terminated early. If it had been an unhandled exception, I would have…
Elros
  • 313
  • 1
  • 3
  • 10
0
votes
2 answers

python3 multiprocessing map_async not running all tasks

My code looks like this: import multiprocessing as mp def myfunc(args): val1, val2 = args print("now running", val1, val2, "node=", mp.current_process().name) do_stuff() return (args, output) args = [(1,2), (2,3), (4,5), ...] #…
0
votes
1 answer

job submission issues with the Slurm Workload Manager

I am using a computer cluster with 20 nodes and each node has 16 CPUs. I tried to submit 1000 jobs to all nodes with the command "sbatch XX.sbatch". What I want is for 320 jobs to run simultaneously, i.e., 16 jobs per node, or 1 job per…
Lenoir
  • 113
  • 5
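
The usual pattern for this kind of workload is a job array of single-CPU tasks, so the scheduler is free to pack 16 of them onto each node; a hedged sketch with a hypothetical program name (actual concurrency still depends on partition and account limits):

    #!/bin/bash
    #SBATCH --job-name=sweep
    #SBATCH --array=1-1000          # 1000 independent array tasks
    #SBATCH --ntasks=1              # each task is a single process
    #SBATCH --cpus-per-task=1       # ...using a single CPU

    srun ./my_program "$SLURM_ARRAY_TASK_ID"   # hypothetical program taking the task index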
0
votes
0 answers

JobHeldAdmin Status on Slurm when submitting Snakemake pipeline

When submitting to Slurm using the following: snakemake --cluster "sbatch --mem-per-cpu=16G --ntasks=4 --mail-type=ALL --partition=normal_q --time=48:00:00 --nice=100000" --jobs=100 --use-conda, some of my jobs get "JobHeldAdmin" status. Does anyone…
0
votes
1 answer

How do I pass my Python script into Slurm sbatch?

I have written a Python script that I am meant to pass to my university's Slurm sbatch system for computing. I have written a short shell script that is supposed to just enter the Python script into the sbatch system, but I get an error that…
Marcus K.
  • 301
  • 1
  • 3
  • 9
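
For context, the usual shape of such a wrapper is a batch script that itself invokes the interpreter; a hedged sketch (module and file names are hypothetical and site-specific):

    #!/bin/bash
    #SBATCH --job-name=pyjob
    #SBATCH --ntasks=1
    #SBATCH --time=01:00:00

    module load python              # hypothetical module name; depends on the site
    srun python3 my_script.py       # hypothetical script name

The wrapper, not the Python file itself, is what gets passed to sbatch.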
0
votes
0 answers

What am I missing getting mpirun to schedule across multiple nodes?

TL;DR: I'm having trouble getting MPI to schedule jobs across more than a single node. There seems to be a communication error between nodes at the MPI level that isn't a problem for TCP or at the Slurm level. Ultimately, I seem to be missing…
Elros
  • 313
  • 1
  • 3
  • 10
0
votes
1 answer

How to append memory usage for each step within a shell script in slurm output

I have a bash script: #!/bin/bash time srun -p my_partition -c 1 --mem=4G my_code -i my_file_1 -o my_output_file_1 time srun -p my_partition -c 1 --mem=4G my_code -i my_file_2 -o my_output_file_2 time srun -p my_partition -c 1 --mem=4G my_code -i…
Bot75
  • 179
  • 8
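
One hedged way to log per-step memory from inside such a script is to query accounting after each step finishes, assuming job accounting is enabled on the cluster (the data can lag briefly behind the step's completion):

    #!/bin/bash
    step=0
    for f in my_file_1 my_file_2; do            # hypothetical input files
        time srun -p my_partition -c 1 --mem=4G my_code -i "$f" -o "out_$f"
        # peak resident memory of the step that just ran
        sacct -j "${SLURM_JOB_ID}.${step}" --format=JobID,MaxRSS,MaxVMSize
        step=$((step + 1))
    done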
0
votes
0 answers

Slurm MPI Error: An ORTE Daemon has failed

I have been having some issues with Slurm and openMPI on a cluster. Whenever I run any job which uses mpirun, I get the following error: -------------------------------------------------------------------------- An ORTE daemon has unexpectedly…
Alec Bills
  • 1
  • 1
  • 2
0
votes
1 answer

Dask +SLURM over ftp mount (CurlFtpFS)

So I have a working Dask/SLURM cluster of 4 Raspberry Pis with a common NFS share, on which I can run Python jobs successfully. However, I want to add some more ARM devices to my cluster that do not support NFS mounts (kernel module missing), so I wish to…
vzografos
  • 105
  • 5
0
votes
1 answer

Slurm: Incomplete job state save file, start with '-i' to ignore this

I changed SelectType=select/linear to SelectType=select/cons_res in my conf file in order to be able to run multiple jobs at the same time on my partition. But when I try to apply my changes, I get this error message: fatal: Incomplete job state…