I'm using `snakemake` to build a variant calling pipeline that can be run on a SLURM cluster. The cluster has login nodes and compute nodes. Any real computing should be done on the compute nodes in the form of an `srun` or `sbatch` job, and jobs are limited to 48 hours of runtime. My problem is that processing many samples, especially when the queue is busy, will take more than 48 hours to run all the rules for every sample. The traditional cluster execution mode of `snakemake` leaves a master process running that only submits a rule's job to the queue after all of that rule's dependencies have finished. I'm supposed to run this master process on a compute node, so this limits the runtime of my entire pipeline to 48 hours.
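For concreteness, this is roughly how I'm running it now (the resource values are just examples):

```bash
# master process: stays alive on a compute node for the whole workflow,
# submitting each rule's job only after its dependencies have finished
snakemake --jobs 100 \
    --cluster "sbatch --time=48:00:00 --cpus-per-task={threads}"
```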
I know SLURM jobs have dependency directives that tell a job to wait to run until other jobs have finished. Because the `snakemake` workflow is a DAG, is it possible to submit all the jobs at once, with each job's dependencies defined by its rule's dependencies in the DAG? After all the jobs were submitted, the master process would exit, circumventing the 48 hour limit.
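Written out by hand, the kind of chain I mean looks like this (the script names are made up; `--parsable` makes `sbatch` print only the job id):

```bash
# submit the first step and capture its job id
jid1=$(sbatch --parsable align.sh)
# the second step is queued immediately but only starts
# after the first finishes successfully
sbatch --parsable --dependency=afterok:${jid1} call_variants.sh
```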
Is this possible with `snakemake`, and if so, how does it work? I've found the `--immediate-submit` command line option, but I'm not sure whether it has the behavior I'm looking for, or how to use it, because my cluster prints `Submitted batch job [id]` after a job is submitted to the queue instead of just the job id.
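From the documentation snippets I've found, it sounds like `{dependencies}` in the `--cluster` command is filled with the space-separated job ids of a job's dependencies, and that `snakemake` reads a submitted job's id back from the first line of the submit command's stdout. So my untested guess is a wrapper like this (the name `immediate_submit.sh` and the argument handling are my own invention):

```bash
#!/usr/bin/env bash
# immediate_submit.sh -- untested sketch of a submit wrapper.
# Expected call from snakemake:
#   immediate_submit.sh <dep_id> <dep_id> ... <jobscript>
# where the ids come from {dependencies} and snakemake appends
# the jobscript as the last argument.

jobscript="${@: -1}"        # last argument: the job script
deps=("${@:1:$#-1}")        # everything before it: dependency job ids

if [ "${#deps[@]}" -gt 0 ]; then
    # build --dependency=afterok:id1:id2:...
    depflag="--dependency=afterok:$(IFS=:; echo "${deps[*]}")"
else
    depflag=""
fi

# --parsable prints only the job id, which snakemake needs on stdout
# to fill {dependencies} for the jobs that depend on this one
sbatch --parsable $depflag "$jobscript"
```

invoked with something like `snakemake --immediate-submit --cluster './immediate_submit.sh {dependencies}'`. Is a wrapper along these lines the intended way to use `--immediate-submit`, or am I misunderstanding how the dependencies are passed?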