
I am looking for a way to shut down/exit/halt a running Snakemake workflow programmatically - essentially from a Python function that is called in the workflow and may run into an unrecoverable error, requiring the workflow to stop for human intervention.

What I am actually trying to do: I start guppy basecaller jobs on GPU nodes and have to specify in the command which CUDA core to use. The function checks which lock files exist to determine which cores are in use and which are available. The files are created and removed as part of the shell command of the basecaller. Using a resource, the number of parallel GPU jobs is limited to the number of available cores. This works, but I want to be able to catch unexpected problems, e.g. if a gpu_lock file got removed or was not cleaned up.
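A minimal sketch of the lock-file check described above, assuming one lock file per core named `gpu_<index>.lock` in a single directory. The naming scheme and the function name `free_gpu_ids` are hypothetical and not taken from the actual workflow:

```python
from pathlib import Path


def free_gpu_ids(lock_dir, n_gpus):
    """Return the indices of GPUs that currently have no lock file.

    Assumes (hypothetically) that a running job holds the file
    <lock_dir>/gpu_<index>.lock for as long as it uses that GPU.
    """
    lock_dir = Path(lock_dir)
    return [
        i for i in range(n_gpus)
        if not (lock_dir / f"gpu_{i}.lock").exists()
    ]
```

An empty result while the resource limit says a slot should be free is the kind of inconsistency (a stale or missing lock file) that would call for stopping the workflow.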

The function is called in the workflow to specify a parameter, e.g. as the dummy below:


import sys

def get_fromel(wildcards):
  # some_number stands in for whatever runtime check decides to abort
  if some_number < 0.05:
    sys.exit("yieeeks")
  else:
    return "hiyaaa"

rule foo:
  input: "bar.txt"
  output: "baz.txt"
  params:
     fromel = get_fromel
  shell:
     "fizz -f {params.fromel} {input} > {output}"


Do I simply call sys.exit("my message")? I am worried that it will not clean up incomplete files, etc.

Wouter De Coster

2 Answers


UPDATE after Wouter's edits:

In my understanding, snakemake evaluates fromel = get_fromel during the DAG construction, so before starting any job. It also evaluates it when the job is actually executed but necessarily before executing the shell directive. Either way, there shouldn't be pending corrupted files.

In any case, I would write a toy snakefile to check snakemake behaves as you expect.


Maybe you should give more detail of what you are trying to achieve in order to get more precise answers. Anyway, if you put sys.exit() inside a rule, like this:

rule all:
    input:
        'done.txt',

rule one:
    output:
        'done.txt',
    run:
        import random
        shell('touch {output}')
        x = random.random()
        if x < 0.5:
            sys.exit('Exiting')

then the rule will exit (fail) programmatically if x < 0.5. Snakemake will let the other running jobs complete and will delete the files produced by rule one, treating them as potentially corrupted.

If on the other hand you put sys.exit() outside a rule, like:

rule all:
   ...
rule one:
   ...

sys.exit()

then the pipeline will not start at all because it will hit sys.exit before starting any job.

Note that you will need snakemake >= 7.15.2 to see errors produced by the run directive (because of a bug introduced around v7.8).

Also, sys.exit() will give you an ugly stacktrace. I vaguely remember a hack to work around it which I can try to dig out if that bothers you.

dariober
  • Thanks! I have updated my question with more information, I would call sys.exit() in a function called by the rule to set the params... – Wouter De Coster Oct 12 '22 at 11:44

Instead of calling sys.exit(), why not raise an Exception in your Python function and let Snakemake stop itself? That way, other running jobs will finish and Snakemake will clean up everything properly.

def get_fromel(wildcards):
  if some_number < 0.05:
    raise Exception("Something went wrong")
  else:
    return "hiyaaa"

rule foo:
  input: "bar.txt"
  output: "baz.txt"
  params: 
     fromel = get_fromel
  shell:
     "fizz -f {params.fromel} {input} > {output}"

In the example case you provided, if some_number is < 0.05, Snakemake will not even start the run. But if the check is dynamic and depends on files in your workflow, then this should work.
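As an illustration of a dynamic check, here is a sketch of a params function that inspects lock files each time it is called and raises when no core is free. The names `GpuLockError` and `make_device_picker` and the `gpu_<index>.lock` naming are assumptions made for this example, not part of Snakemake or of the original workflow:

```python
from pathlib import Path


class GpuLockError(RuntimeError):
    """Hypothetical error type: lock files are in a state that needs a human."""


def make_device_picker(lock_dir, n_gpus):
    """Build a function with the (wildcards) signature used for params."""
    lock_dir = Path(lock_dir)

    def pick(wildcards):
        # Re-scan the lock directory on every call, so the result
        # reflects the files present at that moment.
        for i in range(n_gpus):
            if not (lock_dir / f"gpu_{i}.lock").exists():
                return i
        # Raise instead of calling sys.exit(), as suggested above.
        raise GpuLockError(
            f"all {n_gpus} GPUs appear locked in {lock_dir}; "
            "check for stale lock files"
        )

    return pick
```

In a rule this might be used as `params: device = make_device_picker("gpu_locks", 2)`, where the directory name and GPU count are placeholders.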

D-Cru