What would be an elegant way of preventing snakemake from failing upon shell/R error?

Question

I would like to be able to have my snakemake workflows continue running even when certain rules fail.

For example, I'm using a variety of tools to perform peak-calling on ChIP-seq data. However, certain programs exit with an error when they are unable to identify any peaks. In such cases I would prefer to create an empty output file, as some peak-callers already do, rather than have snakemake fail.

Is there a snakemake-like way of handling such cases, using the "shell" and "run" keywords?

Thanks

You can always manually manage failures, by detecting them and generating dummy output. However, it becomes messy when there are downstream rules, that need to be adapted to the possibility of getting those dummy files as input (I've done that for pairs of computing/plotting rules, generating empty data files, then empty pdfs). It would be nice to have a way to tell snakemake that some output can be an empty file or with a default content, and automatically "propagate" this downstream. — bli, Jul 18 '19 at 14:48
If you don't need the empty files, snakemake can ignore failures with the `-k` flag. — ate50eggs, Dec 08 '20 at 23:10

tomkinsc · Accepted Answer · 2017-08-10T22:53:09.730

For shell commands, you can always take advantage of the conditional "or" operator, ||:

rule some_rule:
    output:
        "outfile"
    shell:
        """
        command_that_errors || true
        """

# or...

rule some_rule:
    output:
        "outfile"
    run:
        shell("command_that_errors || true")

Usually an exit code of zero (0) means success, and anything non-zero indicates failure. Including || true ensures a successful exit when the command exits with a non-zero exit code (true always returns 0).
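
The effect of `|| true` can be checked outside Snakemake. The following is a minimal sketch, assuming a POSIX shell in which the standard `false` and `true` utilities are available; `exit_code` is a hypothetical helper name used only for illustration.

```python
import subprocess

def exit_code(command: str) -> int:
    """Run a command line through the shell and return its exit code."""
    return subprocess.run(command, shell=True).returncode

# `false` always fails, so its exit code is non-zero.
plain = exit_code("false")
# The failure is replaced by the success of `true`.
masked = exit_code("false || true")
print(plain, masked)
```

Note that `|| true` masks every failure, not just the expected one, and it does not create the rule's output file by itself.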

If you need to allow a specific non-zero exit code, you can use shell or Python to check the code. For Python, it would be something like the following. The shlex.split() function is used so that shell commands do not need to be passed as arrays of arguments.

import shlex
import subprocess

rule some_rule:
    output:
        "outfile"
    run:
        # placeholders such as {output} are only filled in by shell(), so format the command here
        cmd = "command_that_errors {}".format(output[0])
        try:
            # no shell=True: shlex.split() already produces the argument list
            proc_output = subprocess.check_output(shlex.split(cmd))
        # an exception is raised by check_output() for non-zero exit codes (usually returned to indicate failure)
        except subprocess.CalledProcessError as exc:
            if exc.returncode == 2: # 2 is an allowed exit code
                # this exit code is OK
                pass
            else:
                # for all others, re-raise the exception
                raise
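
The same idea can be factored into a small helper that also guarantees the output file exists, since Snakemake reports an error when a declared output is missing after the job finishes. This is only a sketch under stated assumptions: `run_tolerating` and its parameters are hypothetical names, and the set of allowed exit codes (0 and 2 here) depends on the tool being wrapped.

```python
import subprocess
import sys
from pathlib import Path

def run_tolerating(args, output_path, allowed_codes=(0, 2)):
    """Run `args` (a list of arguments, no shell) and tolerate selected exit codes.

    Raises subprocess.CalledProcessError for any exit code not in
    `allowed_codes`. Otherwise, creates an empty placeholder at
    `output_path` if the command did not write it, and returns the exit code.
    """
    completed = subprocess.run(args)
    if completed.returncode not in allowed_codes:
        raise subprocess.CalledProcessError(completed.returncode, args)
    out = Path(output_path)
    if not out.exists():
        # The command finished with an accepted code but produced nothing.
        out.touch()
    return completed.returncode
```

Inside a `run:` block this could be called as `run_tolerating(["command_that_errors", output[0]], output[0])`. Downstream rules then need to cope with an empty input file.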

In shell script:

rule some_rule:
    output:
        "outfile"
    run:
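        # rc starts at 0 so a successful run exits cleanly; any code other than 2 is passed through unchanged
        shell("rc=0; command_that_errors {output} || rc=$?; if [[ $rc == 2 ]]; then exit 0; else exit $rc; fi")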

If your example rule `some_rule` fails, wouldn't the rule still fail because output file is missing? Or, is there supposed to be a command `touch {output}` there? — Manavalan Gajapathy, May 10 '19 at 02:32
I agree with @ManavalanGajapathy I would expect problems if the file is not created. What about `shell("command_that_errors || touch {output[0]}")`? — bli, Jul 18 '19 at 14:39
