5

Is there a way to reuse a rule in snakemake changing only the params?

For instance:

rule job1:
    ...
    params:
        reference = "path/to/ref1"
    ...

rule job2:
    input: rules.job1.output
    ...
    params:
        reference = "path/to/ref2"

The job1 and job2 rules do the same thing, but I need to call them successively, and the reference parameter has to change between the two. This generates a lot of code for very similar tasks.

I tried to make a sub-workflow for this step, and the main Snakefile is more readable. However, the code in the sub-workflow is still repeated.

Any idea or suggestion? Did I miss something?

EDIT
To be more specific, job2 has to be executed after job1, using the output of the latter.

glihm

3 Answers

3

If the rule is the same, you could just use wildcards in the naming of the output files. This way, the same rule will be executed twice:

references = ["ref1", "ref2"]

rule all:
  input: expand("my_output_{ref}", ref=references)

rule job:
    input: "my_input"
    output: "my_output_{ref}"
    params: ref = "path/to/{ref}"
    shell: "... {params.ref} {output}"
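
To see why the single rule ends up running once per reference, it can help to look at the targets that `rule all` requests. The snippet below is only a rough plain-Python illustration of what `expand` produces for one wildcard; it is not Snakemake's actual implementation, and the variable names `output_pattern` and `targets` are made up for this sketch.

```python
references = ["ref1", "ref2"]
output_pattern = "my_output_{ref}"  # same pattern as the output of rule job

# Plain-Python stand-in for expand("my_output_{ref}", ref=references):
# one concrete file name per reference.
targets = [output_pattern.format(ref=ref) for ref in references]

print(targets)  # ['my_output_ref1', 'my_output_ref2']
```

Each of these file names matches the output pattern of `rule job` with a different value of `{ref}`, so Snakemake schedules one job per name.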

Hope this helps. If not, could you maybe make your question more specific?

Edit

Okay, it is possible to define custom inputs for a rule by using a Python function. Maybe you could look in this direction. See this working example:

references = ["ref1", "ref2"]

rule all:
  input: expand("my_output_{ref}", ref=references)


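def rule_input(wildcards):
    # ref1 starts from the raw input; ref2 consumes the output of the ref1 run.
    if wildcards.ref == "ref1":
        return "my_first_input"
    elif wildcards.ref == "ref2":
        return "my_output_ref1"
    raise ValueError(f"unexpected reference: {wildcards.ref}")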

rule job:
    input: rule_input
    output: "my_output_{ref}"
    params: ref = "path/to/{ref}"
    shell: "echo input: {input} ; echo output: {output} ; touch {output}"
rioualen
  • Thanks for the reply. For sure the `expand` keyword can be used that way, but it assumes that both executions of the `job` rule take the same input. I edited my post to be more precise, as you suggested. ;) – glihm Dec 07 '16 at 08:55
  • It is possible to use a Python function to define custom inputs in a rule, so maybe you can look in this direction. – rioualen Dec 07 '16 at 11:37
  • Your solution works like a charm! I knew about input functions, but I didn't think this kind of combination could work. Thank you! – glihm Dec 07 '16 at 11:51
  • Glad I could help :-) Good luck! – rioualen Dec 07 '16 at 13:26
2

Snakemake does not support inheritance between rules. So, if you cannot use wildcards for your case, one option is to use delegation. You predefine the constant parts of the two rules into variables, and then refer to these from the rule bodies. Another option might be input/params functions (see docs and tutorial) and only one rule definition. This way, you can have completely different input files, following some logic based on the value of e.g. a wildcard.
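
As a minimal sketch of the input/params function idea, assuming the runs are chained in a fixed order, the functions below pick the input and the reference path from the wildcard value. All names here (`REFERENCES`, `FIRST_INPUT`, `OUTPUT_PATTERN`, `chained_input`, `reference_path`) are hypothetical and chosen for illustration; a `SimpleNamespace` stands in for Snakemake's wildcards object so the logic can be tried outside a workflow.

```python
from types import SimpleNamespace

# Hypothetical names for illustration only.
REFERENCES = ["ref1", "ref2"]        # order in which the runs are chained
FIRST_INPUT = "my_first_input"       # input of the first run
OUTPUT_PATTERN = "my_output_{ref}"   # same pattern as the rule's output


def chained_input(wildcards):
    """Return the raw input for the first reference, else the previous run's output."""
    position = REFERENCES.index(wildcards.ref)  # raises ValueError for unknown refs
    if position == 0:
        return FIRST_INPUT
    return OUTPUT_PATTERN.format(ref=REFERENCES[position - 1])


def reference_path(wildcards):
    """Params function: derive the reference path from the wildcard value."""
    return f"path/to/{wildcards.ref}"


# Outside Snakemake, a SimpleNamespace can stand in for the wildcards object.
for ref in REFERENCES:
    w = SimpleNamespace(ref=ref)
    print(ref, chained_input(w), reference_path(w))
```

In a Snakefile, a single rule could then declare `input: chained_input` and `params: ref = reference_path`, and adding a third reference would only require extending the list.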

Johannes Köster
2

In case someone finds this 6 years later like I did: Snakemake version 6 and above supports rule inheritance, which makes the solution to the OP's problem quite simple:

use rule job1 as job2 with:
    input: rules.job1.output
    ...
    params:
        reference = "path/to/ref2"
trutheality