Suppose I have the following files, which I want to process automatically using Snakemake:
test_input_C_1.txt
test_input_B_2.txt
test_input_A_2.txt
test_input_A_1.txt
The following Snakefile uses expand to list all the potential final result files:
rule all:
    input: expand("test_output_{text}_{num}.txt", text=["A", "B", "C"], num=[1, 2])

rule make_output:
    input: "test_input_{text}_{num}.txt"
    output: "test_output_{text}_{num}.txt"
    shell:
        """
        md5sum {input} > {output}
        """
Executing the above snakefile results in the following error:
MissingInputException in line 4 of /tmp/Snakefile:
Missing input files for rule make_output:
test_input_B_1.txt
The reason for that error is that expand uses itertools.product under the hood to generate the wildcard combinations, some of which correspond to input files that do not exist.
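To make the mismatch concrete, here is a minimal sketch in plain Python (not Snakemake itself) that mimics what I understand expand to be doing. The names `existing`, `requested` and `missing` are only for illustration, and the set of existing files is hard-coded from the listing above rather than read from disk.

```python
from itertools import product

# The four input files that actually exist (copied from the listing above).
existing = {
    "test_input_C_1.txt",
    "test_input_B_2.txt",
    "test_input_A_2.txt",
    "test_input_A_1.txt",
}

texts = ["A", "B", "C"]
nums = [1, 2]

# Full Cartesian product of the wildcard values: 3 * 2 = 6 combinations.
requested = [
    f"test_input_{text}_{num}.txt" for text, num in product(texts, nums)
]

# Combinations for which no input file exists.
missing = [name for name in requested if name not in existing]
print(missing)  # ['test_input_B_1.txt', 'test_input_C_2.txt']
```

So six inputs are requested but only four exist. Snakemake's error message reports test_input_B_1.txt, and test_input_C_2.txt is absent from my file listing as well.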
How can I filter out the undesired wildcard combinations?