Essentially, I want to know what the recomended way of handling equivalent file extensions is in snakemake. For example, lets say I have a rule that counts the number of entries in a fasta file. The rule might look something like....
rule count_entries:
input:
["{some}.fasta"]
output:
["{some}.entry_count"]
shell:
'grep -c ">" {input[0]} > {output[0]}'
This works great. But what if I want this rule to also permit "{some}.fa" as input?
Is there any clean way to do this?
EDIT:
Here is my best guess at the first proposed sollution. This can probably be turned into a higher order function to be more general purpose but this is the basic idea as I understand it. I don't think this idea really fits any general use case though as it doesn't cooperate with other rules at the "building DAG" stage.
import os
def handle_ext(wcs):
base = wcs["base"]
for file_ext in [".fasta", ".fa"]:
if(os.path.exists(base + file_ext)):
return [base + file_ext]
rule count_entries:
input:
handle_ext
output:
["{base}.entry_count"]
shell:
'grep -c ">" {input[0]} > {output[0]}'
EDIT2: Here is the best current sollution as I see it...
count_entries_cmd = 'grep -c ">" {input} > {output}'
count_entries_output = "{some}.entry_count"
rule count_entries_fasta:
input:
"{some}.fasta"
output:
count_entries_output
shell:
count_entries_cmd
rule count_entries_fa:
input:
"{some}.fa"
output:
count_entries_output
shell:
count_entries_cmd