
I am using some Python scripts with snakemake to automate a workflow. These scripts take command line arguments and, while I could replace them with snakemake.input[0], snakemake.output[0], etc., I am reluctant to do so since I'd also like to be able to use them outside of snakemake.

One natural way to solve this problem -- what I have been doing -- is to run them with the shell directive rather than the script directive. However, when I do this the dependency graph is broken: I update my script and the DAG doesn't think anything needs to be re-run.

Is there a way to pass command line arguments to scripts but still run them as a script?

Edit: an example

My python script

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("-o", type=str)
args = parser.parse_args()

with open(args.o, "w") as file:
    file.write("My file output")

My Snakefile

rule some_rule:
    output: "some_file_name.txt"
    shell: "python my_script.py -o {output}"
mbarete
  • I didn't know that `script` would get snakemake aware of changes in the script (I always use `shell` or a `shell()` call in `run`). Maybe you could make a double interface to your scripts, with exception handling to first `try` to access info from the `snakemake` object, and from the command-line in the `except` clause. – bli Apr 23 '20 at 07:45
  • 2
    Since you seem to be developing the script alongside the snakefile, can you add the python script as an input to the rule? Changes to the modification time of the python script will then trigger reruns of the rules. It's hacky but you can remove it when you are 'done' developing the python code. – Troy Comi Apr 23 '20 at 18:07

3 Answers


Based on the comment from @troy-comi, I've been doing the following which -- while a bit of a hack -- does exactly what I want. I define the script as an input to the snakemake rule, which can actually help readability as well. A typical rule (this is not a full MWE) might look like:

rule some_rule:
    input:
        files=expand("path_to_files/{f}", f=config["my_files"]),
        script="scripts/do_something.py"
    output: "path/to/my/output.txt"
    shell: "python {input.script} -i {input.files} -o {output}"

When I modify the script, it triggers a re-run; the rule is readable; and it doesn't require me to insert snakemake.output[0] in my Python scripts (which would make them hard to reuse outside this workflow).
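For reference, the script on the receiving end of that shell command could look roughly like the sketch below. It is a hypothetical stand-in for scripts/do_something.py: the -i and -o flags mirror the rule above, but the function names and the concatenation step are placeholders made up for illustration, not the actual script.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the -i/-o flags used in the rule's shell command."""
    parser = argparse.ArgumentParser(
        description="Hypothetical stand-in for scripts/do_something.py"
    )
    # nargs="+" because {input.files} expands to a space-separated list of paths.
    parser.add_argument("-i", nargs="+", required=True, help="one or more input files")
    parser.add_argument("-o", required=True, help="output file")
    return parser.parse_args(argv)


def do_something(inputs, output):
    """Placeholder work (an assumption): concatenate the inputs into the output."""
    with open(output, "w") as out:
        for path in inputs:
            out.write(Path(path).read_text())


def main(argv=None):
    # argv=None makes argparse read sys.argv[1:], i.e. the real command line.
    args = parse_args(argv)
    do_something(args.i, args.o)
```

To use this as a script, call main() at the bottom of the file under the usual `if __name__ == "__main__":` guard. Nothing in it refers to snakemake, so it runs the same way inside and outside the workflow.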

mbarete

Could you do something as simple as an if statement to get the parameters from snakemake or from the command line?

That is:

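if "snakemake" in globals():
    # Run through Snakemake's script directive: read paths from the injected object
    outfile = snakemake.output[0]
else:
    # Run standalone: fall back to the command line (sys.argv here; argparse works too)
    import sys
    outfile = sys.argv[1]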
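A slightly fuller sketch of the same idea, applied to the script from the question, might look like this. It relies on Snakemake injecting an object named snakemake into the script's global namespace when the file is run through the script directive; the helper names resolve_output and main are hypothetical, and the namespace is passed in explicitly only so the logic is easy to exercise without Snakemake.

```python
import argparse


def resolve_output(namespace, argv=None):
    """Return the output path from Snakemake if available, else from -o."""
    smk = namespace.get("snakemake")
    if smk is not None:
        # Present only when the file is run via Snakemake's script directive.
        return smk.output[0]
    # Standalone use: same -o flag as the original script in the question.
    parser = argparse.ArgumentParser()
    parser.add_argument("-o", type=str, required=True)
    return parser.parse_args(argv).o


def main(namespace, argv=None):
    """Write the example output to whichever path was resolved."""
    with open(resolve_output(namespace, argv), "w") as file:
        file.write("My file output")
```

At the bottom of the script, call main(globals()) under the usual `if __name__ == "__main__":` guard. The rule can then use the script directive, while `python my_script.py -o out.txt` keeps working from a terminal.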
rictuar

It sounds like what you need is argparse in your Python script. Here is an example in which the Python script accepts its arguments via the command line:

  • Python script example.py
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("-i", "--infile", help="Filepath")
parser.add_argument("-o", "--outfile", help="Filepath")
args = parser.parse_args()

infilepath = args.infile
outfilepath = args.outfile
# blah blah code
  • Snakefile
rule xx:
    input: "in.txt"
    output: "out.txt"
    shell: "python example.py -i {input} -o {output}"

PS - When I'm lazy, I just use the Fire library instead of argparse. Fire exposes functions and classes to the command line with only a few lines of code.

Manavalan Gajapathy