All of my scripts write their output files to the current directory, that is, the directory the script was called from. In my shell pipeline I therefore used cd commands to move into a particular directory before running each command, so the output files ended up in the relevant directories. My scripts don't take an output directory parameter, and most of them derive the output file names from the input. That has worked pretty well for me.

Now I'm running into this output directory issue consistently, as Snakemake seems to write the files to the directory where the Snakefile is. I could modify all the scripts to take an additional parameter for the output directory, but modifying that many scripts is going to be a pain. Is there any way to specify where the output should go for each specific rule?

olala

4 Answers

One hack would be to cd into the output directory first, i.e. "cd $(dirname {output[0]})". This needs to be the first of your shell commands.

Having said this, it would be better to change the script to accept an output directory as an argument.
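
For illustration, the cd hack can be sketched in plain shell, outside Snakemake. Everything here is an assumption made for the demo: `script_stub` is a hypothetical stand-in for a script that writes to the current directory, and the file names live in a temporary directory. Inside a rule, `$out_file` would correspond to `{output[0]}`. Note that the input path is absolute; a path relative to the original working directory would no longer resolve after the `cd`.

```shell
# Hypothetical stand-in for a script that always writes to the current
# directory and derives its output name from the input name.
script_stub() {
    cp "$1" "./$(basename "$1").out"
}

# Demo layout in a temporary directory (all names are made up).
work=$(mktemp -d)
mkdir -p "$work/data" "$work/results"
echo "hello" > "$work/data/sample.txt"

in_file="$work/data/sample.txt"          # absolute, so it survives the cd
out_file="$work/results/sample.txt.out"  # plays the role of {output[0]}

# The hack: change into the output's directory first, then run the script.
cd "$(dirname "$out_file")"
script_stub "$in_file"

ls "$out_file"
```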

Andreas

Andreas
  • 716
  • 4
  • 14
3

Here is an example rule that I use in one of my snakefiles:

rule link_raw_data:
    output:
        OPJ(data_dir, "{lib}_{rep}.fastq.gz"),
    params:
        directory = data_dir,
        shell_command = lib2data,
    message:
        "Making link to raw data {output}."
    shell:
        """
        (
        cd {params.directory}
        {params.shell_command}
        )
        """

This is probably a bit different from your situation, but hopefully some of the techniques can help. In particular, note the parentheses in the shell section and the usage of a params section to define the output directory.
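
As a small illustration of what the parentheses do in plain shell (outside Snakemake; the scratch directory and file name are made up for the demo): commands inside the parentheses run in a subshell, so a `cd` there does not change the working directory of the commands that follow.

```shell
start_dir=$(pwd)
scratch=$(mktemp -d)   # throwaway directory, only for this demo

(
    # This cd only affects the subshell opened by the parentheses.
    cd "$scratch"
    touch made_in_subshell
)

# Back outside the parentheses, the working directory is unchanged.
after_dir=$(pwd)
echo "before: $start_dir"
echo "after:  $after_dir"
```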

I'm not sure I'm doing this in the most elegant way, but it works.

data_dir is a parameter read from a config file.

lib2data is a function that generates commands based on the values of some wildcards. I have to ensure that these commands use the correct input file paths, of course (and, in this case, that they produce the output consistently with what the output section says). In your case, you may simply have "hard-coded" shell commands, possibly using some of the rule's input.

More streamlined example

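rule run_script1:
    input:
        "path/to/initial/input"
    output:
        "script1_out/output1"
    shell:
        """
        # Remember the working directory so the relative input still resolves after cd.
        workdir=$PWD
        cd script1_out
        script1 "$workdir/{input}"
        """

rule run_script2:
    input:
        "script1_out/output1"
    output:
        "script2_out/output2"
    shell:
        """
        # Same pattern: run script2 from inside its own output directory.
        workdir=$PWD
        cd script2_out
        script2 "$workdir/{input}"
        """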

Starting from these examples, you can use functions of the wildcards in the input or output if necessary.

bli
  • Thanks. What do the parentheses mean in the shell section? – olala Dec 07 '16 at 19:25
  • Actually, I realize that in this context the parentheses are useless, because there are no other commands after them. Any commands after the closing parenthesis would run in the working directory as it was before the `cd`. – bli Dec 07 '16 at 20:48
  • You mean the parentheses group the commands inside into one block, so they are executed together and thus in params.directory, while commands outside the parentheses run in the working directory? – olala Dec 07 '16 at 21:19
  • Yes, that's how they could be useful, but my example is not relevant in this regard. – bli Dec 07 '16 at 21:52

From the Snakemake documentation:

"All paths in the snakefile are interpreted relative to the directory snakemake is executed in. This behaviour can be overridden by specifying a workdir in the snakefile:"

workdir: "path/to/workdir"

So just put that at the beginning of your Snakefile, and all inputs and outputs will be interpreted relative to this path.

Eric C.

You could try using a configuration file, in either YAML or JSON. Then use the directory as a parameter in your expand calls or in the input/output of your rules.

See the documentation here

rioualen
  • I don't think this will work, as I would still need to pass the parameter into the script, and my script doesn't take that parameter yet. – olala Dec 04 '16 at 20:57
  • You can use the parameter in the `shell` section, as in my answer: http://stackoverflow.com/a/40998525/1878788. – bli Dec 08 '16 at 17:25