I'm trying to tie scripts from an existing pipeline on docker into my snakemake pipeline. I have the docker pipeline set up using singularity and it works. For instance,

singularity exec docker://mypipeline some_command.sh file.bam out_file.bam

works perfectly when I run it interactively on the command line. Similarly, when I incorporate the exact same command into my Snakefile it also works:

rule myrule:
    input:
        "file.bam"
    output:
        "out_file.bam"
    shell:
        "singularity exec docker://mypipeline some_command.sh {input} {output}"

However, when I follow this tutorial https://reproducibility.sschmeier.com/container/index.html#using-a-container-in-our-workflow and declare the container in my workflow as follows

singularity: "docker://mypipeline"

rule myrule:
    input:
        "file.bam"
    output:
        "out_file.bam"
    shell:
        "some_command.sh {input} {output}"

and run snakemake -p --use-singularity --cores 1, I get the following output:

Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job counts:
        count   jobs
        1       myrule
        1

[Sun May 17 15:28:11 2020]
rule myrule:
    input: file.bam
    output: out_file.bam
    jobid: 0

some_command.sh file.bam out_file.bam
Activating singularity image myImage.simg

Then I get a very long report that I'm not sure what to make of, followed by this error message

Waiting at most 5 seconds for missing files.
MissingOutputException in line 3 of Snakefile:
Job completed successfully, but some output files are missing. Missing files after 5 seconds:
out_file.bam
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2020-05-17T152810.484310.snakemake.log

My questions:

  • Why does one work and not the other, and how can I get the last example to work?
  • Is it good practice to declare singularity: "docker://..." up front, or does it not matter?
Delete

1 Answer

The error message suggests that the singularity command executed successfully but snakemake doesn't see the output file. Is the output file out_file.bam shown in your code the same as the one you actually use, or did you remove part of the file path? I would suggest adding the --verbose flag to snakemake and reviewing the actual singularity command that snakemake executes.
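One way to check where the output actually ends up is to compare the contents of the working directory before and after the command runs. The following is only a minimal Python sketch of that idea; `snapshot` and `files_created_by` are hypothetical helper names (not part of snakemake or singularity), and skipping the `.snakemake` directory is an assumption made to keep snakemake's own bookkeeping files out of the result.

```python
from pathlib import Path
from typing import Callable, Set


def snapshot(directory: str) -> Set[str]:
    """Return the relative paths of all files currently under `directory`.

    Files inside `.snakemake` are skipped (an assumption of this sketch),
    since snakemake writes its own logs and metadata there.
    """
    root = Path(directory)
    found = set()
    for path in root.rglob("*"):
        relative = path.relative_to(root)
        if path.is_file() and relative.parts[0] != ".snakemake":
            found.add(str(relative))
    return found


def files_created_by(directory: str, run: Callable[[], None]) -> Set[str]:
    """Call `run()` and return the files that newly appeared under `directory`."""
    before = snapshot(directory)
    run()
    return snapshot(directory) - before
```

For example, passing a callable that runs the command from the question, such as `lambda: subprocess.run(["singularity", "exec", "docker://mypipeline", "some_command.sh", "file.bam", "out_file.bam"], check=True)`, would list whatever files the pipeline wrote, under whatever names it chose.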

Manavalan Gajapathy
  • I did remove some file paths just to make it easier to read. The paths are correct though. I believe I figured out the source of the error: the pipeline I'm using appears to add a prefix to the file that I'm creating, and I have not been adding this prefix to the output file name (hence the missing output files). Still troubleshooting this. I'm still curious as to why I don't get the error when I add `singularity exec docker://mypipeline...` in front though, any ideas? – Delete May 22 '20 at 07:21
  • Have you checked how the singularity command executed by snakemake (obtained by using the `--verbose` flag) compares to your own `singularity exec` command? That would be my first step to see how/if output filepaths differ. – Manavalan Gajapathy May 22 '20 at 13:53
  • Yes, both are the same as far as I can tell. This is what -p returns: `some_command.sh file.bam out_file.bam` and this is what the --verbose flag returns: `singularity exec --home /mywd --bind /home/miniconda3/envs/myenv/lib/python3.8/site-packages:/mnt/snakemake /mywd/.snakemake/singularity/myImage.simg bash -c 'set -euo pipefail; some_command.sh file.bam out_file.bam'` – Delete May 23 '20 at 06:03
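The two invocations quoted in this thread can be compared piece by piece by splitting each into its options, image, and payload. This is only a rough sketch: `split_exec_command` is a hypothetical helper, and it assumes that `--home` and `--bind` (the two options visible in the `--verbose` output) are the only options that take a value.

```python
import shlex

# Options followed by a value. Only the two seen in the --verbose output are
# listed; this is an assumption of the sketch, not a full list of options.
OPTIONS_WITH_VALUE = {"--home", "--bind"}


def split_exec_command(command):
    """Split a `singularity exec` command line into (options, image, payload)."""
    tokens = shlex.split(command)
    if tokens[:2] != ["singularity", "exec"]:
        raise ValueError("expected a command starting with 'singularity exec'")
    options = {}
    i = 2
    # Consume leading options until the first non-option token (the image).
    while i < len(tokens) and tokens[i].startswith("--"):
        if tokens[i] in OPTIONS_WITH_VALUE:
            options[tokens[i]] = tokens[i + 1]
            i += 2
        else:
            options[tokens[i]] = None
            i += 1
    image = tokens[i]
    payload = tokens[i + 1:]
    return options, image, payload


manual = "singularity exec docker://mypipeline some_command.sh file.bam out_file.bam"
wrapped = (
    "singularity exec --home /mywd "
    "--bind /home/miniconda3/envs/myenv/lib/python3.8/site-packages:/mnt/snakemake "
    "/mywd/.snakemake/singularity/myImage.simg "
    "bash -c 'set -euo pipefail; some_command.sh file.bam out_file.bam'"
)

for name, command in (("manual", manual), ("wrapped", wrapped)):
    options, image, payload = split_exec_command(command)
    print(name, options, image, payload)
```

Laid out this way, the snakemake-generated command differs from the manual one in its `--home` and `--bind` options, in using the local image file rather than the `docker://` URI, and in wrapping the script in `bash -c` with `set -euo pipefail`.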