Questions tagged [snakemake]

Snakemake is a workflow management system with a Python-style specification language.

1634 questions
3
votes
1 answer

snakemake - how to make a list of input files based on a previous rule that produces a variable number of files

Say, I am starting with a bunch of files like these: group_1_in.txt, group_2_in.txt, group_3_in.txt I process them using a rule that generates the directory structure shown below. rule process_group_files: input: 'group_{num}_in.txt' …
vkkodali
  • 630
  • 7
  • 18
3
votes
1 answer

Snakemake: how to realize a mechanism to copy input/output files to/from tmp folder and apply rule there

We use Slurm workload manager to submit jobs to our high performance cluster. During runtime of a job, we need to copy the input files from a network filesystem to the node's local filesystem, run our analysis there and then copy the output files…
Matthias Munz
  • 3,583
  • 4
  • 30
  • 47
3
votes
0 answers

Run parallelized snakemake jobs in a specific order

I'm currently building a snakemake pipeline, where I need to: 1 - Split reference genomes according to chromosome. 2 - Do an irrelevant operation on them. 3 - Merge the chromosomes back into a full assembly. Here is a snippet of what's happening…
Oneiros
  • 291
  • 3
  • 6
3
votes
1 answer

snakemake temporary directories

snakemake deletes all output files that are marked temporary but does not do anything to the files if the output is a directory as shown below: rule all: input: 'final.txt', checkpoint split_big_file: input: 'bigfile.txt' …
vkkodali
  • 630
  • 7
  • 18
3
votes
2 answers

How to specify snakemake wrapper with relative path?

I am trying to use some wrappers that I define in my pipeline directory (same directory as the Snakefile) rather than the wrapper repo. I have looked at the docs for this and it works fine when I use an absolute path, but I can't get a relative path to…
vantom
  • 454
  • 4
  • 11
3
votes
1 answer

How to refer to executable inside anaconda environment in Snakemake

I'm using vcf2maf to annotate variants as part of a snakemake pipeline rule vcf2maf: input: vcf="vcfs/{sample}.vcf", fasta=vep_fasta, vep_dir=vep_dir output: "mafs/{sample}.maf" conda: …
Tomas Bencomo
  • 349
  • 1
  • 9
3
votes
1 answer

Snakemake: Can you expand on two dependent variables?

I'm running associations for a list of genes and markers. I have a list of genes genes = ['gene1', 'gene2', ...] and a dictionary where the keys are gene names and the values are lists of markers that I want to associate with that gene, i.e. markers…
kreld
  • 732
  • 1
  • 5
  • 16
3
votes
0 answers

Is there a way to create multiple reports (one for each sample, say) in a Snakemake workflow

I'm developing a snakemake pipeline to QC a large set of data. I'd like to generate a set of plots for each dataset and then generate an HTML report that combines the plots with some text. Looking at…
3
votes
0 answers

Snakemake & singularity, making different mounts available to each rule

I have been using singularity with some of my workflows and it works great so far. I have a question about binding directories. I can pass singularity arguments when running the snakemake workflow like: snakemake --use-singularity --singularity-args…
Jon Chung
  • 145
  • 5
3
votes
1 answer

Snakemake - rule that downloads data

I am having some trouble implementing a pipeline in which the first step is downloading the data from some server. As far as I understand, all rules must have inputs which are files. However, in my case the "input" is an ID string given to a script…
soungalo
  • 1,106
  • 2
  • 19
  • 34
3
votes
2 answers

Snakemake removes underscores in --config values?

I am trying to pass a --config argument called samples to the snakemake function. It seems no matter how I pass it all the underscores are removed? Any suggestions, or is there something I am doing wrong? snakemake -s snakefile.py all --configfile…
bb8
  • 190
  • 1
  • 10
3
votes
2 answers

Snakemake: Cluster multiple jobs together

I have a pretty simple snakemake pipeline that takes an input file and does three subsequent steps to produce one output. Each individual job is very quick. Now I want to apply this pipeline to >10k files on an SGE cluster. Even if I use group to have…
Jonas
  • 1,639
  • 1
  • 18
  • 29
3
votes
1 answer

Snakemake always rebuilds targets, even when up to date

I'm new to snakemake and running into some behavior I don't understand. I have a set of fastq files with file names following the standard Illumina convention: SAMPLENAME_SAMPLENUMBER_LANE_READ_001.fastq.gz in a directory reads/raw_fastq. I'd like…
3
votes
1 answer

ChildIOException when trying to make directories for workflow [Snakemake]

I'm trying to make a simple way to create all of the sub-directories needed for the workflow in one rule. However, I'm getting a ChildIOException which makes no sense to me whenever I try to execute a rule that creates all of the required…
CelineDion
  • 906
  • 5
  • 21
3
votes
1 answer

Snakemake - dynamically derive the targets from input files

I have a large number of input files organized like this: data/ ├── set1/ │ ├── file1_R1.fq.gz │ ├── file1_R2.fq.gz │ ├── file2_R1.fq.gz │ ├── file2_R2.fq.gz | : │ └── fileX_R2.fq.gz ├── another_set/ │ ├── asdf1_R1.fq.gz │ ├──…
bgbrink
  • 643
  • 1
  • 6
  • 23