I'm relatively new to Snakemake, and I'm having trouble figuring out how to count the number of jobs per rule. The Snakefile I am using is below:
rule test:
    input:
        files = expand("{file}", file=glob.glob("/home/MyData/input/*.csv"))
    output:
        out = expand("{file}", file=glob.glob("/home/MyData/output/*.csv"))
    run:
        with open(output.out, 'r') as input_stream:
            for file in input_stream:
                print(file)
The Job counts output shows the following (when run with snakemake -j 4 test -n):
Job counts:
    count   jobs
    1       test
    1
However, in a Snakemake tutorial I found online (link here), the Snakefile looks like this:
configfile: "config.yaml"

rule all:
    input:
        "plots/quals.svg",
        "calls/all.vcf",
        "mapped/",
        "mapped/"

rule map_reads:
    input:
        "data/genome.fa",
        "data/samples/{sample}.fastq"
    output:
        pipe("mapped/{sample}.bam")
    conda:
        "envs/mapping.yaml"
    shell:
        "bwa mem {input} | samtools view -Sb > {output}"

rule sort:
    input:
        "mapped/{sample}.bam"
    output:
        "mapped/{sample}.sorted.bam"
    conda:
        "envs/mapping.yaml"
    shell:
        "samtools sort -o {output} {input}"

rule call:
    input:
        genome="data/genome.fa",
        bam=expand("mapped/{sample}.sorted.bam", sample=config["samples"])
    output:
        "calls/all.vcf"
    conda:
        "envs/calling.yaml"
    shell:
        "samtools mpileup -g -f {input.genome} {input.bam} | "
        "bcftools call -mv - > {output}"

rule plot_qual:
    input:
        "calls/all.vcf"
    output:
        svg=report("plots/quals.svg", caption="report/plot-quals.rst")
    conda:
        "envs/stats.yaml"
    script:
        "scripts/plot-quals.py"
And the Job counts output looks like this (when run with snakemake -j 4 all -n):
Job counts:
    count   jobs
    1       all
    1       call
    3       map_reads
    1       plot_qual
    3       sort
    9
With the config.yaml file looking like:
samples:
  - A
  - B
  - C
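As far as I can tell, the count of 3 for map_reads and sort comes from the expand() call in rule call, which requests one sorted BAM file per entry in samples. Below is a minimal plain-Python sketch of that expansion. Note that expand_sketch is a hypothetical stand-in I wrote for illustration using str.format; it is not Snakemake's own expand() implementation.

```python
# Plain-Python imitation of what expand() appears to do for a single
# {sample} wildcard. expand_sketch is a hypothetical helper, not part
# of Snakemake.

samples = ["A", "B", "C"]  # same values as the samples list in config.yaml


def expand_sketch(pattern, sample):
    """Fill the {sample} placeholder in pattern once per value in sample."""
    return [pattern.format(sample=s) for s in sample]


# Mirrors: expand("mapped/{sample}.sorted.bam", sample=config["samples"])
sorted_bams = expand_sketch("mapped/{sample}.sorted.bam", sample=samples)

for path in sorted_bams:
    print(path)

# One requested file per sample, which would match the 3 sort jobs.
print(len(sorted_bams), "files requested")
```

If that reading is right, each of the three requested files is produced by its own run of the sort rule (and, in turn, of map_reads), whereas my test rule has no wildcards in its output and so only ever gives one job.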
How can I get my Job counts output to show one job per input file for each rule?