I am following the tutorial in the documentation (https://snakemake.readthedocs.io/en/stable/tutorial/advanced.html) and am stuck on the "Step 4: Rule parameters" exercise. I would like to use a wildcard in my params directive to look up a float from my config file.

I get the same error whenever I run snakemake -np on the command line:

InputFunctionException in line 46 of /mnt/c/Users/Matt/Desktop/snakemake-tutorial/Snakefile:
Error:
  AttributeError: 'Wildcards' object has no attribute 'sample'
Wildcards:

Traceback:
  File "/mnt/c/Users/Matt/Desktop/snakemake-tutorial/Snakefile", line 14, in get_bcftools_call_priors

This is my code so far:

import time
configfile: "config.yaml"

rule all:
    input:
        "plots/quals.svg"

def get_bwa_map_input_fastqs(wildcards):
    print(wildcards.__dict__, 1, time.time()) #I have this print as a check
    return config["samples"][wildcards.sample]

def get_bcftools_call_priors(wildcards):
    print(wildcards.__dict__, 2, time.time()) #I have this print as a check
    return config["prior_mutation_rates"][wildcards.sample]

rule bwa_map:
    input:
        "data/genome.fa",
        get_bwa_map_input_fastqs
        #lambda wildcards: config["samples"][wildcards.sample]
    output:
        "mapped_reads/{sample}.bam"
    params:
        rg=r"@RG\tID:{sample}\tSM:{sample}"
    threads: 2
    shell:
        "bwa mem -R '{params.rg}' -t {threads} {input} | samtools view -Sb - > {output}"

rule samtools_sort:
    input:
        "mapped_reads/{sample}.bam"
    output:
        "sorted_reads/{sample}.bam"
    shell:
        "samtools sort -T sorted_reads/{wildcards.sample} "
        "-O bam {input} > {output}"

rule samtools_index:
    input:
        "sorted_reads/{sample}.bam"
    output:
        "sorted_reads/{sample}.bam.bai"
    shell:
        "samtools index {input}"

rule bcftools_call:
    input:
        fa="data/genome.fa",
        bam=expand("sorted_reads/{sample}.bam", sample=config["samples"]),
        bai=expand("sorted_reads/{sample}.bam.bai", sample=config["samples"])
        #prior=get_bcftools_call_priors
    params:
        prior=get_bcftools_call_priors
    output:
        "calls/all.vcf"
    shell:
        "samtools mpileup -g -f {input.fa} {input.bam} | "
        "bcftools call -P {params.prior} -mv - > {output}"

rule plot_quals:
    input:
        "calls/all.vcf"
    output:
        "plots/quals.svg"
    script:
        "scripts/plot-quals.py"

And here is my config.yaml:

samples:
  A: data/samples/A.fastq
  #B: data/samples/B.fastq
  #C: data/samples/C.fastq

prior_mutation_rates:
  A: 1.0e-4
  #B: 1.0e-6

I don't understand why the wildcards object passed to my function in bcftools_call has no sample attribute, while an almost identical function in bwa_map receives exactly the attribute I want. From the documentation it seems the wildcards should be propagated before anything is run, so why is sample missing?

This is the full output of the command-line call snakemake -np:

{'_names': {'sample': (0, None)}, '_allowed_overrides': ['index', 'sort'], 'index': functools.partial(<function Namedlist._used_attribute at 0x7f91b1a58f70>, _name='index'), 'sort': functools.partial(<function Namedlist._used_attribute at 0x7f91b1a58f70>, _name='sort'), 'sample': 'A'} 1 1628877061.8831172
Job stats:
job               count    min threads    max threads
--------------  -------  -------------  -------------
all                   1              1              1
bcftools_call         1              1              1
bwa_map               1              1              1
plot_quals            1              1              1
samtools_index        1              1              1
samtools_sort         1              1              1
total                 6              1              1


[Fri Aug 13 10:51:01 2021]
rule bwa_map:
    input: data/genome.fa, data/samples/A.fastq
    output: mapped_reads/A.bam
    jobid: 4
    wildcards: sample=A
    resources: tmpdir=/tmp

bwa mem -R '@RG\tID:A\tSM:A' -t 1 data/genome.fa data/samples/A.fastq | samtools view -Sb - > mapped_reads/A.bam

[Fri Aug 13 10:51:01 2021]
rule samtools_sort:
    input: mapped_reads/A.bam
    output: sorted_reads/A.bam
    jobid: 3
    wildcards: sample=A
    resources: tmpdir=/tmp

samtools sort -T sorted_reads/A -O bam mapped_reads/A.bam > sorted_reads/A.bam

[Fri Aug 13 10:51:01 2021]
rule samtools_index:
    input: sorted_reads/A.bam
    output: sorted_reads/A.bam.bai
    jobid: 5
    wildcards: sample=A
    resources: tmpdir=/tmp

samtools index sorted_reads/A.bam

[Fri Aug 13 10:51:01 2021]
rule bcftools_call:
    input: data/genome.fa, sorted_reads/A.bam, sorted_reads/A.bam.bai
    output: calls/all.vcf
    jobid: 2
    resources: tmpdir=/tmp

{'_names': {}, '_allowed_overrides': ['index', 'sort'], 'index': functools.partial(<function Namedlist._used_attribute at 0x7f91b1a58f70>, _name='index'), 'sort': functools.partial(<function Namedlist._used_attribute at 0x7f91b1a58f70>, _name='sort')} 2 1628877061.927639
InputFunctionException in line 46 of /mnt/c/Users/Matt/Desktop/snakemake-tutorial/Snakefile:
Error:
  AttributeError: 'Wildcards' object has no attribute 'sample'
Wildcards:

Traceback:
  File "/mnt/c/Users/Matt/Desktop/snakemake-tutorial/Snakefile", line 14, in get_bcftools_call_priors

If anyone knows what is going wrong, I would really appreciate an explanation. If there is a better way of getting information out of config.yaml and into the different directives, I would appreciate those tips as well.

Edit: I have searched around the internet quite a bit, but have yet to understand this issue.

  • I agree with the answer already given: wildcards are defined based on output file name patterns, but `bcftools_call` doesn't have the `{sample}` pattern in its output. In case it is useful, I have written some explanations of wildcards and other Snakemake mechanisms here https://stackoverflow.com/a/50216057/1878788 and here https://bitbucket.org/blaiseli/snakemake/src/f11247997a378c48fe0f1dc4f921f0cb64e19a37/docs/snakefiles/understanding.rst?at=doc_contrib&fileviewer=file-view-default – bli Aug 18 '21 at 07:53

1 Answer

Wildcards for each rule are based on that rule's output file(s). The rule bcftools_call has one output file (calls/all.vcf), which has no wildcards. Because of this, when get_bcftools_call_priors is called, it throws an exception when it tries to access the unset wildcards.sample attribute.
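To make the point above concrete, here is a minimal Python sketch of the idea, not Snakemake's actual implementation. The helper match_wildcards is hypothetical: it turns an output pattern into a regular expression and matches it against a requested file, which is roughly why bwa_map sees sample=A while bcftools_call sees nothing.

```python
import re

def match_wildcards(output_pattern, target):
    """Hypothetical helper: return the wildcard values that make
    output_pattern equal to target, or None if it cannot match."""
    # re.split with a capturing group puts the {name} contents at odd indices.
    parts = re.split(r"\{(\w+)\}", output_pattern)
    regex = ""
    seen = set()
    for i, part in enumerate(parts):
        if i % 2 == 0:
            # Literal text between placeholders.
            regex += re.escape(part)
        elif part in seen:
            # A repeated wildcard must take the same value again.
            regex += f"(?P={part})"
        else:
            seen.add(part)
            regex += f"(?P<{part}>.+)"
    match = re.fullmatch(regex, target)
    return match.groupdict() if match else None

# The output pattern of bwa_map contains {sample}, so a value can be inferred.
bwa_map_wildcards = match_wildcards("mapped_reads/{sample}.bam", "mapped_reads/A.bam")

# The output of bcftools_call has no placeholder, so there is nothing to infer.
bcftools_call_wildcards = match_wildcards("calls/all.vcf", "calls/all.vcf")
```

In this sketch bwa_map_wildcards holds a value for sample and bcftools_call_wildcards is empty, which mirrors the two printed wildcards objects in the question's output. The expand(...) calls in the input of bcftools_call do not change this, because they are resolved to plain file names before any matching happens.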

You should probably set a global prior_mutation_rate in your config file and then access that in the bcftools_call rule:

rule bcftools_call:
    ...
    params:
        prior=config["prior_mutation_rate"],
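For this to work, config.yaml needs a top-level prior_mutation_rate entry instead of the per-sample prior_mutation_rates mapping. The following Python sketch shows the lookup under that assumption. The dictionary is a stand-in for what configfile: "config.yaml" would load, and the value 1.0e-4 is simply carried over from sample A in the question.

```python
# Stand-in for the loaded config (assumption: a single global key
# named prior_mutation_rate, as suggested in the answer).
config = {
    "samples": {"A": "data/samples/A.fastq"},
    "prior_mutation_rate": 1.0e-4,
}

# A plain dictionary lookup: no wildcards object is involved, so it works
# even though the rule's output (calls/all.vcf) contains no wildcard.
prior = config["prior_mutation_rate"]

# Roughly the fragment of the shell command that receives the value.
command = f"bcftools call -P {prior} -mv -"
```

Because the value is read when the Snakefile is parsed, a missing key shows up immediately as a KeyError rather than as an InputFunctionException while the jobs are being planned.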
dofree