
I am trying to pass a --config argument called samples to snakemake. No matter how I pass it, all the underscores seem to be removed. Any suggestions, or is there something I am doing wrong?

snakemake -s snakefile.py all --configfile /share/biocore/keith/dennislab/snakemake/templates/tagseq.json --config samples=60_3_6,

or

snakemake -s snakefile.py all --configfile /share/biocore/keith/dennislab/snakemake/templates/tagseq.json --config samples="60_3_6"

or

snakemake -s snakefile.py all --configfile /share/biocore/keith/dennislab/snakemake/templates/tagseq.json --config samples='60_3_6'


All three produce a result like this for the config dictionary in the snakefile (notice the samples entry at the very end):

{'__default__': OrderedDict([('__comment1', 'running_locally=[True,False] ~ type=[PE,SE,tagseq]'), ('running_locally', 'False'), ('type', 'tagseq'), ('__comment2', 'Path to the file containing a column of sample names'), ('samples_file', '/share/biocore/keith/dennislab/rhesus_tagseq/samples.txt')]), 'project': OrderedDict([('basename', '/share/biocore/keith/dennislab'), ('id', 'PE'), ('fastqs', 'rhesus_tagseq'), ('human_rrna_ref', '/share/biocore/keith/workshop/rnaseq_examples/References/human_rrna.fasta'), ('star_ref', '/share/biocore/keith/dennislab/star.overlap100.gencode')]), 'hts_star': OrderedDict([('__comment', 'This is for running one sample for htspreproc > star'), ('job-name', 'hts_star_'), ('n', 1), ('ntasks', 9), ('partition', 'production'), ('time', '120'), ('mem', '32000'), ('__comment2', 'The name of the sample and analysis type will be inserted before .out and .err'), ('output', 'slurm_out/hts_star_.out'), ('error', 'slurm_out/hts_star_.err'), ('mail-type', 'NONE'), ('mail-user', 'kgmitchell@ucdavis.edu')]), 'samples': (6036,)}

snakecharmerb
bb8

2 Answers


Snakemake evaluates the value of a --config key/value pair by passing it as the input to each function in this list, in turn:

parsers = [int, float, eval, str]

When evaluating numeric literals in Python (3.6 and later), "[u]nderscores are ignored for determining the numeric value of the literal".

Consequently, 60_3_6 gets evaluated as an integer, because int is tried before str:

>>> for p in parsers:
...     print(p('60_3_6'))
... 
6036
6036.0
6036
60_3_6

(In the first example in the question, the value passed is 60_3_6 followed by a trailing comma. int and float both reject it, so eval is the first parser to succeed, and it returns a tuple containing 6036 as its only element, as shown in the dump of config values.)
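
The behaviour described above can be sketched as a small loop. This is a minimal illustration, assuming that the first parser that does not raise wins; parse_config_value is a hypothetical name, not snakemake's own function, and the real implementation may differ in detail.

```python
def parse_config_value(raw, parsers=(int, float, eval, str)):
    """Return the result of the first parser that accepts `raw`.

    Hypothetical helper: it only mimics the parser order quoted above.
    Note that eval runs arbitrary code, so never feed it untrusted input.
    """
    for parser in parsers:
        try:
            return parser(raw)
        except Exception:
            # This parser rejected the value; fall through to the next one.
            continue
    raise ValueError(f"no parser accepted {raw!r}")


# int() accepts underscores as digit separators, so the underscores vanish.
print(parse_config_value("60_3_6"))
# The trailing comma defeats int and float; eval then builds a 1-tuple.
print(parse_config_value("60_3_6,"))
# Inner quotes make eval return a plain string with underscores intact.
print(parse_config_value('"60_3_6"'))
```

The last call shows why a value that still carries its own quotes when it reaches the parsers survives unchanged.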

To get round this, you need to pass a value that will only be successfully processed by str.

Another possible workaround would be to pass a callable like

lambda: '60_3_6'

because snakemake will use a callable instead of the result of processing parsers. However, I don't see how this can be done from the config file or command line; I think you'd need to call snakemake's main function directly from Python code (perhaps via a wrapper script).

snakecharmerb

This "double quoting" seems to work:

snakemake --config samples='"10_13"'

It also works with more than one sample:

snakemake --config samples='"10_13", "foo", "19_19"'
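
Why the extra layer of quotes matters can be illustrated with shlex, which splits a command line roughly the way a POSIX shell does. This is only a sketch: value_after_shell is a hypothetical helper, and it assumes the value is then handed to eval as described in the other answer.

```python
import shlex


def value_after_shell(command_line):
    """Return the text after 'samples=' once shell quoting has been removed.

    Hypothetical helper: shlex.split approximates POSIX shell word splitting.
    """
    argv = shlex.split(command_line)
    pair = argv[argv.index("--config") + 1]
    _key, _sep, value = pair.partition("=")
    return value


# A single layer of quotes is consumed by the shell, so the value
# arrives bare and can still be read as a number.
single = value_after_shell('snakemake --config samples="60_3_6"')
print(repr(single))

# With double quoting, the inner quotes survive and the value
# evaluates to a string (or a tuple of strings).
one = value_after_shell("snakemake --config samples='\"10_13\"'")
many = value_after_shell("snakemake --config samples='\"10_13\", \"foo\", \"19_19\"'")
print(repr(eval(one)))
print(repr(eval(many)))
```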

Dummy Snakefile:

samples = config['samples']

rule all:
    input:
        expand('{sample}.txt', sample=samples),

rule one:
    output:
        '{sample}.txt',
    shell:
        r"""
        touch {output}
        """
dariober