Major EDIT: Having fixed a couple of issues thanks to the comments and written a minimal reproducible example, I've narrowed the issue down to a difference between local execution and execution via DRMAA.

Here is a minimal reproducible pipeline that does not require any external file download and can be executed out of the box after cloning the following git repository:

git clone git@github.com:kevinrue/snakemake-issue-all.git

When I run the pipeline using DRMAA I get the following error:

Building DAG of jobs...
Using shell: /bin/bash
Provided cluster nodes: 100
Singularity containers: ignored
Job counts:
    count   jobs
    1   all
    2   cat
    3
InputFunctionException in line 22 of /ifs/research-groups/sims/kevin/snakemake-issue-all/workflow/Snakefile:
SyntaxError: unexpected EOF while parsing (<string>, line 1)
Wildcards:
sample=A

However, if I run the pipeline locally (--cores 1), it works:

Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Singularity containers: ignored
Job counts:
    count   jobs
    1   all
    2   cat
    3

[Sat Jun 13 08:49:46 2020]
rule cat:
    input: data/A1, data/A2
    output: results/A/cat
    jobid: 1
    wildcards: sample=A

[Sat Jun 13 08:49:46 2020]
Finished job 1.
1 of 3 steps (33%) done

[Sat Jun 13 08:49:46 2020]
rule cat:
    input: data/B1, data/B2
    output: results/B/cat
    jobid: 2
    wildcards: sample=B

[Sat Jun 13 08:49:46 2020]
Finished job 2.
2 of 3 steps (67%) done

[Sat Jun 13 08:49:46 2020]
localrule all:
    input: results/A/cat, results/B/cat
    jobid: 0

[Sat Jun 13 08:49:46 2020]
Finished job 0.
3 of 3 steps (100%) done
Complete log: /ifs/research-groups/sims/kevin/snakemake-issue-all/.snakemake/log/2020-06-13T084945.632545.snakemake.log

My DRMAA profile is the following:

jobs: 100
default-resources: 'mem_free=4G'
drmaa: "-V -notify -p -10 -l mem_free={resources.mem_free} -pe dedicated {threads} -v MKL_NUM_THREADS={threads} -v OPENBLAS_NUM_THREADS={threads} -v OMP_NUM_THREADS={threads} -R y -q all.q"
drmaa-log-dir: /ifs/scratch/kevin
use-conda: true
conda-prefix: /ifs/home/kevin/devel/snakemake/envs
printshellcmds: true
reason: true

Briefly, the Snakefile looks like this:

# The main entry point of your workflow.
# After configuring, running snakemake -n in a clone of this repository should successfully execute a dry-run of the workflow.


report: "report/workflow.rst"

# Allow users to fix the underlying OS via singularity.
singularity: "docker://continuumio/miniconda3"

include: "rules/common.smk"
include: "rules/other.smk"

rule all:
    input:
        # The first rule should define the default target files
        # Subsequent target rules can be specified below. They should start with all_*.
        expand("results/{sample}/cat", sample=samples['sample'])


rule cat:
    input:
        file1="data/{sample}1",
        file2="data/{sample}2"
    output:
        "results/{sample}/cat"
    shell:
        "cat {input.file1} {input.file2} > {output}"
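
For context, the `expand()` call in `rule all` generates one target path per sample. A minimal sketch of what it does (a simplification on my part; the real implementation lives in `snakemake.io` and handles many more cases):

```python
import itertools

# Simplified stand-in for snakemake.io.expand(): fill each wildcard in the
# pattern with every combination of the supplied values.
def expand(pattern, **wildcards):
    keys = list(wildcards)
    combos = itertools.product(*(wildcards[k] for k in keys))
    return [pattern.format(**dict(zip(keys, combo))) for combo in combos]

print(expand("results/{sample}/cat", sample=["A", "B"]))
# -> ['results/A/cat', 'results/B/cat']
```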

Running snakemake -np gives me what I expect:

$ snakemake -np
          sample  condition
sample_id                  
A              A  untreated
B              B    treated
Building DAG of jobs...
Job counts:
    count   jobs
    1   all
    2   cat
    3

[Sat Jun 13 08:51:19 2020]
rule cat:
    input: data/B1, data/B2
    output: results/B/cat
    jobid: 2
    wildcards: sample=B

cat data/B1 data/B2 > results/B/cat

[Sat Jun 13 08:51:19 2020]
rule cat:
    input: data/A1, data/A2
    output: results/A/cat
    jobid: 1
    wildcards: sample=A

cat data/A1 data/A2 > results/A/cat

[Sat Jun 13 08:51:19 2020]
localrule all:
    input: results/A/cat, results/B/cat
    jobid: 0

Job counts:
    count   jobs
    1   all
    2   cat
    3
This was a dry-run (flag -n). The order of jobs does not reflect the order of execution.

I'm not sure how to debug it further. I'm happy to provide more information as needed.

Note: I use snakemake version 5.19.2

Thanks in advance!

EDIT: Using the `--verbose` option, Snakemake seems to trip on the `default-resources: 'mem_free=4G'` and/or `drmaa: "-l mem_free={resources.mem_free} ..."` settings defined in my drmaa profile (see above).

$ snakemake --profile drmaa --verbose
Building DAG of jobs...
Using shell: /bin/bash
Provided cluster nodes: 100
Singularity containers: ignored
Job counts:
    count   jobs
    1   all
    2   cat
    3
Resources before job selection: {'_cores': 9223372036854775807, '_nodes': 100}
Ready jobs (2):
    cat
    cat
Full Traceback (most recent call last):
  File "/ifs/devel/kevin/miniconda3/envs/snakemake/lib/python3.8/site-packages/snakemake/rules.py", line 941, in apply
    res, _ = self.apply_input_function(
  File "/ifs/devel/kevin/miniconda3/envs/snakemake/lib/python3.8/site-packages/snakemake/rules.py", line 684, in apply_input_function
    raise e
  File "/ifs/devel/kevin/miniconda3/envs/snakemake/lib/python3.8/site-packages/snakemake/rules.py", line 678, in apply_input_function
    value = func(Wildcards(fromdict=wildcards), **_aux_params)
  File "/ifs/devel/kevin/miniconda3/envs/snakemake/lib/python3.8/site-packages/snakemake/resources.py", line 10, in callable
    value = eval(
  File "<string>", line 1
    4G
     ^
SyntaxError: unexpected EOF while parsing

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ifs/devel/kevin/miniconda3/envs/snakemake/lib/python3.8/site-packages/snakemake/__init__.py", line 626, in snakemake
    success = workflow.execute(
  File "/ifs/devel/kevin/miniconda3/envs/snakemake/lib/python3.8/site-packages/snakemake/workflow.py", line 951, in execute
    success = scheduler.schedule()
  File "/ifs/devel/kevin/miniconda3/envs/snakemake/lib/python3.8/site-packages/snakemake/scheduler.py", line 394, in schedule
    run = self.job_selector(needrun)
  File "/ifs/devel/kevin/miniconda3/envs/snakemake/lib/python3.8/site-packages/snakemake/scheduler.py", line 540, in job_selector
    a = list(map(self.job_weight, jobs))  # resource usage of jobs
  File "/ifs/devel/kevin/miniconda3/envs/snakemake/lib/python3.8/site-packages/snakemake/scheduler.py", line 613, in job_weight
    res = job.resources
  File "/ifs/devel/kevin/miniconda3/envs/snakemake/lib/python3.8/site-packages/snakemake/jobs.py", line 267, in resources
    self._resources = self.rule.expand_resources(
  File "/ifs/devel/kevin/miniconda3/envs/snakemake/lib/python3.8/site-packages/snakemake/rules.py", line 977, in expand_resources
    resources[name] = apply(name, res, threads=threads)
  File "/ifs/devel/kevin/miniconda3/envs/snakemake/lib/python3.8/site-packages/snakemake/rules.py", line 960, in apply
    raise InputFunctionException(e, rule=self, wildcards=wildcards)
snakemake.exceptions.InputFunctionException: SyntaxError: unexpected EOF while parsing (<string>, line 1)
Wildcards:
sample=B

InputFunctionException in line 20 of /ifs/research-groups/sims/kevin/snakemake-issue-all/workflow/Snakefile:
SyntaxError: unexpected EOF while parsing (<string>, line 1)
Wildcards:
sample=B
unlocking
removing lock
removing lock
removed all locks
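
The traceback shows where it goes wrong: Snakemake passes each `default-resources` value through `eval()` (see `snakemake/resources.py` in the stack trace), and `4G` is not a valid Python expression. The failure can be reproduced directly:

```python
# "4" is a valid Python expression; "4G" is not, which is exactly the
# SyntaxError that surfaces as the InputFunctionException above.
assert eval("4") == 4

try:
    eval("4G")
except SyntaxError as e:
    print(type(e).__name__)  # SyntaxError, as in the traceback
```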
  • I cannot reproduce the issue, so please provide exact code and error. Anyway, the error message makes me think that there should be something like open parentheses, quotes or multiline comments that have no matching close counterpart. – Dmitry Kuzminov Jun 11 '20 at 18:21
  • try removing the comma after your last inputs in all rules of your Snakefile (the comma after definition of fastq2) – Eric C. Jun 11 '20 at 18:26
  • Dear Eric. Thank you for your suggestion, my new commit is [here](https://github.com/kevinrue/pipeline_alevin_citeseq/commit/21c886a32e3f03be684d60d83777d0c1167284b9) and I am running it now (as in, I am waiting for the results). Note that I have another pipeline that fails with the same error without any superfluous comma that I can see: https://github.com/kevinrue/snakemake_alevin_citeseq_10x – Kevin Rue-Albrecht Jun 12 '20 at 09:45
  • Dear @EricC unfortunately, removing the comma didn't avoid the error ``` InputFunctionException in line 10 of /ifs/research-groups/sims/kevin/snakemake_alevin_citeseq_1000/workflow/Snakefile: SyntaxError: unexpected EOF while parsing (, line 1) Wildcards: ``` – Kevin Rue-Albrecht Jun 12 '20 at 10:04
  • 1
    I can't see the syntax error. Nevertheless, I don't think your inputs are what you think they are. If you're using an `expand`, both your inputs will be lists. Moreover you're actually not using the wildcard `sample`. Shouldn't it be `fastq1="data/{sample}_R1.fastq.gz",fastq2="data/{sample}_R2.fastq.gz"` ? Try a dry run with `-p` to print shell commands to see if your shell is correct. – Eric C. Jun 12 '20 at 10:13
  • Thanks again Eric. I've edited the question to show the output of `snakemake -np`. Everything seems in order there, and all the rules succeed except for `all`. The error is still the same as I originally reported: `InputFunctionException ...`. I am using the wildcard sample in the `expand()` statement. I don't understand what you mean here. – Kevin Rue-Albrecht Jun 12 '20 at 13:53
  • @DmitryKuzminov, thanks for your reply too, I will put together a proper minimal reproducible example that does not rely on external files. – Kevin Rue-Albrecht Jun 13 '20 at 06:48
  • @EricC. having started writing a minimum example that can be run without downloading external files (https://github.com/kevinrue/snakemake-issue-all), I understand what you meant. `expand("data/{sample}1", sample=samples['sample'])` grabs all the samples for each instance of the rule. I'll look further into fixing my code and report back here. Thanks! – Kevin Rue-Albrecht Jun 13 '20 at 07:22
  • @KevinRue, according to your new details, you succeeded running snakemake with --dry-run. That means that Snakemake doesn't face any parsing errors. But your initial question was due to the parsing error. If you experience any issues right now, the first part of your questions becomes irrelevant. – Dmitry Kuzminov Jun 13 '20 at 07:23
  • Dear @DmitryKuzminov, the parsing (`-np`) was never a problem. That part always worked. The `SyntaxError: unexpected EOF while parsing (, line 1)` only ever occurred when I actually executed the pipeline. Now I've narrowed this issue down to executing the pipeline using --drmaa. It runs fine locally (i.e., with `--cores 1`) – Kevin Rue-Albrecht Jun 13 '20 at 13:07
  • You can run the pipeline with --verbose. This will among other stuff also print full stack traces, which should show you exactly the place where the syntax error occurs. – Johannes Köster Jun 15 '20 at 07:02
  • Thanks @JohannesKöster . Is there a way to also ask snakemake to print the `qsub` command itself? I am wondering whether my combination of `default-resources` and `drmaa` profiles might be turning into `-l mem_free=mem_free={resources.mem_free}`, for instance. I've got a few ideas to try thanks to your`--verbose` suggestion, while you answer further. – Kevin Rue-Albrecht Jun 15 '20 at 08:58
  • Currently not, sorry. We could make this available via verbose as well of course. PRs welcome. – Johannes Köster Jun 15 '20 at 09:08

1 Answer

Thanks to @JohannesKöster I realised that my profile settings were wrong.

--default-resources [NAME=INT [NAME=INT ...]] indicates that only integer values are supported, while I was providing a string (i.e., mem_free=4G), naively hoping it would be supported as well.

I've updated the following settings in my profile, and successfully ran both snakemake --cores 1 and snakemake --profile drmaa.

default-resources: 'mem_free=4'
drmaa: "-V -notify -p -10 -l mem_free={resources.mem_free}G -pe dedicated {threads} -v MKL_NUM_THREADS={threads} -v OPENBLAS_NUM_THREADS={threads} -v OMP_NUM_THREADS={threads} -R y -q all.q"

Note the integer value 4 set in default-resources, and the G moved into the drmaa: ... -l mem_free={resources.mem_free}G setting.
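
With the corrected profile, the drmaa string formats cleanly for each job. A quick sketch of the substitution (simplified; the actual formatting happens inside Snakemake's executor, and the resources object here is a hypothetical stand-in):

```python
from types import SimpleNamespace

# Hypothetical stand-in for a job's resources object; mem_free is now an
# integer, as required by default-resources.
resources = SimpleNamespace(mem_free=4)

drmaa = "-V -notify -l mem_free={resources.mem_free}G -pe dedicated {threads}"
print(drmaa.format(resources=resources, threads=2))
# -> -V -notify -l mem_free=4G -pe dedicated 2
```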

Thanks a lot for the help everyone!