I'm trying to run Snakemake on AWS using the --tibanna option. I set up Tibanna following this tutorial, and my workflow runs well locally. However, when I try to run it in the cloud on my bucket 'tibanna-bucket' using the command
snakemake --tibanna --default-remote-prefix=tibanna-bucket --use-conda
I run into the following error:
(snakemake) ernestmordret@MacBook-Pro-de-Ernest workflow % snakemake --tibanna --default-remote-prefix=tibanna-bucket --use-conda --verbose
Building DAG of jobs...
sources=/Users/ernestmordret/Documents/GitHub/Snake-Activen/workflow/Snakefile
precommand=
bucket=tibanna-bucket
subdir=tibanna-bucket
Using shell: /bin/bash
Provided cloud nodes: 1
Job counts:
	count	jobs
	1	all
	1	fastp_pe
	1	get_fastq_pe
	1	kallisto_index
	1	kallisto_quant
	5
Resources before job selection: {'_cores': 9223372036854775807, '_nodes': 1}
Ready jobs (2):
kallisto_index
get_fastq_pe
Select jobs to execute...
Welcome to the CBC MILP Solver
Version: 2.10.5
Build Date: Oct 15 2020
command line - cbc /var/folders/xx/_135vjcd77l445nv441n8fw40000gn/T/89fbeeb095784985afbe9f4369722b4e-pulp.mps max ratio None allow None threads None presolve on strong None gomory on knapsack on probing on branch printingOptions all solution /var/folders/xx/_135vjcd77l445nv441n8fw40000gn/T/89fbeeb095784985afbe9f4369722b4e-pulp.sol (default strategy 1)
At line 2 NAME MODEL
At line 3 ROWS
At line 7 COLUMNS
At line 18 RHS
At line 21 BOUNDS
At line 24 ENDATA
Problem MODEL has 2 rows, 2 columns and 4 elements
Coin0008I MODEL read with 0 errors
String of None is illegal for double parameter ratioGap value remains 0
String of None is illegal for double parameter allowableGap value remains 0
String of None is illegal for integer parameter threads value remains 0
String of None is illegal for integer parameter strongBranching value remains 5
Option for gomoryCuts changed from ifmove to on
Option for knapsackCuts changed from ifmove to on
Continuous objective value is 12 - 0.00 seconds
Cgl0004I processed model has 0 rows, 0 columns (0 integer (0 of which binary)) and 0 elements
Cbc3007W No integer variables - nothing to do
Cuts at root node changed objective from -12 to -1.79769e+308
Probing was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Gomory was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Knapsack was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Clique was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
MixedIntegerRounding2 was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
FlowCover was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
TwoMirCuts was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
ZeroHalf was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Result - Optimal solution found
Objective value: 12.00000000
Enumerated nodes: 0
Total iterations: 0
Time (CPU seconds): 0.00
Time (Wallclock seconds): 0.00
Option for printingOptions changed from normal to all
Total time (CPU seconds): 0.00 (Wallclock seconds): 0.00
Selected jobs (1):
get_fastq_pe
Resources after job selection: {'_cores': 9223372036854775801, '_nodes': 0}
running job using Tibanna...
[Wed Jan 27 21:49:21 2021]
rule get_fastq_pe:
    output: tibanna-bucket/data/SRR6663265_1.fastq, tibanna-bucket/data/SRR6663265_2.fastq
    jobid: 3
    wildcards: sample=SRR6663265
    threads: 6
    resources: mem_mb=1000, disk_mb=1000
job output tibanna-bucket/data/SRR6663265_1.fastq
job output is remote= true
is remote default= true
job output tibanna-bucket/data/SRR6663265_2.fastq
job output is remote= true
is remote default= true
additional tibanna config: None
Full Traceback (most recent call last):
File "/Users/ernestmordret/opt/anaconda3/envs/snakemake/lib/python3.9/site-packages/snakemake/__init__.py", line 691, in snakemake
success = workflow.execute(
File "/Users/ernestmordret/opt/anaconda3/envs/snakemake/lib/python3.9/site-packages/snakemake/workflow.py", line 1007, in execute
success = scheduler.schedule()
File "/Users/ernestmordret/opt/anaconda3/envs/snakemake/lib/python3.9/site-packages/snakemake/scheduler.py", line 488, in schedule
self.run(runjobs)
File "/Users/ernestmordret/opt/anaconda3/envs/snakemake/lib/python3.9/site-packages/snakemake/scheduler.py", line 499, in run
executor.run_jobs(
File "/Users/ernestmordret/opt/anaconda3/envs/snakemake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 136, in run_jobs
self.run(
File "/Users/ernestmordret/opt/anaconda3/envs/snakemake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 2133, in run
tibanna_input = self.make_tibanna_input(job)
File "/Users/ernestmordret/opt/anaconda3/envs/snakemake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 2108, in make_tibanna_input
tibanna_args = ec2_utils.Args(
File "/Users/ernestmordret/opt/anaconda3/envs/snakemake/lib/python3.9/site-packages/tibanna/ec2_utils.py", line 116, in __init__
self.fill_default()
File "/Users/ernestmordret/opt/anaconda3/envs/snakemake/lib/python3.9/site-packages/tibanna/ec2_utils.py", line 167, in fill_default
raise MissingFieldInInputJsonException(errmsg_template % ('snakemake_main_filename', self.language))
tibanna.exceptions.MissingFieldInInputJsonException: field snakemake_main_filename is required in args for language snakemake
Traceback (most recent call last):
File "/Users/ernestmordret/opt/anaconda3/envs/snakemake/lib/python3.9/site-packages/snakemake/__init__.py", line 691, in snakemake
success = workflow.execute(
File "/Users/ernestmordret/opt/anaconda3/envs/snakemake/lib/python3.9/site-packages/snakemake/workflow.py", line 1007, in execute
success = scheduler.schedule()
File "/Users/ernestmordret/opt/anaconda3/envs/snakemake/lib/python3.9/site-packages/snakemake/scheduler.py", line 488, in schedule
self.run(runjobs)
File "/Users/ernestmordret/opt/anaconda3/envs/snakemake/lib/python3.9/site-packages/snakemake/scheduler.py", line 499, in run
executor.run_jobs(
File "/Users/ernestmordret/opt/anaconda3/envs/snakemake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 136, in run_jobs
self.run(
File "/Users/ernestmordret/opt/anaconda3/envs/snakemake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 2133, in run
tibanna_input = self.make_tibanna_input(job)
File "/Users/ernestmordret/opt/anaconda3/envs/snakemake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 2108, in make_tibanna_input
tibanna_args = ec2_utils.Args(
File "/Users/ernestmordret/opt/anaconda3/envs/snakemake/lib/python3.9/site-packages/tibanna/ec2_utils.py", line 116, in __init__
self.fill_default()
File "/Users/ernestmordret/opt/anaconda3/envs/snakemake/lib/python3.9/site-packages/tibanna/ec2_utils.py", line 167, in fill_default
raise MissingFieldInInputJsonException(errmsg_template % ('snakemake_main_filename', self.language))
tibanna.exceptions.MissingFieldInInputJsonException: field snakemake_main_filename is required in args for language snakemake
unlocking
removing lock
removing lock
removed all locks
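Reading the traceback, the exception is raised from fill_default(), which is called inside the Args constructor, so the snakemake_main_filename field seems to be checked at the moment the object is created. A toy reproduction of that pattern, as I understand it (the classes below are simplified stand-ins I wrote from the traceback, not Tibanna's actual code, and try_build is a hypothetical helper):

```python
class MissingFieldInInputJsonException(Exception):
    """Stand-in for tibanna.exceptions.MissingFieldInInputJsonException."""


class Args:
    """Toy stand-in for tibanna.ec2_utils.Args (assumption, not the real class)."""

    def __init__(self, **kwargs):
        self.language = None
        # Copy every keyword argument onto the instance.
        for key, value in kwargs.items():
            setattr(self, key, value)
        # Validation runs immediately, as the traceback suggests.
        self.fill_default()

    def fill_default(self):
        errmsg_template = "field %s is required in args for language %s"
        if self.language == "snakemake" and not hasattr(self, "snakemake_main_filename"):
            raise MissingFieldInInputJsonException(
                errmsg_template % ("snakemake_main_filename", self.language)
            )


def try_build(**kwargs):
    """Hypothetical helper: return 'ok' or the validation error message."""
    try:
        Args(**kwargs)
        return "ok"
    except MissingFieldInInputJsonException as exc:
        return str(exc)
```

Calling try_build(language="snakemake") gives back the same message as in my log, while adding snakemake_main_filename="Snakefile" passes the check.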
I'm confused: is the Snakefile supposed to be provided locally, or on the S3 bucket? How exactly should I organize my S3 bucket to make this work?
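In the verbose output above, bucket and subdir are both reported as tibanna-bucket, which makes me wonder whether the prefix is expected to have the form <bucket>/<subdir>. A minimal sketch of how I assume the prefix gets split (split_remote_prefix is a hypothetical helper written to reproduce those two log lines, not Snakemake's actual code, and the run1 subdirectory is invented for illustration):

```python
import re


def split_remote_prefix(prefix):
    """Hypothetical helper: split a remote prefix into (bucket, subdir).

    Assumption: the bucket is everything before the first '/', and the
    subdir is the prefix with a leading '<bucket>/' stripped off.
    """
    bucket = prefix.split("/")[0]
    # If the prefix contains no '/', nothing is stripped and subdir == bucket.
    subdir = re.sub("^{}/".format(re.escape(bucket)), "", prefix)
    return bucket, subdir


# What I passed on the command line:
print(split_remote_prefix("tibanna-bucket"))
# A prefix with an explicit (made-up) subdirectory:
print(split_remote_prefix("tibanna-bucket/run1"))
```

With my prefix this yields the same bucket and subdir values as the log, whereas a prefix containing a slash yields distinct ones.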
Here is the Snakefile I'm using for this test:
rule all:
    input:
        "quant_results_SRR6663265"

rule get_fastq_pe:
    output:
        # the wildcard name must be accession, pointing to an SRA number
        "data/{sample}_1.fastq",
        "data/{sample}_2.fastq"
    params:
        # optional extra arguments
        extra=""
    threads: 6  # defaults to 6
    wrapper:
        "0.70.0/bio/sra-tools/fasterq-dump"

rule fastp_pe:
    input:
        sample=["data/{sample}_1.fastq", "data/{sample}_2.fastq"]
    output:
        trimmed=["trimmed/{sample}_1.fastq", "trimmed/{sample}_2.fastq"],
        html="report/pe/{sample}.html",
        json="report/pe/{sample}.json"
    log:
        "logs/fastp/pe/{sample}.log"
    params:
        adapters="--detect_adapter_for_pe",
        extra=""
    threads: 2
    wrapper:
        "0.70.0/bio/fastp"

rule kallisto_index:
    input:
        fasta = "index/{transcriptome}.fasta"
    output:
        index = "index/{transcriptome}.idx"
    params:
        extra = "--kmer-size=31 --bias"
    log:
        "logs/kallisto_index_{transcriptome}.log"
    threads: 1
    wrapper:
        "0.70.0/bio/kallisto/index"

rule kallisto_quant:
    input:
        fastq = ["trimmed/{sample}_1.fastq", "trimmed/{sample}_2.fastq"],
        index = "index/Homo_sapiens.GRCh38.cdna.all.idx"
    output:
        directory('quant_results_{sample}')
    params:
        extra = ""
    log:
        "logs/kallisto_quant_{sample}.log"
    threads: 4
    wrapper:
        "0.70.0/bio/kallisto/quant"
Thanks in advance!