I don't understand why my Snakemake run stops at 50% and does not want to proceed, even though the DAG and the dry-run both show there is still work to be done.

I have a snakefile that:

  • Prints a wonderful DAG
  • Counts a correct number of jobs

Here is my snakefile:

import os
import os.path


"""
This rule prints user message on success (everything returned error code == 0)
"""
onsuccess:
    print("Thepipeline is over and successful.")


"""
This rule prints user message on error (something returned error code != 0)
"""
onerror:
    print("Thepipeline ended wrongly.")


"""
This rule lists the reports, here I expect only one.
"""
rule analysis_target:
    input:
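        # One report per cluster; the cluster ids are only discovered at runtime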
        dynamic("Sample1/{cluster}/REPORT.html")
    message:
        "Finishing the pipeline"


"""
This rule performs normalisation of cel files
"""
rule analysis_process:
    input:
        "{sample}.CEL"
    output:
        os.path.join("{sample}", "{sample}_processed.RDS")
    message:
        "Perform normalisation of {wildcards.sample}"
    threads:
        1
    log:
        out = "Process_{sample}.out",
        err = "Process_{sample}.err"
    shell:
        "Process.R {input}"  # Close function
        " > {log.out}"  # Stdout redirection
        " 2> {log.err}"  # Stderr redirection


"""
This rule performs L2R/BAF segmentation
"""
rule analysis_Segment_L2R:
    input:
        os.path.join("{sample}", "{sample}_processed.RDS")
    output:
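        # Declared dynamic: the number of clusters is unknown before this rule runs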
        dynamic(os.path.join("{sample}", "{cluster}", "{sample}.ASPCF.RDS"))
    message:
        "Working on L2R/BAF segmentation of {wildcards.sample}"
    threads:
        1
    log:
        out = "Segment_{sample}.out",
        err = "Segment_{sample}.err"
    shell:
        "Segment.R {input}"  # Close function
        " > {log.out}"  # Stdout redirection
        " 2> {log.err}"  # Stderr redirection

"""
This rule performs the ASCN/TCN segmentation
"""
rule analysis_Segment_ASCN:
    input:
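        # Aggregates every per-cluster ASPCF file produced by analysis_Segment_L2R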
        dynamic(os.path.join("{sample}", "{cluster}", "{sample}.ASPCF.RDS"))
    output:
        os.path.join("{sample}", "{cluster}", "ascat.ASCN.RDS"),
    message:
        "Performing ASCN/TCN segmentation for {wildcards.sample}"
    threads:
        1
    log:
        out = os.path.join(config["log_directory"], "ASCN_{sample}.out"),
        err = os.path.join(config["log_directory"], "ASCN_{sample}.err")
    shell:
        "ASCN.R {input}"  # Close function
        " > {log.out}"  # Stdout redirection
        " 2> {log.err}"  # Stderr redirection

"""
This rune performs the HTML reporting required for biologists
"""
rule analysis_report:
    input:
        os.path.join("{sample}", "{cluster}", "ascat.ASCN.RDS")
    output:
        os.path.join("{sample}", "{cluster}", "REPORT.html")
    message:
        "Reporting analysis' results of {wildcards.sample}."
    threads:
        1
    log:
        out = "report_{sample}.out",
        err = "report_{sample}.err"
    shell:
        "Report.R {input}"  # Close function
        " > {log.out}"  # Stdout redirection
        " 2> {log.err}"  # Stderr redirection

However, I have the following issue:

The process stops at 50%, printing the following:

[Wed Jun  6 16:08:06 2018] Building DAG of jobs...
[Wed Jun  6 16:08:06 2018] Using shell: /bin/bash
[Wed Jun  6 16:08:06 2018] Provided cores: 3
[Wed Jun  6 16:08:06 2018] Rules claiming more threads will be scaled down.
[Wed Jun  6 16:08:06 2018] Job counts:
[Wed Jun  6 16:08:06 2018]  count   jobs
[Wed Jun  6 16:08:06 2018]  1   analysis_Segment_ASCN
[Wed Jun  6 16:08:06 2018]  1   analysis_Segment_L2R
[Wed Jun  6 16:08:06 2018]  1   analysis_process
[Wed Jun  6 16:08:06 2018]  1   analysis_report
[Wed Jun  6 16:08:06 2018]  1   analysis_target
[Wed Jun  6 16:08:06 2018]  5

[Wed Jun  6 16:08:06 2018] Job 4: Perform normalisation of Sample1

[Wed Jun  6 16:10:03 2018] Finished job 4.
[Wed Jun  6 16:10:03 2018] 1 of 5 steps (20%) done

[Wed Jun  6 16:10:03 2018] Job 3: Working on L2R/BAF segmentation of Sample1

[Wed Jun  6 16:10:03 2018] Subsequent jobs will be added dynamically depending on the output of this rule
[Wed Jun  6 16:13:41 2018] Dynamically updating jobs
[Wed Jun  6 16:13:41 2018] Finished job 3.
[Wed Jun  6 16:13:41 2018] 2 of 4 steps (50%) done
[Wed Jun  6 16:13:41 2018] Complete log: /home/tdayris/ASCN/.snakemake/log/2018-06-06T160806.787514.snakemake.log
The pipeline is over and successful.

Snakemake resolves (through a dry-run) the names of the dynamic files, yet does not produce them after stopping at 50%.
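
To make sure I understand the mechanism I rely on, here is a minimal sketch of the dynamic() producer/consumer pattern as I read it from the documentation (rule names, file names and shell scripts are hypothetical, not part of my pipeline):

rule all:
    input:
        # Aggregate over files whose names are unknown upfront
        dynamic("out/{clusterid}/result.txt")

rule scatter:
    input:
        "data.txt"
    output:
        # The set of cluster ids only becomes known after this rule runs
        dynamic("out/{clusterid}/part.txt")
    shell:
        "scatter.sh {input} out/"  # hypothetical script

rule per_cluster:
    input:
        "out/{clusterid}/part.txt"
    output:
        "out/{clusterid}/result.txt"
    shell:
        "process.sh {input} > {output}"  # hypothetical script

My snakefile above follows the same shape: analysis_Segment_L2R plays the role of the scatter step, and analysis_target aggregates the per-cluster reports.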

Here is the dry-run:

Building DAG of jobs...
Job counts:
    count   jobs
    1   analysis_Segment_ASCN
    1   analysis_report
    1   analysis_target
    3

Job 7: Performing ASCN/TCN segmentation for Sample1
Reason: Missing output files: Sample1/6452/ascat.ASCN.RDS

Job 6: Reporting analysis' results of Sample1.
Reason: Missing output files: Sample1/6452/Sample1.REPORT.html; Input files updated by another job: Sample1/6452/ascat.ASCN.RDS

Job 5: Finishing the pipeline
Reason: Input files updated by another job: Sample1/6452/REPORT.html

Job counts:
    count   jobs
    1   analysis_Segment_ASCN
    1   analysis_report
    1   analysis_target
    3

Here is the non-producing run:

[Wed Jun  6 16:23:23 2018] Building DAG of jobs...
[Wed Jun  6 16:23:23 2018] Using shell: /bin/bash
[Wed Jun  6 16:23:23 2018] Provided cores: 3
[Wed Jun  6 16:23:23 2018] Rules claiming more threads will be scaled down.
[Wed Jun  6 16:23:23 2018] Job counts:
[Wed Jun  6 16:23:23 2018]  count   jobs
[Wed Jun  6 16:23:23 2018]  1   analysis_Segment_ASCN
[Wed Jun  6 16:23:23 2018]  1   analysis_report
[Wed Jun  6 16:23:23 2018]  1   analysis_target
[Wed Jun  6 16:23:23 2018]  3
[Wed Jun  6 16:23:23 2018] Complete log: /home/tdayris/ASCN/.snakemake/log/2018-06-06T162323.286705.snakemake.log
The pipeline is over and successful.

Could you please highlight what I did wrong?
