Snakemake's scheduler ignores my mem_mb declaration and runs jobs in parallel whose summed requirements exceed the available memory (e.g. three jobs with mem_mb=53000 on a 128 GB system). Moreover, it even runs jobs whose declared requirements (over 1 TB when I run snakemake -T10) cannot be met on my system at all, not even serially. Snakemake also seems to keep a job running even when it allocates much more memory than it declared.
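For context, I launch the workflow roughly like this (a sketch; the exact core count varies, and apart from -T I pass no memory-related options):

    snakemake --cores 8 -T10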
What I have in mind is to tell Snakemake to expect a job to allocate up to a certain amount of memory, to plan the workflow accordingly, and to enforce the declared memory constraints during execution. Is there any way to do this with Snakemake without resorting to serial execution?
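For instance, I imagine declaring a global memory budget next to the core count, so that the scheduler would only start a job while the sum of the declared mem_mb values still fits into that budget; something along these lines (a sketch of what I have in mind, I am not sure whether --resources can actually be used this way):

    snakemake --cores 8 --resources mem_mb=120000 -T10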
I have a workflow with many calls to a rule that may be either quite memory-light or memory-heavy. Since I want to benefit from parallel execution, I have declared that a job requires 1000 MB on the first attempt and more on subsequent attempts. The rule looks like this:
def get_mem_mb(wildcards, attempt):
    return 1_000 if attempt == 1 else (51_000 + 1_000 * 2 ** (attempt - 1))
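# For reference, an illustration of how the declared memory grows with the
# attempt number (assuming -T10, i.e. 10 restarts, allows up to 11 attempts):
#   attempt  1 ->     1_000 MB
#   attempt  2 ->    53_000 MB
#   attempt  3 ->    55_000 MB
#   ...
#   attempt 11 -> 1_075_000 MB (over 1 TB)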
rule Rulename:
    input:
        "{CONFIG}.ini"
    output:
        "{CONFIG}.h5"
    resources:
        mem_mb=get_mem_mb
    shell:
        "python script.py -o {output} -i {input}"
This is not a duplicate of Snakemake memory limiting, as the only answer there is incomplete (it does not cover the "memory limiting" part). At the moment, it is this question that has the complete (though split) answer: