
In Snakemake, I have 5 rules. For each of them I set the memory limit via the resources mem_mb option. It looks like this:

rule assembly:
    input:
        file1 = os.path.join(MAIN_DIR, "1.txt"),
        file2 = os.path.join(MAIN_DIR, "2.txt"),
        file3 = os.path.join(MAIN_DIR, "3.txt")
    output:
        foldr = dir,
        file4 = os.path.join(dir, "A.png"),
        file5 = os.path.join(dir, "A.tsv")
    resources:
        mem_mb=100000
    shell:
        "pythonscript.py -i {input.file1} -v {input.file2} -q {input.file3} --cores 5 -o {output.foldr}"

I want to limit the memory usage of the whole Snakefile by doing something like:

snakemake --snakefile mysnakefile_snakefile --resources mem_mb=100000

So that the jobs would not use 100 GB each (with 5 rules, that would mean 500 GB of memory allocated), but instead all of their executions together would stay within a maximum of 100 GB (5 jobs, 100 GB of total allocation).


1 Answer


The command line argument sets the total limit. The Snakemake scheduler will ensure that for the set of running jobs, the sum of the mem_mb resources will not exceed the total limit.

I think this is exactly what you want, isn't it? You just need to set the expected per-job memory in each rule. Note that Snakemake does not measure this for you; you have to declare the value yourself in the rule. E.g., if you expect a job to use 100 MB of memory, put mem_mb=100 into that rule.
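As a minimal sketch of this scheduling behavior (the two dummy rules and the 60 GB figures here are illustrative assumptions, not from the question):

# Two independent jobs, each declaring 60 GB. With a global limit of
# --resources mem_mb=100000, the scheduler runs them sequentially,
# because 60000 + 60000 exceeds 100000.
rule all:
    input:
        "a.done",
        "b.done"

rule job_a:
    output: "a.done"
    resources: mem_mb=60000
    shell: "touch {output}"

rule job_b:
    output: "b.done"
    resources: mem_mb=60000
    shell: "touch {output}"

Running snakemake --cores 2 --resources mem_mb=100000 on this Snakefile executes job_a and job_b one after the other, even though enough cores are available to run both at once.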

Johannes Köster
  • Thank you for your answer. Yes, I want to limit the total RAM consumption of all jobs. I am avoiding putting separate limits on the jobs because at one point a job used 6 threads and consumed its memory allocation 6 times over, which caused problems for me – bapors Feb 01 '18 at 08:36
  • At the moment, when I say mem_mb=100, it allocates 100 MB for each job; at least that is how it looks in a dry run – bapors Feb 01 '18 at 09:04
  • Yes, this is how it is meant to be. Resources are always per job, over all threads of that job. If your memory usage depends on the number of threads (which is not always the case), you can define the resource as a callable, e.g., `mem_mb=lambda wildcards, threads: 100 * threads` (see the sketch after these comments). – Johannes Köster Feb 02 '18 at 13:52
  • So each thread will use `100mb` with the default thread count? Or would I need to specify it with the `-j` flag while executing snakemake? – bapors Feb 03 '18 at 19:38
  • I have run it with mem_mb=100000 on the command line to target ~100 GB of total usage, but it exceeded that and reached 110 GB of RAM usage. Do you have any idea why it did not stay within the limit? – bapors Feb 19 '18 at 13:11
  • @bapors Snakemake does not limit the RAM usage of a job. It assumes that jobs play fair and stick to their declared limits. – abukaj Sep 02 '22 at 09:11
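A minimal sketch of the callable resource described in the comments above; the rule, its files, and the samtools command are illustrative assumptions, not from the thread:

rule sort_reads:
    input:
        "mapped/{sample}.bam"
    output:
        "sorted/{sample}.bam"
    threads: 6
    resources:
        # 100 MB per thread; `threads` here is the final per-job thread
        # count after Snakemake reconciles the threads directive with --cores
        mem_mb=lambda wildcards, threads: 100 * threads
    shell:
        "samtools sort -@ {threads} -o {output} {input}"

With snakemake --cores 12 --resources mem_mb=1000, each such job declares 6 * 100 = 600 MB, so the scheduler starts at most one of them at a time.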