Suppose I run a Slurm job with the following configuration:

    #!/bin/bash

    #SBATCH --nodes=1                                # set the number of nodes
    #SBATCH --ntasks=1                               # Run a single task    
    #SBATCH --cpus-per-task=4                       # Number of CPU cores per task
    #SBATCH --time=26:59:00                          # set max wallclock time 
    #SBATCH --mem=16000M                             # set memory limit per node

    #SBATCH --job-name=myjobname                 # set name of job
    #SBATCH --mail-type=ALL                          # mail alert at start, end and abortion of execution
    #SBATCH --mail-user=sb@sw.com              # send mail to this address
    #SBATCH --output=/path/to/output/%x-%j.out  # set output path

    echo ' mem: ' $SLURM_MEM 
    echo '\n nodes: ' $SLURM_NODES
    echo '\n ntasks: ' $SLURM_NTASKS
    echo '\n cpus: ' $SLURM_CPUS_PER_TASK
    echo '\n time: ' $SLURM_TIME

I want to save the configuration of this job (time, memory, number of tasks, and so on) so that after the job has finished I know under what configuration it was executed.

So I decided to print these variables to the output file. However, there is nothing for time and memory in the output:

    \n nodes:
    \n ntasks:  1
    \n cpus:  1
    \n time:

Does anyone know a better way, or how to refer to time and memory?

Maryam Hnr

1 Answer

You can dump a lot of information about your job with `scontrol show job <job_id>`. This will give you, among other things, the memory requested. It will not, however, give you the actual memory usage; for that you will need `sacct -l -j <job_id>`.

So, at the end of your submission script, you can add

    scontrol show job $SLURM_JOB_ID
    sacct -l -j $SLURM_JOB_ID

There are many options for selecting the output of the `sacct` command; refer to the man page for the complete list.
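As a rough sketch of how this could look at the end of a submission script, the snippet below prints the requested resources and then asks `sacct` for a narrower set of columns. The function name `log_job_config` is made up for illustration. It assumes the standard variables Slurm exports inside a batch job (`SLURM_JOB_NUM_NODES`, `SLURM_NTASKS`, `SLURM_CPUS_PER_TASK`, `SLURM_MEM_PER_NODE`), some of which are only set when the matching option was given, and it takes the time limit from accounting rather than from an environment variable.

```shell
#!/bin/bash
# Sketch only: log_job_config is a hypothetical helper name, not part of Slurm.
log_job_config() {
    # Variables exported by Slurm inside a batch job; print "unset" elsewhere.
    echo "job id: ${SLURM_JOB_ID:-unset}"
    echo "nodes:  ${SLURM_JOB_NUM_NODES:-unset}"
    echo "ntasks: ${SLURM_NTASKS:-unset}"
    echo "cpus:   ${SLURM_CPUS_PER_TASK:-unset}"
    echo "mem MB: ${SLURM_MEM_PER_NODE:-unset}"

    # Requested time limit and memory come from accounting, if it is available.
    if [ -n "${SLURM_JOB_ID:-}" ] && command -v sacct >/dev/null 2>&1; then
        sacct -j "$SLURM_JOB_ID" \
              --format=JobID,JobName,ReqMem,MaxRSS,Elapsed,Timelimit,NNodes,NCPUS,State
    fi
}

log_job_config
```

Usage figures such as `MaxRSS` may still be empty while the job is running, so rerunning the `sacct` line after the job has ended is likely to give a more complete picture.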

damienfrancois
  • It seems it doesn't work for terminated jobs though? – Maryam Hnr Feb 20 '18 at 02:12
  • `scontrol` will keep information about jobs for a certain duration after they finished, that is determined by the `MinJobAge` configuration parameter. `sacct` will retrieve information about finished jobs as long as the `AccountingStorageType` configuration parameter is not `none`. – damienfrancois Feb 20 '18 at 07:18
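To see how the two parameters mentioned in the last comment are set on a given cluster, something along these lines should work. It is a sketch: `retention_settings` is a hypothetical helper name, and it assumes `scontrol show config` prints one `Name = value` pair per line.

```shell
#!/bin/bash
# retention_settings is a hypothetical helper: it keeps only the two
# parameters discussed above from "scontrol show config"-style input.
retention_settings() {
    grep -E '^(MinJobAge|AccountingStorageType)[[:space:]]'
}

# Only query the cluster when scontrol is actually installed.
if command -v scontrol >/dev/null 2>&1; then
    scontrol show config | retention_settings
fi
```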