I would like Slurm to email my program's output when the computation is done, so I wrote the following SBATCH script:

#!/bin/bash -l
#SBATCH -J MyModel
#SBATCH -n 1 # Number of cores
#SBATCH -t 1-00:00 # Runtime in D-HH:MM
#SBATCH -o JOB%j.out # File to which STDOUT will be written
#SBATCH -e JOB%j.err # File to which STDERR will be written
#SBATCH --mail-type=END
#SBATCH --mail-user=my@email.com
echo $SLURM_JOB_ID 
echo $SLURM_JOB_NAME 
/usr/bin/mpirun -np 1 ./myprogram
/usr/bin/mail -s $SLURM_JOB_NAME my@email.com < JOB${SLURM_JOB_ID}.out

The mail system reports

file .out not found

How can I construct the mail command so that the subject line is $SLURM_JOB_NAME and the body is the contents of the STDOUT file, e.g. JOB${SLURM_JOB_ID}.out in my case?

Feng
  • Try wrapping the whole mail command in an echo to see exactly what you are executing. Do something like: `echo "/usr/bin/mail ..."` – Carles Fenoy Aug 21 '15 at 08:17
  • @CarlesFenoy It worked when I restarted the system... weird – Feng Aug 26 '15 at 06:57
  • Does this mean that each of your compute nodes is set up as an open mail relay? – Jens Timmerman Mar 29 '18 at 11:56
  • @JensTimmerman Not really. We have an external SMTP server. We use [ssmtp](https://packages.debian.org/unstable/ssmtp) to connect to that mail server and send mail. `/usr/bin/mail` is aliased to `/usr/sbin/ssmtp` once `ssmtp` is installed. – Feng Mar 31 '18 at 03:05
  • Does this answer your question? [How to configure the content of slurm notification emails?](https://stackoverflow.com/questions/53003230/how-to-configure-the-content-of-slurm-notification-emails) – tripleee Feb 06 '20 at 06:32

1 Answer

Here is my solution:

#!/bin/bash
#SBATCH -J MyModel
#SBATCH -n 1 # Number of cores
#SBATCH -t 1-00:00 # Runtime in D-HH:MM
#SBATCH -o JOB%j.out # File to which STDOUT will be written
#SBATCH -e JOB%j.out # File to which STDERR will be written
#SBATCH --mail-type=BEGIN
#SBATCH --mail-user=my@email.com

echo "$(date "+%Y-%m-%d %H:%M:%S"): $SLURM_JOB_NAME start id=$SLURM_JOB_ID"

/usr/bin/mpirun -np 1 ./myprogram

mail -s "$SLURM_JOB_NAME Ended id=$SLURM_JOB_ID" my@email.com < "JOB${SLURM_JOB_ID}.out"

We can go further by adding the elapsed time to the subject and preserving the task's exit code:

#!/bin/bash
#SBATCH -J MyModel
#SBATCH -n 1 # Number of cores
#SBATCH -t 1-00:00 # Runtime in D-HH:MM
#SBATCH -o JOB%j.out # File to which STDOUT will be written
#SBATCH -e JOB%j.out # File to which STDERR will be written
#SBATCH --mail-type=BEGIN
#SBATCH --mail-user=my@email.com

# Format a duration given in seconds as H:MM:SS
secs_to_human() {
    printf '%d:%02d:%02d\n' "$(( $1 / 3600 ))" "$(( ($1 / 60) % 60 ))" "$(( $1 % 60 ))"
}
start=$(date +%s)
echo "$(date -d "@${start}" "+%Y-%m-%d %H:%M:%S"): ${SLURM_JOB_NAME} start id=${SLURM_JOB_ID}"

### exec task here
( << replace with your task here >> )
status=$?  # remember the task's exit code before anything else overwrites $?
if [ "$status" -eq 0 ]; then result=Ended; else result=Failed; fi
mail -s "$SLURM_JOB_NAME $result after $(secs_to_human $(( $(date +%s) - ${start} ))) id=$SLURM_JOB_ID" my@email.com < "JOB${SLURM_JOB_ID}.out" && echo "mail sent"
exit "$status"  # the job finishes with the task's exit code

You can also edit this to send separate stdout/stderr logs or attach them as files.

This snippet is also shared as a GitHub gist.
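As a rough sketch of the "separate stdout/stderr logs" variant mentioned above (not part of the original answer): assuming the job is submitted with `#SBATCH -e JOB%j.err` so that STDERR goes to its own file, the two logs can be combined into one labelled mail body. The helper names `build_report` and `send_report` and the `MAILER` variable are hypothetical, introduced only for illustration. Attachments are left out because the attachment flag differs between `mail` implementations.

```shell
#!/bin/bash
# Hypothetical helper: print both logs under labelled headings.
build_report() {
    local out_file=$1 err_file=$2
    echo "===== STDOUT (${out_file}) ====="
    if [ -s "$out_file" ]; then cat "$out_file"; else echo "(empty)"; fi
    echo "===== STDERR (${err_file}) ====="
    if [ -s "$err_file" ]; then cat "$err_file"; else echo "(empty)"; fi
}

# Hypothetical helper: pipe the combined report to the mailer.
# MAILER is an assumption that allows a dry run; it defaults to plain `mail`.
send_report() {
    local subject=$1 recipient=$2 out_file=$3 err_file=$4
    build_report "$out_file" "$err_file" | ${MAILER:-mail} -s "$subject" "$recipient"
}
```

At the end of the job script this might be called as `send_report "$SLURM_JOB_NAME Ended id=$SLURM_JOB_ID" my@email.com "JOB${SLURM_JOB_ID}.out" "JOB${SLURM_JOB_ID}.err"`. Setting `MAILER` to a stub function is one way to check the body without sending anything.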

elucida