Why is the exit code always 0 inside handle_exit and how to distinguish error from success?

Question

I have a bash script that runs pg_dumpall and uploads the dump to S3. If something goes wrong, I want to email the admin the exact error message; if everything works, I want to send a different email.

#!/usr/bin/env bash

set -e
set -E
set -o pipefail
set -u
set -x

IFS=$'\n\t'

log="/tmp/error.txt"
exec 2>"$log"

handle_error() {    
    error_message="$(< "$log")"
    echo "$(caller): ${BASH_COMMAND}: ${error_message}"
    exit 1
}

handle_exit() {
    rm -rf ${backup_dirname}
    rm /tmp/error.txt
    echo "We are exiting $?"
}

trap "handle_exit" EXIT
trap "handle_error $?" ERR

backup_root="$HOME/Desktop/backups"
backup_dirname="$( date '+%Y_%m_%d_%HH_%MM_%SS' )"
backup_path="${backup_root}/${backup_dirname}"
encoding="UTF8"
globals_filename="globals.dump"
host="localhost"
port="5432"
username="abc"

mkdir -p "${backup_path}"
cd "${backup_root}"

pg_dumpall \
    --no-role-passwords \
    --no-password \
    --globals-only \
    --encoding="${encoding}" \
    --file="${backup_dirname}/${globals_filename}" \
    --host="${host}" \
    --port="${port}" \
    --username="${username}"

In my script above, when pg_dumpall fails for any reason, handle_error is called first and handle_exit is called after it, and inside handle_exit $? is 0.

I can send an email from handle_error for the failure case, but what about success?
handle_exit is called with $? = 0 in both cases.
Also, what happens if my email-sending code generates an error inside handle_error?
How do I distinguish between success and error states?
Is there a better way to get the error message without redirecting stderr to /tmp/error.txt?

This is the output of a run with an error:

+ IFS='
        '
+ log=/tmp/error.txt
+ exec
50 ./scripts/test-local-backup.sh: pg_dumpall --no-role-passwords --no-password --globals-only --encoding="${encoding}" --file="${backup_dirname}/${globals_filename}" --host="${host}" --port="${port}" --username="${username}": + trap handle_exit EXIT
+ trap 'handle_error 0' ERR
+ backup_root=/Users/vr/Desktop/backups
++ date +%Y_%m_%d_%HH_%MM_%SS
+ backup_dirname=2023_07_28_15H_58M_12S
+ backup_path=/Users/vr/Desktop/backups/2023_07_28_15H_58M_12S
+ encoding=UTF8
+ globals_filename=globals.dump
+ host=localhost
+ port=5432
+ username=abc
+ mkdir -p /Users/vr/Desktop/backups/2023_07_28_15H_58M_12S
+ cd /Users/vr/Desktop/backups
+ pg_dumpall --no-role-passwords --no-password --globals-only --encoding=UTF8 --file=2023_07_28_15H_58M_12S/globals.dump --host=localhost --port=5432 --username=abc
pg_dumpall: error: connection to server at "localhost" (::1), port 5432 failed: FATAL:  role "abc" does not exist
++ handle_error 0
We are exiting 0

And this is what a successful run looks like:

+ IFS='
        '
+ log=/tmp/error.txt
+ exec
We are exiting 0

Paul Pazderski · Accepted Answer · 2023-07-28T11:22:29.227

> handle_exit is called with $? = 0 on both conditions

Because that's the trap code you set.

trap "handle_error $?" ERR

It uses double quotes, so the string is expanded at the moment you set the ERR trap, using the exit code of the previous command (in your case, the successful registration of the EXIT trap). The ERR trap code therefore becomes handle_error 0. You should use a tool like https://www.shellcheck.net/, which can recognize such errors.
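
The difference is easy to see in isolation. The sketch below assumes bash and registers the same ERR trap once with double quotes and once with single quotes. Each variant runs inside a command substitution so its trap does not leak into the rest of the script; the message text is made up for the demo.

```shell
#!/usr/bin/env bash

# Double quotes: $? is expanded while the trap is being registered.
# The leading `true` pins that status to 0, so the stored trap code is
# literally: echo handle_error got 0
# The trailing `true` only keeps the status of the demo itself at 0.
double_quoted="$(true; trap "echo handle_error got $?" ERR; false; true)"

# Single quotes: $? stays literal and is expanded when the trap fires,
# right after `false` has failed with status 1.
single_quoted="$(trap 'echo handle_error got $?' ERR; false; true)"

echo "double quotes: ${double_quoted}"
echo "single quotes: ${single_quoted}"
```

Run as written, this should print `double quotes: handle_error got 0` followed by `single quotes: handle_error got 1`.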

> Is there a better way to get the error message without piping to /tmp/error.txt

What do you dislike about the current solution? I would suggest using mktemp instead of a hardcoded file name, but apart from that I kind of like it.
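
A minimal sketch of the mktemp variant, assuming a mktemp that accepts a template argument (the GNU and BSD/macOS versions both do). The `backup-error` prefix and the simulated failing command are placeholders, not part of the original script.

```shell
#!/usr/bin/env bash

# Unique log file instead of the fixed /tmp/error.txt.
log="$(mktemp "${TMPDIR:-/tmp}/backup-error.XXXXXX")"

# Placeholder for pg_dumpall: writes a message to stderr and fails.
{ echo "simulated pg_dumpall error" >&2; false; } 2>"$log" || true

# Read the captured message, then remove the file.
error_message="$(cat "$log")"
rm -f -- "$log"

echo "captured: ${error_message}"
```

In the script from the question, the same `$log` variable would feed `exec 2>"$log"`, and handle_exit would remove `"$log"` instead of the hardcoded path.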

Also, instead of

error_message="$(< "$log")"
echo "$(caller): ${BASH_COMMAND}: ${error_message}"

you can simply write

echo -n "$(caller): ${BASH_COMMAND}: "
cat "$log"

> I can send an email from handle_error for the failure case but what about success?

Why not just send the success email at the end of the script? Or send both, success and error, from the EXIT trap.
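
One way to send both messages from the EXIT trap is to save $? on the first line of the handler and branch on it. In this sketch `notify` is a hypothetical stand-in that only prints; a real script would call its mail command there. Each run sits in a command substitution so both outcomes can be shown without ending the demo script.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for the real mail command; it only prints.
notify() { echo "MAIL: $*"; }

handle_exit() {
  # Must be the first command: anything that runs earlier overwrites $?.
  local status=$?
  if [ "$status" -eq 0 ]; then
    notify "backup succeeded"
  else
    notify "backup failed with status ${status}"
  fi
}

# A run that ends normally and a run that exits with status 3.
ok_run="$(trap handle_exit EXIT; true)"
bad_run="$(trap handle_exit EXIT; exit 3)" || true

echo "$ok_run"
echo "$bad_run"
```

The handler does not call exit itself, so the script keeps the exit status it already had.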

> Also what happens if my email sending code generates an error inside handle_error?

As far as I remember, the ERR trap is disabled while the ERR trap handler itself is executing, but I haven't found the source yet.
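
This can be checked with a small experiment. The sketch below assumes bash without set -e: the handler itself runs a failing command, and the count at the end shows how many times the handler started. With errexit enabled, a failing command inside the handler may still end the script, so treat this as a probe rather than a guarantee.

```shell
#!/usr/bin/env bash

# The handler contains a failing command of its own (`false`).
# If ERR fired recursively, "handler start" would appear more than once.
output="$(trap 'echo "handler start"; false; echo "handler end"' ERR; false; true)"

# Count how often the handler was entered.
starts="$(printf '%s\n' "$output" | grep -c 'handler start')"

printf '%s\n' "$output"
echo "handler started ${starts} time(s)"
```

If the handler is not re-entered for its own failing command, the count is 1.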

Also, you can always do things like command || true or

{
   commands
   which
   might
   fail
} || true

or simply run set +e to switch errexit off.
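
The three options side by side, as a sketch. `send_mail` is a hypothetical function that always fails, standing in for the real email command. The demo enables set -e inside a command substitution so the options it changes stay local.

```shell
#!/usr/bin/env bash

# Hypothetical mail command that always fails.
send_mail() { echo "send_mail: cannot reach server" >&2; return 1; }

demo() {
  set -e

  # 1. Ignore the failure of a single command.
  send_mail || true

  # 2. Ignore the failure of a whole group of commands.
  {
    send_mail
    send_mail
  } || true

  # 3. Switch errexit off for a stretch, then back on.
  set +e
  send_mail
  set -e

  echo "still running"
}

result="$(demo 2>/dev/null)"
echo "$result"
```

Note that `|| true` also keeps the ERR trap from firing for that command, while set +e only turns off errexit and leaves the ERR trap active.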

To clarify how you can pass exit codes between the traps, here is a simplified example:

#!/bin/bash

trap 'handle_exit $?' EXIT
trap 'handle_err $?' ERR

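handle_exit() {
  # $? is the status the shell is exiting with; $1 is the value passed in by the trap
  printf 'handle_exit: $? = %s   $1 = %s\n' "$?" "$1"
  exit "$1"
}

handle_err() {
  # exit with $1 + 1 so the change made here is visible in handle_exit
  printf 'handle_err: $? = %s   $1 = %s\n' "$?" "$1"
  exit "$(($1 + 1))"
}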

set -e
false

It would print

handle_err: $? = 1   $1 = 1
handle_exit: $? = 2   $1 = 2

and the overall exit status is also 2.
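
For the handle_exit in the question, the same rule means $? has to be read before the cleanup commands run, because each of them replaces it with its own status. A sketch of the difference, using a made-up file name in place of the real cleanup targets:

```shell
#!/usr/bin/env bash

# Placeholder for the files the real handler removes.
scratch="${TMPDIR:-/tmp}/handle-exit-demo.$$"

# Reads $? after the cleanup: it sees the status of rm, not of the script.
late_check() {
  rm -f -- "$scratch"
  echo "We are exiting $?"
}

# Saves $? first, then cleans up.
early_check() {
  local status=$?
  rm -f -- "$scratch"
  echo "We are exiting ${status}"
}

# Each run exits with status 1 inside its own command substitution.
late="$(trap late_check EXIT; exit 1)" || true
early="$(trap early_check EXIT; exit 1)" || true

echo "late:  ${late}"
echo "early: ${early}"
```

The first line should report 0 and the second 1, although both runs exited with status 1.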

Changed trap 'handle_exit' EXIT and trap 'handle_error $?' ERR to single quotes; the exit code is still 0 in handle_exit despite it being called after handle_error. - PirateApp, Jul 28 '23 at 11:09
Which exit code? If you use `trap 'handle_exit $?' EXIT`, an `echo $1` in `handle_exit` should show the error. Your last `echo "We are exiting $?"` in `handle_exit` shows the status of the preceding command, which was probably successful. - Paul Pazderski, Jul 28 '23 at 11:12
Thank you, I think I finally got it: the check for $? inside handle_exit has to be on the first line, otherwise it'll always be 0 because the other commands completed successfully. - PirateApp, Jul 28 '23 at 11:16
