
I'm trying to build a generic retry shell function that re-runs the specified command a few times if the previous attempt failed. Here is my code:

retry() {
    declare -i number=$1
    declare -i interrupt=0
    trap "echo Exited!; interrupt=1;" SIGINT SIGTERM SIGQUIT SIGKILL

    shift

    for i in `seq $number`; do
      echo "\n-- Retry ${i}th time(s) --\n"

      $@

      if [[ $? -eq 0 || $interrupt -ne 0 ]]; then
        break;
      fi
    done
}

It works great for wget, curl, and all kinds of other common commands. However, if I run

retry 10 rsync local remote

and press Ctrl+C to interrupt it during the transfer, it reports

rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(700) [sender=3.1.3]

It seems that rsync handles SIGINT and some related signals internally, then returns exit code 20 to the caller. This return code doesn't make the loop break, so I have to press Ctrl+C several more times to interrupt the subsequent rsync commands. Exited! is printed only for the last Ctrl+C, when the trap finally catches it.

Questions:

  1. Why doesn't the first return code 20 make the loop break?

  2. How can I make the trap, rather than rsync, catch the SIGINT signal? If that's not possible, what should I do?

halfer
Itachi
  • When rsync fails with 20, you already know that it "failed" anyway, and it doesn't really affect your retry logic. So why does it matter? – P.P Feb 28 '20 at 18:57
  • I want to terminate the command and exit the loop immediately when I press ctrl+c, no matter what the sub-command is. @P.P – Itachi Feb 29 '20 at 06:51
  • Makes sense. I have posted a solution which should work as you wanted. – P.P Feb 29 '20 at 10:32

1 Answer

Why doesn't the first return code 20 make the loop break?

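You are correct that rsync catches certain signals and exits with RERR_SIGNAL (20). Since 20 is non-zero and your trap did not run at that point (so interrupt is still 0), neither condition in your if is true and the loop carries on.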

How can I make the trap, rather than rsync, catch the SIGINT signal? If that's not possible, what should I do?

Since rsync has its own handlers, you can't do anything about that (you could use hacks such as LD_PRELOAD to override the signal handlers within rsync, but that may be unnecessarily complicated). And since your traps are in the current shell, you wouldn't know whether the "command" was signaled or exited with a non-zero status.

I'd assume you want your retry to be generic and don't want special handling for rsync (e.g., a different command may exit with 75 on signals, and you don't want to deal with such special cases).
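To illustrate why the exit status alone is ambiguous, here is a minimal sketch. The two `bash -c` children are hypothetical stand-ins, not rsync itself: the first installs its own handler and exits with 20, the second has no handler and is killed by the signal. SIGTERM is used instead of SIGINT only so the demo runs without a terminal.

```shell
#!/bin/bash

# Hypothetical stand-in for rsync: catches SIGTERM and exits with its own code.
bash -c 'trap "exit 20" TERM; kill -TERM $$; sleep 1'
caught_status=$?

# A child with no handler is killed by the signal itself;
# the shell then reports 128 + the signal number (15 for SIGTERM).
bash -c 'kill -TERM $$; sleep 1'
killed_status=$?

echo "handler installed: $caught_status"   # 20
echo "no handler:        $killed_status"   # 143
```

Only the second case is recognisable from the status as "terminated by a signal"; the first looks like any other failure, which is why checking `$?` in the loop cannot tell the two apart.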

The problem is that your trap handler isn't active, because the signal is received by the currently running process (rsync). You could instead run your command in the background and wait for it to complete. This would allow you to catch signals from retry. On receiving a signal, it simply kills the child process.

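#!/bin/bash

retry()
{
    declare -i number=$1
    declare -i i
    declare -i pid
    declare -i interrupted=0
    declare -i monitor_was_on=0

    trap "echo Exiting...; interrupted=1" SIGINT SIGTERM SIGQUIT
    shift

    # Turn off "monitor mode" so the shell doesn't report terminating background jobs.
    # Remember whether it was on so it can be restored afterwards.
    [[ $- == *m* ]] && monitor_was_on=1
    set +m

    for ((i = 0; i < number; ++i)); do
        # printf interprets \n; bash's echo does not unless given -e
        printf '\n-- Retry %dth time(s) --\n\n' "$i"

        # Quote "$@" so arguments containing spaces stay intact
        "$@" &
        pid=$!

        # If command succeeded, break
        wait "$pid" && break

        # If we received one of the signals, stop the child (it may already have exited) and break
        if (( interrupted )); then
            kill "$pid" 2>/dev/null
            break
        fi
    done

    # Restore monitor mode if it was on, and reset the traps
    (( monitor_was_on )) && set -m
    trap - SIGINT SIGTERM SIGQUIT
}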

Note that SIGKILL can't be caught, so there's no point in setting a trap for it; I have removed it.
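The retry function above depends on one property of wait: when a signal with a trap set arrives while the shell is waiting, wait returns immediately with a status greater than 128 and the trap then runs, even though the child is still alive. A minimal sketch of that behaviour follows. It assumes `sleep 5` as a stand-in for the long-running command and uses SIGUSR1 in place of Ctrl+C so it can run without a terminal.

```shell
#!/bin/bash

interrupted=0
trap 'interrupted=1' USR1        # USR1 stands in for Ctrl+C in this demo

sleep 5 &                        # hypothetical long-running command
pid=$!

( sleep 1; kill -USR1 $$ ) &     # simulate the signal arriving a moment later

wait "$pid"                      # returns early once the trapped signal arrives
status=$?

kill "$pid" 2>/dev/null          # the child is still running, so stop it explicitly
echo "wait status: $status, interrupted: $interrupted"
```

The script finishes after about one second rather than five, with a wait status above 128 and interrupted set to 1. That is the hook the loop uses to decide to kill the child and break.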

Itachi
P.P
  • Thanks! I learned a lot here, and I merged the above two functions in the last editing for my usage. :) – Itachi Feb 29 '20 at 11:02
  • No problem. I personally prefer to use a separate function for signal handling. But there's no technical reason for that. – P.P Feb 29 '20 at 11:06