
I'm running a bash script that starts multiple commands (Python scripts) simultaneously. I'm trying to kill all of the processes if one of them fails. The problem is that the Python scripts keep running in the background, and if one of them fails, my bash script doesn't know about it.

Here's a snippet from my script:

set -a
trap cleanup_children SIGTERM
MY_PID=$$

function thread_listener () {
    to_execute="$1"
    echo "Executing $to_execute ..."
    $to_execute &
    PID=$!
    trap 'echo killing $PID; kill $PID' SIGTERM
    echo "Waiting for $PID ($to_execute) ..."
    # If the job fails, kill the main script (if it is still alive).
    wait $PID || if kill -0 $MY_PID &> /dev/null; then kill $MY_PID; fi
}

function cleanup_children () {
    for job in $(jobs -p)
    do
        if kill -0 $job &> /dev/null; then
            echo "Killing child number $job"
            ps -p $job
            kill $job
        fi
    done
}

function create_app1 () {
    cd ${GIT_DIR}
    python ./create-app.py -myapp
    exit_code=$?
    echo "Create app1 ISO result: ${exit_code}"
    [ "${exit_code}" == "1" ] && exit 1
    mv ${ISO_OUTPUT_DIR}/rhel-7.1.iso ${ISO_OUTPUT_DIR}/${ISO_NAME}.iso
}

function create_app2 () {
    cd ${GIT_DIR}
    python ./create-app.py -do-something
    exit_code=$?
    echo "Create app1 ISO result: ${exit_code}"
    [ "${exit_code}" == "1" ] && exit 1
    mv ${ISO_OUTPUT_DIR}/rhel-7.1.iso ${ISO_OUTPUT_DIR}/${ISO_NAME}.iso
}

export -f create_app1
export -f create_app2

echo "MY_PID=$MY_PID"
thread_listener create_app1 &
PID_APP1=$!

thread_listener create_app2 &
PID_APP2=$!
wait

kill $PID_APP1 2> /dev/null
kill $PID_APP2 2> /dev/null

1 Answer


Hm, this looks quite advanced ;). Do I assume correctly that you never see the "Create app1 ISO result" output because the Python script does not terminate? It might be an issue with the signal not being properly dispatched to bash background jobs. It might also be that your Python code does not react properly to the signal. Have you checked out https://docs.python.org/2/library/signal.html? Granted, you'd still have to figure out exactly how to interrupt your Python code while it is executing. I'd suggest first making sure that the Python code reacts to signals the way you want.
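
For reference, here is a minimal sketch of what such a handler could look like inside the Python scripts (the handler name and the cleanup step are illustrative, not taken from your code):

import signal
import sys

def handle_sigterm(signum, frame):
    # Do any cleanup (unmount, remove temp files, ...) here, then exit
    # non-zero so the wrapper sees the run as failed.
    sys.exit(1)

# Install the handler; the default SIGTERM disposition would terminate
# the process without giving it a chance to clean up.
signal.signal(signal.SIGTERM, handle_sigterm)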

  • Actually, I am getting the exit code from my Python script, but when it exits with 1, the TERM signal doesn't get sent to the parent script (job), or at least the other Python script that is still running isn't terminated – Alex Brodov Jul 17 '16 at 19:23
  • Maybe posting the output of a complete run and pointing to the failing line would shed some light on this. I think it will be essential to understand the exact timing step by step to solve this. – Stefan Steinert Jul 17 '16 at 21:41
  • How would you implement the bash script to meet my goal (this is just a wrapper script)? I do get the exit code from the Python script, so the problem is clearly not there; maybe it has something to do with incorrect catching and forwarding of signals to the parent/child processes – Alex Brodov Jul 17 '16 at 21:48
  • Does it have to be bash? Otherwise I'd probably rewrite the wrapper itself in python. – Stefan Steinert Jul 18 '16 at 08:10
  • So what you want to achieve is this: run two jobs in parallel. If one of them fails, terminate the other; otherwise let both jobs run to completion. Correct? – Stefan Steinert Jul 18 '16 at 09:45
  • Exactly. Please note that the Python scripts rely on some shell commands (mount, cp, etc.) – Alex Brodov Jul 18 '16 at 09:46
  • I thought about it for a bit, and unfortunately I think this cannot be achieved using bash background jobs: if you don't know which job ends first, you'd have to use two additional background "monitoring" jobs which wait() for your "real" jobs. But that is not possible, because you cannot wait() for a process which is not a child process, i.e. the monitoring jobs cannot wait() for the PIDs of the real jobs. – Stefan Steinert Jul 18 '16 at 10:25
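
Following up on the suggestion above to rewrite the wrapper itself in Python, here is a minimal sketch of that idea (the two commands are taken from the question; the one-second polling loop, and the omission of details such as changing into ${GIT_DIR} and the mv step, are my assumptions):

import subprocess
import sys
import time

# The two jobs to run in parallel (arguments as in the question).
commands = [
    ["python", "./create-app.py", "-myapp"],
    ["python", "./create-app.py", "-do-something"],
]
procs = [subprocess.Popen(cmd) for cmd in commands]

exit_code = 0
while procs:
    time.sleep(1)
    for p in procs[:]:
        rc = p.poll()
        if rc is None:
            continue  # this job is still running
        procs.remove(p)
        if rc != 0 and exit_code == 0:
            exit_code = rc  # remember the first failure
            # One job failed: terminate the survivors.
            for other in procs:
                other.terminate()

sys.exit(exit_code)

Since Popen.terminate() sends SIGTERM on POSIX systems, this pairs naturally with a SIGTERM handler in the Python scripts like the one sketched in the answer above.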