3

I have the following simple Python script:

import os, subprocess,signal,sys
import time

out = None
sub = None

def handler(signum,frame):
    print("script.py: caught sig: %i " % signum)
    sys.stdout.flush()

    if sub is not None and sub.poll() is None:  # poll() returns None while the child is still running
        print("render.py: sent signal to prman pid: ", sub.pid)
        sys.stdout.flush()
        sub.send_signal(signal.SIGTERM)
        sub.wait() # deadlocks....????
        #os.kill(sub.pid, signal.SIGTERM)  # this works
        #os.waitpid(sub.pid,0)             # this works

    for i in range(0,5):
        time.sleep(0.1)
        print("script.py: cleanup %i" % i)
        sys.stdout.flush()

    sys.exit(128+signum)

signal.signal(signal.SIGINT, handler)
signal.signal(signal.SIGUSR2, handler)
signal.signal(signal.SIGTERM, handler)

sub = subprocess.Popen(["./doStuff.sh"], stderr = subprocess.STDOUT)
sub.wait()


print("finished script.py")

doStuff.sh

#!/bin/bash

function trap_with_arg() {
    func="$1" ; shift
    for sig ; do
        trap "$func $sig" "$sig"
    done
}

pid=False

function signalHandler() {

    trap - SIGINT SIGTERM

    echo "doStuff.sh caught sig: $1"
    echo "doStuff.sh cleanup: wait 10s"
    sleep 10s

    # kill ourself to signal calling process we exited on SIGINT
    kill -s SIGINT $$

}

trap_with_arg signalHandler SIGINT SIGTERM
trap "echo 'doStuff.sh ignore SIGUSR2'" SIGUSR2 
# ignore SIGUSR2

echo "doStuff.sh : pid:  $$"
echo "doStuff.sh: some stub error" 1>&2
for i in {1..100}; do
    sleep 1s
    echo "doStuff.sh, rendering $i"
done

When I launch the script in a terminal with python3 script.py & and then send it a signal with kill -USR2 -$!, the script catches the signal and waits forever in sub.wait(). Running ps -uf shows the following:

user   27515  0.0  0.0  29892  8952 pts/22   S    21:56   0:00  \_ python script.py
user   27520  0.0  0.0      0     0 pts/22   Z    21:56   0:00      \_ [doStuff.sh] <defunct>

Be aware that doStuff.sh properly handles SIGINT and quits.

I would also like to capture the child's stdout output when the handler is called. How can I do this properly?

Thanks a lot!

Gabriel
  • I can't reproduce the behavior (what is your OS, shell, python version?). Could you provide a dummy `dostuff.py` as an example? Why do you use `-$!` instead of `$!` -- the former may send the signal to the whole process group? – jfs Feb 19 '16 at 13:52
  • I send to the whole process group, because I run this on the cluster, which sends to the whole process group the SIGUSR2 signal. – Gabriel Feb 19 '16 at 20:59
  • I updated the answer, and provided doStuff.sh. Can you try this on your machine, on mine this deadlocks giving the process listing output as shown above – Gabriel Feb 19 '16 at 21:00
  • there is too much unrelated code. Here's a [minimal code example that shows that `send_signal()` works](https://gist.github.com/zed/215a57b3681cc5f77d2a) – jfs Feb 19 '16 at 21:12
  • I've updated [the minimal example](https://gist.github.com/zed/215a57b3681cc5f77d2a) to demonstrate that `child.wait()` hangs in the signal handler. The code in your question also hangs (for the same reason). – jfs Feb 19 '16 at 21:31

3 Answers

1

Your code can't get the child process' stdout because it doesn't redirect its standard streams while calling subprocess.Popen(). It is too late to do anything about it in the signal handler.

If you want to capture stdout then pass stdout=subprocess.PIPE and call .communicate() instead of .wait():

child = subprocess.Popen(command, stdout=subprocess.PIPE)
output = child.communicate()[0]
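
A self-contained sketch of the same pattern, in case it helps to try it in isolation. The inline child command is a made-up stand-in for illustration, not the script from the question:

```python
import subprocess
import sys

# Hypothetical child: a one-line Python program that prints to stdout.
command = [sys.executable, "-c", "print('hello from child')"]

child = subprocess.Popen(command, stdout=subprocess.PIPE)
# communicate() reads stdout until EOF and then waits for the child to exit.
output, _ = child.communicate()

print(output.decode().strip())
print(child.returncode)
```

Because stdout is a pipe here, the child's output no longer goes to the terminal; it is only available through the value returned by communicate().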

There is a completely separate issue: the signal handler hangs on the .wait() call on Python 3 (on Python 2, or with os.waitpid(), it does not hang, but the wrong exit status is received for the child instead). Here's a minimal code example to reproduce the issue:

#!/usr/bin/env python
import signal
import subprocess
import sys


def sighandler(*args):
    child.send_signal(signal.SIGINT)
    child.wait()  # It hangs on Python 3 due to child._waitpid_lock

signal.signal(signal.SIGUSR1, sighandler)
child = subprocess.Popen([sys.executable, 'child.py'])
sys.exit("From parent %d" % child.wait())  # return child's exit status

where child.py:

#!/usr/bin/env python
"""Called from parent.py"""
import sys
import time

try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:  # handle SIGINT
    sys.exit('child exits on KeyboardInterrupt')

Example:

$ python3 parent.py &
$ kill -USR1 $!
child exits on KeyboardInterrupt
$ fg
... running    python3 parent.py

The example shows that the child has exited but the parent is still running. If you press Ctrl+C to interrupt it, the traceback shows that it hangs on the with self._waitpid_lock: statement inside the .wait() call. If self._waitpid_lock = threading.Lock() is replaced with self._waitpid_lock = threading.RLock() in subprocess.py, then the effect is the same as using os.waitpid(): it doesn't hang but the exit status is incorrect.
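
The difference between the two lock types can be seen in isolation. The sketch below is not subprocess code; can_reacquire is a hypothetical helper that checks whether a thread already holding a lock can take it a second time, which is roughly the situation when a signal handler calls .wait() while the interrupted main code is already inside .wait():

```python
import threading


def can_reacquire(lock):
    """Return True if the thread that already holds `lock` can take it again."""
    lock.acquire()
    try:
        # Use a timeout so the demo reports failure instead of hanging forever.
        got_it = lock.acquire(timeout=0.1)
        if got_it:
            lock.release()
        return got_it
    finally:
        lock.release()


print(can_reacquire(threading.Lock()))   # plain lock: the second acquire times out
print(can_reacquire(threading.RLock()))  # re-entrant lock: the second acquire succeeds
```

Without the timeout, the second acquire on the plain Lock would block indefinitely, which matches the hang seen in the traceback.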

To avoid the issue, do not wait for the child's status in the signal handler: call send_signal(), set a simple boolean flag, and return from the handler instead. In the main code, check the flag after child.wait() (before print("finished script.py") in the code in your question) to see whether the signal has been received (if it is not clear from child.returncode). If the flag is set, call the appropriate cleanup code and exit.
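
A minimal sketch of that approach, for POSIX systems only. To make it runnable on its own, two things are assumptions that differ from the question: the child is a hypothetical Python one-liner that sleeps (standing in for ./doStuff.sh), and an interval timer delivers SIGALRM to the script (standing in for the external kill -USR2):

```python
import signal
import subprocess
import sys

child = None
received_signal = None  # set by the handler, checked by the main code


def handler(signum, frame):
    """Record the signal and forward SIGTERM to the child; never wait here."""
    global received_signal
    received_signal = signum
    if child is not None:
        child.send_signal(signal.SIGTERM)


signal.signal(signal.SIGALRM, handler)

# Hypothetical stand-in for ./doStuff.sh: a child that sleeps for 30 seconds.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])

# Stand-in for an external `kill`: deliver SIGALRM to this process in 0.5 s.
signal.setitimer(signal.ITIMER_REAL, 0.5)

child.wait()  # only the main code waits; the handler runs and returns meanwhile

if received_signal is not None:
    # Cleanup work belongs here, outside the handler.
    print("cleanup after signal", received_signal)
    status = 128 + received_signal
else:
    status = child.returncode

print("child return code:", child.returncode)
print("exit status:", status)  # a real script would call sys.exit(status) here
```

The handler only sends the signal and sets a variable, so it never touches the lock that the interrupted child.wait() call is holding.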

jfs
0

You should look into subprocess.check_output

proc_output = subprocess.check_output(commands_list, stderr=subprocess.STDOUT)

You can wrap the call in a try/except block and then handle the failure (Python 3 syntax):

except subprocess.CalledProcessError as error:
    create_log = u"Creation Failed with return code {return_code}\n{proc_output}".format(
        return_code=error.returncode, proc_output=error.output
    )
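
Put together as a runnable sketch. The failing command is a hypothetical stand-in chosen so that the except branch is taken, and the "Creation succeeded" message is likewise only illustrative:

```python
import subprocess
import sys

# Hypothetical failing command: prints a message, then exits with status 3.
commands_list = [
    sys.executable,
    "-c",
    "import sys; print('partial output'); sys.exit(3)",
]

try:
    proc_output = subprocess.check_output(commands_list, stderr=subprocess.STDOUT)
    create_log = "Creation succeeded\n" + proc_output.decode()
except subprocess.CalledProcessError as error:
    # error.output holds whatever the command wrote before it failed (bytes).
    create_log = "Creation failed with return code {return_code}\n{proc_output}".format(
        return_code=error.returncode, proc_output=error.output.decode()
    )

print(create_log)
```

Note that check_output() only raises CalledProcessError once the command has exited with a non-zero status; it does not by itself forward a signal to the child.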
Neil Twist
  • ``try: out = subprocess.check_output(["command"]) except subprocess.CalledProcessError as error: print(error.output)`` --> when does the exception get raised when a signal arrives? I don't see the exception printed. – Gabriel Feb 17 '16 at 15:02
  • @Gabriel You would have to send the signal to the subprocess from your handler, then it will catch it. – Neil Twist Feb 17 '16 at 15:58
  • @Neil, thanks for the update. I tried exactly that, but ``sub.wait()`` gets stuck (see updated answer). Do you know how to do this? – Gabriel Feb 17 '16 at 16:36
  • @Gabriel do you get the output "send signal to command"? I would guess that you need to define sub as a global in the handler method `def handler(signum,frame):\nglobal sub` – Neil Twist Feb 18 '16 at 11:43
  • look at my solution to the problem, it is strange, thanks for the input, I corrected the question – Gabriel Feb 18 '16 at 12:37
0

I can only wait for the process by using

  os.kill(sub.pid, signal.SIGINT)
  os.waitpid(sub.pid,0)

instead of

  sub.send_signal(signal.SIGINT)
  sub.wait() # blocks forever

This has something to do with process groups on UNIX, which I don't really understand: I think the process ./doStuff.sh does not receive the signal because children in the same process group do not receive the signal. (I am not sure if this is correct.) Hopefully somebody can elaborate on this issue a bit more.

The output up to the point where the handler is called is written to the stdout of the calling bash shell (console).

Gabriel
  • there is no essential difference between the code examples. `.send_signal(sig)` uses `os.kill(self.pid, sig)` internally and `.wait()` uses `os.waitpid(self.pid, 0)` internally. It has nothing to do with process groups on Unix. – jfs Feb 19 '16 at 13:11
  • OK, so I don't understand why it should hang there. Maybe I should try a minimal example. – Gabriel Feb 19 '16 at 20:43
  • my guess: it hangs because `sub.wait()` holds `sub._waitpid_lock` lock while the signal handler is running and therefore you shouldn't call `sub.wait()` inside the handler -- perhaps, it is a bug in Python (RLock should be used instead of Lock). You should [create a minimal code example that demonstrates the issue](http://stackoverflow.com/help/mcve) – jfs Feb 19 '16 at 20:48