
(I'm using Python 3.4.2.) I have a script, test.py, which handles SIGTERM and a few other signals. However, when it is launched from another script, the signal handling does not work correctly.

This is test.py:

#! /path/to/python3
import time
import signal
import sys

def handleSIG(signum, frame):  # renamed from "signal" so the parameter does not shadow the signal module
    for i in range(10):
        print(i)
    sys.exit()

for sig in [signal.SIGTERM, signal.SIGINT, signal.SIGQUIT, signal.SIGHUP]:
    signal.signal(sig, handleSIG)

time.sleep(30)

If I just run test.py and press Ctrl+C, it prints 0, 1, ..., 9 to the console. However, if I launch test.py from another script using subprocess.call, only 0 is printed. For example, here is another script that calls test.py:

import subprocess

cmd = '/path/to/test.py'
subprocess.call(cmd)

Strangely, using subprocess.Popen() instead makes the problem go away.

Erika L
  • Popen would also not wait for the process to finish so not sure how that would work at all – Padraic Cunningham Dec 24 '15 at 23:12
  • @PadraicCunningham I ran test2.py from the command line and used Ctrl-C to terminate it. For Popen I use "subprocess.Popen(cmd).wait()" so it waits for the cmd to finish. – Erika L Dec 24 '15 at 23:32
  • BTW, from the implementation of subprocess.call(), it makes sense that the signal-handling feature of the child process will be blocked. I'd understand if nothing was printed at all, but it's strange that it prints something (in my case, 0), but not all of them. – Erika L Dec 24 '15 at 23:37
  • What OS are you using? – Padraic Cunningham Dec 25 '15 at 00:00
  • Linux (logged in via PuTTY under Windows) – Erika L Dec 25 '15 at 00:20
  • I am **guessing** it sends `SIGKILL` to the child when the parent is closing, in the case where the parent is waiting for the child to finish (`subprocess.call()`). The reason it prints 0... well, maybe it sent something before `SIGKILL` (you could print what), giving the child process a very short chance for cleanup... – zvone Dec 25 '15 at 00:32

2 Answers


The Python 3.3 subprocess.call implementation sends a SIGKILL to its child if its wait is interrupted, and your Ctrl-C does interrupt it (SIGINT -> KeyboardInterrupt exception).

So, you see a race between the child process handling the terminal's SIGINT (sent to the whole process group) and the parent's SIGKILL.
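The "whole process group" part can be checked directly. The following is a minimal sketch, assuming a POSIX system; `child_in_parent_group` is a hypothetical helper name, and the sleeping child is only a stand-in for test.py. By default a child started through `subprocess` stays in the parent's process group, which is why the terminal's SIGINT reaches both processes at once.

```python
import os
import subprocess
import sys

def child_in_parent_group(start_new_session=False):
    """Return True if a freshly spawned child shares our process group."""
    # The child just sleeps; it is a stand-in for test.py.
    p = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(5)"],
        start_new_session=start_new_session,
    )
    try:
        return os.getpgid(p.pid) == os.getpgrp()
    finally:
        # Clean up the helper child; nothing here depends on its exit status.
        p.kill()
        p.wait()

if __name__ == "__main__":
    print("default spawn shares group:", child_in_parent_group())
    print("with start_new_session:", child_in_parent_group(True))
```

Passing `start_new_session=True` to `Popen` moves the child into its own session and process group, so a terminal Ctrl-C would then no longer be delivered to it.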

From the Python 3.3 sources, edited for brevity:

def call(*popenargs, timeout=None, **kwargs):
    with Popen(*popenargs, **kwargs) as p:
        try:
            return p.wait(timeout=timeout)
        except:
            p.kill()
            p.wait()
            raise

Contrast this with the Python 2 implementation:

def call(*popenargs, **kwargs):
    return Popen(*popenargs, **kwargs).wait()

What an unpleasant surprise. It appears that this behavior was introduced in 3.3 when the wait and call interfaces were extended to accommodate a timeout. I don't find this correct, and I've filed a bug.

Stuart Berg
pilcrow
  • you could mention a workaround: define your own handler for `signal.SIGINT` in the parent (to avoid `KeyboardInterrupt` exception) or call `Popen().wait()` explicitly (if you don't need the timeout). – jfs Dec 25 '15 at 03:02
  • Yes. I think the OP is already aware of using `Popen` directly. As for a custom SIGINT handler, what would you have it do? Presumably the OP wants to use Ctrl-C to send a SIGINT to the entire process group and, importantly, interrupt the `wait`. – pilcrow Dec 25 '15 at 04:02
  • @pilcrow Thank you so much for your explanation. I've replaced `subprocess.call(cmd)` with `subprocess.Popen(cmd).wait()` and that solved this problem. And yes my goal is to use Ctrl-C to kill the parent process and make all child processes exit gracefully. – Erika L Dec 25 '15 at 16:05
  • @pilcrow: nothing (`signal(SIGINT, lambda *a: None)`), you don't need to interrupt `.wait()` if the child process exits on SIGINT (SIGINT is sent to the whole process group already). But if you can eliminate all `.call()` calls (including in 3rd-party code) then `Popen().wait()` might be preferable. – jfs Dec 25 '15 at 22:12
  • Hmm. Perhaps `os._exit()` would be better in the signal handler. That'd yield expected behavior — we probably want the parent to terminate, too. Either way a very particular workaround for an unpleasant situation. – pilcrow Dec 26 '15 at 03:35
  • `os._exit()` may prevent cleanup code from running. It should not be used blindly. There is no indication that OP needs it in this case. – jfs Jan 08 '16 at 05:43
  • I don't think I was clear: the OP probably _does_ want the wait to be interrupted. But, yes, this is moving into speculative territory. – pilcrow Jan 08 '16 at 05:45
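The no-op handler workaround discussed in the comments above could look roughly like this. It is a sketch under assumptions, not a recommendation: `call_ignoring_sigint` is a hypothetical name, the code has to run in the main thread (a `signal.signal` requirement), and it relies on the child exiting on its own when it receives SIGINT.

```python
import signal
import subprocess
import sys

def call_ignoring_sigint(*popenargs, **kwargs):
    """Run subprocess.call() while the parent swallows SIGINT."""
    # A Python-level no-op handler means no KeyboardInterrupt is raised in
    # the parent, so call() never reaches its p.kill() branch.
    previous = signal.signal(signal.SIGINT, lambda *a: None)
    try:
        return subprocess.call(*popenargs, **kwargs)
    finally:
        # Restore whatever handler was installed before.
        signal.signal(signal.SIGINT, previous)

if __name__ == "__main__":
    rc = call_ignoring_sigint([sys.executable, "-c", "print('child ran')"])
    print("exit status:", rc)
```

A Python function is used here rather than `signal.SIG_IGN` because an ignored disposition is inherited across exec, so the child would start out with SIGINT ignored, whereas a caught signal is reset to the default in the child.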

UPDATE: This Python-3 regression will be fixed in Python 3.7, via PR #5026. For additional background and discussion, see bpo-25942 and (rejected) PR #4283.


I ran into this issue myself recently. The explanation given by @pilcrow is correct.

The OP's solution (in the comments) of merely using the Python 2 implementation (Popen(*popenargs, **kwargs).wait()) doesn't suffice for me, because I'm not 100% sure that the child will respond to SIGINT in all cases. I still want it to be killed -- after a timeout.

I settled on simply re-waiting for the child (with timeout).

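from subprocess import Popen, TimeoutExpired

def nice_call(*popenargs, timeout=None, **kwargs):
    """
    Like subprocess.call(), but give the child process time to
    clean up and communicate if a KeyboardInterrupt is raised.
    """
    with Popen(*popenargs, **kwargs) as p:
        try:
            return p.wait(timeout=timeout)
        except KeyboardInterrupt:
            if not timeout:
                timeout = 0.5
            # Wait again, now that the child has received SIGINT, too.
            try:
                p.wait(timeout=timeout)
            except TimeoutExpired:
                # The child did not exit in time; kill it so that leaving
                # the "with" block does not wait on it indefinitely.
                p.kill()
                p.wait()
            # Re-raise the original KeyboardInterrupt.
            raise
        except:
            p.kill()
            p.wait()
            raise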

Technically, this means that I'm potentially extending the life of the child beyond the original timeout, but that's better than incorrect cleanup behavior.

Stuart Berg