
I'm writing a program where both the child process and the parent process can send a SIGTERM signal to the child.

The signal handler is something like this:

void custom_signal_handler(int signum, siginfo_t* info, void* ptr) {
    if (signum == SIGTERM) {
        printf("1\n");
    }
    else if (signum == SIGCONT) {
        printf("2\n");
    }
}

(I have simplified the printing inside the if branches to keep the code here short.)

For the SIGCONT signal, only the parent sends it, with kill(childPid, SIGCONT). When this happens, the child's signal handler prints "2" as intended.

However, the SIGTERM signal can be sent both by the parent, with kill(childPid, SIGTERM), and by the child itself, with raise(SIGTERM). The problem is that "1" is printed only when the child raises SIGTERM, not when the parent sends it.

I have registered the signal handler in the child:

// Set up the signal handler
struct sigaction custom_action;
memset(&custom_action, 0, sizeof(custom_action));
custom_action.sa_sigaction = custom_signal_handler;
custom_action.sa_flags = SA_SIGINFO;
// Assign the signal handlers
if (0 != sigaction(SIGCONT, &custom_action, NULL)) {
    printf("Signal registration failed: %s\n", strerror(errno));
    return -1;
}
if (0 != sigaction(SIGTERM, &custom_action, NULL)) {
    printf("Signal registration failed: %s\n", strerror(errno));
    return -1;
}

Any ideas? Thanks!

Mickey
  • How is `custom_action` initialized? Can you please try to create a [Minimal, Complete, and Verifiable Example](http://stackoverflow.com/help/mcve) and show us? And if you do `raise(SIGTERM)` do *anything* happen? Have you checked (possibly using a debugger) that the signal handler isn't called at all? And have you checked what [`raise`](http://man7.org/linux/man-pages/man3/raise.3.html) returns? – Some programmer dude Mar 26 '18 at 08:50
  • In such cases, you should take care to flush the output. – Giacomo Catenazzi Mar 26 '18 at 08:52
  • @Someprogrammerdude I have edited the question to include the sigaction initialization. For the debugging, I have checked that the `kill(childPid,SIGTERM)` is being called. The `raise(SIGTERM)` works (I have written that in the question). Thanks! – Mickey Mar 26 '18 at 08:54
  • Oh, and remember that `printf` isn't a [signal-safe](http://man7.org/linux/man-pages/man7/signal-safety.7.html) function. No C stdio function really is. – Some programmer dude Mar 26 '18 at 08:54
  • `printf` isn't async-signal-safe and it would also buffer the output. If you replace printf calls with `write` (such as `write(STDOUT_FILENO, "1", 1);`), do you still see the same behaviour? – P.P Mar 26 '18 at 08:55
  • @GiacomoCatenazzi I didn't originally write it here but the `printf` is being called with `\n` at the end. Doesn't that suppose to flush printing in C? (added it to the question to keep it clarified) – Mickey Mar 26 '18 at 08:55
  • If a newline flushes the output or not depend in what `stdout` is connected to. If it's a terminal then yes `stdout` is line-buffered. If it's a pipe or redirected to a file then no it's not, and newline will not flush. – Some programmer dude Mar 26 '18 at 08:57
  • @Someprogrammerdude In my case `stdout` is connected to the terminal. I will check using the `write` function, but the problem is that the string I need to print to the terminal should be formatted. Using `sprintf` to format the string before won't result in the same problem? – Mickey Mar 26 '18 at 09:00
  • "I have regiestered the signal handler to the child" - does it mean you have installed it only in the child process? If so, how can the parent know about it? Post a [MCVE]. – P.P Mar 26 '18 at 09:00
  • Using `sprintf` followed by `write` should be safe. – Some programmer dude Mar 26 '18 at 09:06
  • @P.P. This might be something that I have yet not understood about signal handlers. The signal handler I registered is created only for the child (and is registered in the child). However, the `SIGCONT` that is being sent by the parent (which is also not registered in the parent) invokes the `printf("2\n|")` as I intended. As for the parent's code, most of it is irrelevant so I have excluded it (but mentioned the calls to `kill(childPid,SIGTERM)`) – Mickey Mar 26 '18 at 09:09
  • @P.P. The parent doesn't need to know about if the child (or any other) process have installed a signal handler. It only needs to send the signal to the process. – Some programmer dude Mar 26 '18 at 09:11
  • @Someprogrammerdude The question is why parent process doesn't appear to receive SIGTERM. So it matters that the parent process also has the SIGTERM handler. – P.P Mar 26 '18 at 09:13
  • @Mickey The sender, parent process, doesn't need to have any signal handlers registered if it just sends via `kill()`. Only the signal receiver, the child process, needs to have it registered. But if you expect the parent process to be able to handle a SIGTERM, then it has to the SIGTERM handler registered. – P.P Mar 26 '18 at 09:13
  • @P.P. That's not how I read it. As I read it, the parent calls `kill` to send a signal to the child process, it doesn't `raise` (or `kill`) to send it to itself. – Some programmer dude Mar 26 '18 at 09:15
  • @Someprogrammerdude Tried replacing the `printf` with `sprintf+write` but the problem persists. @P.P. As for the parent, I don't need any special signal handling for it. Only that the child will print something when it is being terminated. – Mickey Mar 26 '18 at 09:15
  • @Someprogrammerdude I believe I might have found the reason for the problem, but I'm not sure how to surpass it. I am sending the `SIGTERM` from the parent while the relevant child is at "`raise(SIGSTOP)`". I think that because the child is in `SIGSTOP` it doesn't run the signal handler. However, I do need to send the `SIGTERM` only when the child is in `SIGSTOP`, without sending `SIGCONT` before. Is it possible? Is there a workaround? thanks! – Mickey Mar 26 '18 at 09:32
  • @Someprogrammerdude *Using `sprintf` followed by write should be safe* "May be safe" might be more accurate. The [Linux man page](http://man7.org/linux/man-pages/man3/printf.3.html) makes no mention of `sprintf()` being async-signal-safe. The [Solaris man page](https://docs.oracle.com/cd/E36784_01/html/E36874/sprintf-3c.html), for example, explicitly states "The `sprintf()` and `snprintf()` functions are Async-Signal-Safe.". Since the strings are fixed, none of the `*printf()` functions are needed. `write( 2, "1\n", strlen( "1\n" ) );` is sufficient. – Andrew Henle Mar 26 '18 at 09:32
  • I think it's time you try to explain what you're trying to do, and what is the actual problem you want to solve. Why do you want to send `SIGTERM` to the child when it's stopped? This is [an XY problem](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). – Some programmer dude Mar 26 '18 at 09:34
  • Post a complete working example that demonstrates the behaviour you observe. – Maxim Egorushkin Mar 26 '18 at 11:12

1 Answer


In a comment to the question, OP states

I am sending the SIGTERM from the parent while the relevant child is at "raise(SIGSTOP)". I think that because the child is in SIGSTOP it doesn't run the signal handler.

Correct. When a process is stopped, it does not receive signals other than SIGCONT and SIGKILL (and SIGSTOP, SIGTSTP, SIGTTIN, and SIGTTOU are ignored). All other signals become pending and are delivered when the process is continued. (Standard POSIX signals are not queued, though, so you can rely on at most one instance of each standard signal becoming pending.)

However, I do need to send the SIGTERM only when the child is in SIGSTOP, without sending SIGCONT before.

The target process will receive SIGTERM only after it is continued. That is how stopped processes behave.
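To see this behaviour in isolation, here is a minimal sketch (not OP's program). The names demo_pending_while_stopped(), on_term(), and sleep_ms(), and the exit code 42, are made up for illustration. The child stops itself, the parent sends SIGTERM and then SIGCONT, and the child reports through its exit status whether the handler ran. The short sleeps are only there to make the ordering observable.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t got_term = 0;

static void on_term(int signum)
{
    (void)signum;
    got_term = 1;               /* only set a flag; no stdio in the handler */
}

/* Sleep for the given number of milliseconds (may return early on a signal). */
static void sleep_ms(long ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Hypothetical demo: returns 0 if SIGTERM stayed pending while the child
   was stopped and was handled only after SIGCONT; nonzero otherwise. */
int demo_pending_while_stopped(void)
{
    int status;
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        struct sigaction act;
        memset(&act, 0, sizeof act);
        act.sa_handler = on_term;
        sigemptyset(&act.sa_mask);
        if (sigaction(SIGTERM, &act, NULL) == -1)
            _exit(2);
        raise(SIGSTOP);         /* stay stopped until the parent sends SIGCONT */
        for (int i = 0; i < 100 && !got_term; i++)
            sleep_ms(10);       /* allow for delivery right after resuming */
        _exit(got_term ? 42 : 1);
    }

    assert(pid > 0);
    /* Wait until the child has actually stopped. */
    if (waitpid(pid, &status, WUNTRACED) != pid || !WIFSTOPPED(status))
        return 1;

    kill(pid, SIGTERM);         /* becomes pending; the handler cannot run yet */
    sleep_ms(100);
    if (waitpid(pid, &status, WNOHANG) != 0)
        return 2;               /* child changed state: not expected here */

    kill(pid, SIGCONT);         /* resume; the pending SIGTERM is delivered now */
    if (waitpid(pid, &status, 0) != pid)
        return 3;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 42) ? 0 : 4;
}
```

A return value of 0 means the child was still stopped after the SIGTERM was sent, and ran the handler only once it had been continued.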

Is there a workaround?

Perhaps; it depends on the requirements. But do note that your intended use case involves behaviour that does not comply with POSIX (i.e., you want a stopped process to react to something other than just being continued or killed outright); and that is the direct reason for the problems you have encountered.

The simplest workaround is to use a variant of SIGCONT instead of SIGTERM to control the termination of the process; for example, send it via sigqueue() with a payload identifier that tells the SIGCONT signal handler to treat it as a SIGTERM signal (thus distinguishing between normal SIGCONT signals and those that are stand-ins for SIGTERM).
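A rough sketch of the sigqueue() idea, under a few assumptions: the payload value TERMINATE_REQUEST, the function demo_terminate_via_sigcont(), the handler name on_cont(), and the exit code 42 are all invented for this example. The handler checks si_code for SI_QUEUE so that a plain kill(pid, SIGCONT) is not mistaken for a termination request.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define TERMINATE_REQUEST 1     /* hypothetical payload: "treat as SIGTERM" */

static volatile sig_atomic_t terminate_requested = 0;

static void on_cont(int signum, siginfo_t *info, void *context)
{
    (void)signum;
    (void)context;
    /* Only a SIGCONT queued with our payload counts as a termination request. */
    if (info->si_code == SI_QUEUE &&
        info->si_value.sival_int == TERMINATE_REQUEST)
        terminate_requested = 1;
}

/* Hypothetical demo: returns 0 if the stopped child was continued by the
   queued SIGCONT and recognised the payload; nonzero otherwise. */
int demo_terminate_via_sigcont(void)
{
    int status;
    union sigval value;
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        struct timespec ts = { 0, 10 * 1000000L };
        struct sigaction act;
        memset(&act, 0, sizeof act);
        act.sa_sigaction = on_cont;
        act.sa_flags = SA_SIGINFO;
        sigemptyset(&act.sa_mask);
        if (sigaction(SIGCONT, &act, NULL) == -1)
            _exit(2);
        raise(SIGSTOP);         /* stay stopped until a SIGCONT arrives */
        for (int i = 0; i < 100 && !terminate_requested; i++)
            nanosleep(&ts, NULL);   /* allow for delivery right after resuming */
        _exit(terminate_requested ? 42 : 1);
    }

    assert(pid > 0);
    /* Wait until the child has actually stopped. */
    if (waitpid(pid, &status, WUNTRACED) != pid || !WIFSTOPPED(status))
        return 1;

    value.sival_int = TERMINATE_REQUEST;
    if (sigqueue(pid, SIGCONT, value) == -1) {
        kill(pid, SIGKILL);     /* clean up the stopped child on failure */
        waitpid(pid, &status, 0);
        return 2;
    }
    if (waitpid(pid, &status, 0) != pid)
        return 3;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 42) ? 0 : 4;
}
```

In a real program the child would do its cleanup where this sketch calls _exit(42). Because SIGCONT is a standard signal, two requests sent in quick succession may be merged into one, so the payload is best kept to a single "please terminate" meaning.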

A more complicated one is to have the process fork a special monitoring child process that regularly sends special "check for pending SIGTERM signals" SIGCONT signals, and dies when the parent dies. The child process can be connected to the parent via a pipe (parent having the write end, child the read end), so that when the parent dies, a read() on the child end returns 0, and the child can exit too. The parent process SIGCONT handler just needs to detect whether the signal was sent by the child process (the si_pid field of the siginfo_t structure should match the child process ID only if sent by the child) and, if so, check whether a SIGTERM is pending: handle it if so, otherwise just raise SIGSTOP again.

This approach is very fragile because of the many possible race windows, especially raising SIGSTOP just after receiving SIGCONT. (Blocking SIGCONT in the signal handler is essential. Also, the monitoring child process should probably be in a separate process group, not attached to any terminal, to avoid being stopped by a SIGSTOP targeted at the entire process group.)


Note that one should only use async-signal-safe functions in signal handlers, and leave errno unchanged, to keep everything working as expected.

For printing messages to standard error, I often use

#include <stdlib.h>
#include <unistd.h>
#include <errno.h>

static int wrerr(const char *msg)
{
    const int   saved_errno = errno;
    const char *end = msg;
    ssize_t     count;
    int         retval = 0;

    /* Find end of string. strlen() is not async-signal safe. */
    if (end)
        while (*end)
            end++;

    while (msg < end) {
        count = write(STDERR_FILENO, msg, (size_t)(end - msg));
        if (count > 0)
            msg += count;
        else
        if (count != -1) {
            retval = EIO;
            break;
        } else
        if (errno != EINTR) {
            retval = errno;
            break;
        }
    }

    errno = saved_errno;
    return retval;
}

which is not only async-signal-safe, but also keeps errno unchanged. It returns 0 on success, and an errno error code otherwise.

If we expand the prints a bit for clarity, OP's custom signal handler becomes for example

void custom_signal_handler(int signum, siginfo_t* info, void* context) {
    if (signum == SIGTERM) {
        wrerr("custom_signal_handler(): SIGTERM\n");
    } else
    if (signum == SIGCONT) {
        wrerr("custom_signal_handler(): SIGCONT\n");
    }
}

Do note that when this is used, one's program should not use stderr (from <stdio.h>) at all, to avoid confusion.
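OP mentioned in the comments that the real messages need to be formatted. POSIX does not list sprintf() as async-signal-safe, so one option is to format numbers by hand. The following is only a sketch: format_long(), report_signal(), and check_format_long() are names invented here, and the "signal N" message format is an assumption, not OP's.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: store the decimal form of value in buf, NUL-terminated.
   Returns the number of characters written (excluding the NUL), or 0 if the
   buffer is too small. Uses no library calls, so it is usable in a handler. */
size_t format_long(long value, char *buf, size_t size)
{
    char tmp[3 * sizeof(long) + 2];     /* digits in reverse order, plus sign */
    size_t n = 0, len = 0;
    /* Work on the magnitude as unsigned so that LONG_MIN is handled too. */
    unsigned long u = (value < 0) ? 0UL - (unsigned long)value
                                  : (unsigned long)value;

    do {
        tmp[n++] = (char)('0' + (u % 10));
        u /= 10;
    } while (u);
    if (value < 0)
        tmp[n++] = '-';

    if (n + 1 > size)
        return 0;
    while (n > 0)
        buf[len++] = tmp[--n];
    buf[len] = '\0';
    return len;
}

/* Hypothetical handler-side helper: write "signal N\n" to standard error
   using only write(), and leave errno as it was. */
void report_signal(int signum)
{
    const int saved_errno = errno;
    char line[32] = "signal ";
    const char *p = line;
    size_t len = 7;                     /* length of "signal " */

    len += format_long(signum, line + len, sizeof line - len - 1);
    line[len++] = '\n';

    while (len > 0) {
        ssize_t n = write(STDERR_FILENO, p, len);
        if (n > 0) {
            p += n;
            len -= (size_t)n;
        } else if (n == -1 && errno == EINTR) {
            continue;                   /* interrupted: retry */
        } else {
            break;                      /* give up on any other error */
        }
    }
    errno = saved_errno;
}

/* Usage check, meant to be run outside any signal handler. */
void check_format_long(void)
{
    char buf[32];
    assert(format_long(-15, buf, sizeof buf) == 3);
    assert(buf[0] == '-' && buf[1] == '1' && buf[2] == '5' && buf[3] == '\0');
    assert(format_long(12345, buf, 3) == 0);    /* buffer too small */
}
```

A handler could then call report_signal(signum) in place of a printf() call. The assert() calls are for testing only and do not belong inside a signal handler.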

Nominal Animal
  • *strlen() is not async-signal safe* True, but I LOL'd at that. You'd have to *deliberately* do something crazy in the implementation to make `strlen()` async-signal-unsafe. It's even [specified as being async-signal-safe on some OSes](https://docs.oracle.com/cd/E86824_01/html/E54766/strchrnul-3c.html) – Andrew Henle Mar 26 '18 at 15:55
  • @AndrewHenle: Not really, definitely not LOL material in my opinion. Consider, for example, the string direction flag on x86 and x86-64 (that affects whether `stos` and `movs` instructions auto-increment or auto-decrement the address). On a roughly similar architecture, where the direction is not part of standard flags, the ABI could say that ordinary functions are not required to retain the direction, but async-signal safe functions are. It would make sense to avoid the extra cost (of retaining the direction) in a general-purpose `strlen()`, making it non-async-signal safe. – Nominal Animal Mar 26 '18 at 17:44
  • strlen finally was deemed async safe in Issue 7. http://austin-group-l.opengroup.narkive.com/jBp07fPN/adding-simple-string-functions-to-async-signal-safe-list contains some of the discussion from 2013. – Mark Plotnick Mar 26 '18 at 17:44
  • @MarkPlotnick: Did they consider any other architectures besides x86 there? – Nominal Animal Mar 26 '18 at 17:46
  • I don't know. But I doubt it was blindly mandated. They likely talked to compiler writers to ensure that complying with the new update of the standard would be feasible on a variety of architectures. – Mark Plotnick Mar 26 '18 at 17:50
  • @MarkPlotnick: Compiler developers do surprisingly idiotic things just because the standards say they can; they are very rarely interested in what is practical or makes sense in practice. So they're not an authority in this regard, IMO. (That discussion, by the way, mentions an old Linux bug where the direction flag state in a signal handler was leaked from the kernel.) Until all of `strlen()`, `memcpy()`, and `memmove()` are async-signal safe, I will not be using any of them in signal handlers. – Nominal Animal Mar 26 '18 at 17:55
  • (One of my favourite grumbles is how `malloc()` is not required to return sufficiently aligned memory for vector types, because vector types are not standard types. It's similar to Firefox and Chrome developers, who decided that you shall not use UTF-8 as the default character set, because the default character set must be a legacy character set, and UTF-8 is the only non-legacy character set. Some humans can be completely dysfunctional idiots, while being perceived as very bright and productive, you see.) – Nominal Animal Mar 26 '18 at 17:59