pthread_kill() vs pthread_cancel() to terminate a thread blocked for I/O

Question

In our server code we are using the poll() system call to monitor client sockets. poll() is called with a large timeout value, so the calling thread blocks waiting for I/O.

In one scenario we need to terminate the thread that is blocked in poll() from a different thread. I have come across the pthread_kill() and pthread_cancel() functions, which can terminate a target thread blocked on I/O.

From reading the man pages, both of these functions seem to work fine. A few links on the internet suggested that both are dangerous to use.

Is there an alternative way to terminate a thread blocked on I/O? If not, which of these functions is recommended?

Killing a thread is risky. See [this question](https://stackoverflow.com/questions/2163090/when-i-kill-a-pthread-in-c-do-destructors-of-objects-on-stacks-get-called). — François Andrieux, Sep 18 '18 at 17:23
Wake up from `poll` more often and test a termination flag set by the other thread. If the flag says terminate, terminate. Otherwise loop around and `poll` again until you hit the broader exit condition. Thread kill should be a last resort, as it can leave you with an unstable program. – user4581301, Sep 18 '18 at 17:34
Since you have this tagged linux, one approach is to include an `eventfd()` descriptor in the ones you poll on, and signal it when the polling thread should exit (And have that thread do so gracefully when that fd becomes readable). — Shawn, Sep 18 '18 at 17:56
@FrançoisAndrieux: The linked answer (and question) makes no sense because there is no such thing as killing a thread. — R.. GitHub STOP HELPING ICE, Sep 18 '18 at 18:00
@R.. Killing a thread is generally understood to mean causing its execution to stop immediately and irrevocably. Whether or not a particular threading API allows it is another question. – François Andrieux, Sep 18 '18 at 18:05
@FrançoisAndrieux: POSIX threads doesn't, and the question is tagged `pthreads`. — R.. GitHub STOP HELPING ICE, Sep 18 '18 at 18:18
@R.. You're right about that. Still, killing a thread is risky and still, there *is* such a thing as killing a thread. This information is still relevant to OP since this is what they are attempting to do, regardless of the impossibility of doing it with their chosen API. — François Andrieux, Sep 18 '18 at 18:20
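The `eventfd()` suggestion in the comments above can be sketched roughly as follows. This is a minimal illustration assuming Linux; `poll_worker` and `eventfd_demo` are hypothetical names, and a real server would add its client sockets to the same `pollfd` array.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical worker: polls only the stop descriptor in this sketch. */
static void *poll_worker(void *arg)
{
    int stop_fd = (int)(intptr_t)arg;
    struct pollfd fds[1] = { { .fd = stop_fd, .events = POLLIN } };

    for (;;) {
        if (poll(fds, 1, -1) < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            break;                  /* unexpected error: give up */
        }
        if (fds[0].revents & POLLIN)
            break;                  /* stop was requested */
        /* ... client sockets would be serviced here ... */
    }
    return NULL;                    /* graceful exit; clean up before this */
}

/* Starts the worker, asks it to stop, and joins it. Returns 0 on success. */
int eventfd_demo(void)
{
    int stop_fd = eventfd(0, EFD_CLOEXEC);
    if (stop_fd < 0)
        return -1;

    pthread_t tid;
    if (pthread_create(&tid, NULL, poll_worker, (void *)(intptr_t)stop_fd) != 0) {
        close(stop_fd);
        return -1;
    }

    uint64_t one = 1;               /* eventfd writes must be 8 bytes */
    if (write(stop_fd, &one, sizeof one) != (ssize_t)sizeof one) {
        pthread_detach(tid);        /* sketch only: no fallback path here */
        return -1;
    }

    pthread_join(tid, NULL);
    close(stop_fd);
    return 0;
}
```

Because the eventfd counter stays non-zero until it is read, the request is not lost if it is written before the worker reaches `poll()`.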

score 3 · Answer 1 · answered Sep 18 '18 at 19:28

An easy and clean option is to create a "signal" pipe. That is, call pipe, take the file descriptor for the "read" end and add it to your list of poll file descriptors (with POLLIN). Then, whenever you want to unblock the thread which is waiting in poll, just write a byte to the write end of the pipe. The pipe, having received data, will return as readable in the blocked thread. You can even specify different "commands" by varying the value of the byte written.

(You'll need to read the byte from the pipe before it can be re-used of course.)
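A minimal sketch of the "signal" pipe described above, under the assumption that a single command byte is enough. The names `struct worker`, `CMD_STOP`, `worker_main` and `pipe_demo` are made up for illustration; real code would poll the client sockets alongside the pipe's read end.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <unistd.h>

enum { CMD_STOP = 'q' };            /* hypothetical command byte */

struct worker {
    int  wake_rd;                   /* read end of the signal pipe */
    char last_cmd;                  /* last command byte received */
};

static void *worker_main(void *arg)
{
    struct worker *w = arg;
    struct pollfd fds[1] = { { .fd = w->wake_rd, .events = POLLIN } };

    for (;;) {
        if (poll(fds, 1, -1) < 0) {
            if (errno == EINTR)
                continue;
            break;
        }
        if (fds[0].revents & (POLLIN | POLLHUP | POLLERR)) {
            char cmd;
            /* Read the byte so the pipe can be reused for the next command. */
            if (read(w->wake_rd, &cmd, 1) != 1)
                break;              /* writer closed or read error */
            w->last_cmd = cmd;
            if (cmd == CMD_STOP)
                break;
        }
    }
    return NULL;
}

/* Returns 0 if the worker received CMD_STOP and exited. */
int pipe_demo(void)
{
    int p[2];
    if (pipe(p) != 0)
        return -1;

    struct worker w = { .wake_rd = p[0], .last_cmd = 0 };
    pthread_t tid;
    if (pthread_create(&tid, NULL, worker_main, &w) != 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    char cmd = CMD_STOP;
    (void)write(p[1], &cmd, 1);     /* unblocks poll() in the worker */
    close(p[1]);                    /* if the write failed, hangup still wakes it */

    pthread_join(tid, NULL);
    close(p[0]);
    return w.last_cmd == CMD_STOP ? 0 : -1;
}
```

Other byte values could stand for other commands, as the answer suggests; the worker would then switch on `cmd` instead of only checking for `CMD_STOP`.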

answered Sep 18 '18 at 19:28

Gil Hamilton

+1 and a side note: Since OP is [already] doing sockets [and, thus, already familiar with them], we could replace the pipe with an `AF_UNIX` socket. It may provide better R/T response. Dunno. – Craig Estey Sep 18 '18 at 20:26
+1, This is a really good approach when it works, but there are some operations that block that don't have an associated file descriptor you can poll. – R.. GitHub STOP HELPING ICE Sep 19 '18 at 00:09

score 2 · Answer 2 · answered Sep 18 '18 at 17:36

Depending on the exact implementation of your thread library, it's very likely the thread won't even return from poll when being killed - so, you probably won't even achieve what you want.

You need to be very careful not to create memory leaks, and you are still very likely to create a file descriptor leak by killing the thread that owns the descriptor (note that thread resources, in contrast to those of processes, aren't "cleaned up" by the system).

It is generally safer to use shorter timeout periods and poll a terminate flag in-between, or use signals to interrupt the system call, then terminate the thread under its own control, freeing all allocated resources.
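The shorter-timeout variant could look something like the sketch below. The 50 ms timeout, the `stop_requested` flag and the function names are assumptions chosen for illustration; the empty descriptor set stands in for the real client sockets.

```c
#define _POSIX_C_SOURCE 200809L
#include <poll.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool stop_requested;  /* set by the controlling thread */

static void *flag_worker(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop_requested)) {
        /* Real code would pass the client sockets; an empty set just sleeps. */
        int ready = poll(NULL, 0, 50);      /* wake at least every 50 ms */
        if (ready > 0) {
            /* ... service ready descriptors ... */
        }
    }
    /* Release sockets, buffers, etc. here, under the thread's own control. */
    return NULL;
}

/* Starts the worker, requests shutdown, and joins it. Returns 0 on success. */
int flag_demo(void)
{
    pthread_t tid;
    if (pthread_create(&tid, NULL, flag_worker, NULL) != 0)
        return -1;

    atomic_store(&stop_requested, true);    /* request shutdown */
    return pthread_join(tid, NULL) == 0 ? 0 : -1;
}
```

The trade-off is latency versus wakeups: the thread notices the flag only after the current `poll()` times out, so the timeout bounds how long shutdown can take.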

answered Sep 18 '18 at 17:36

tofro

@R. I think it's pretty obvious what's meant. If you don't like me calling it killing, call it *forcibly terminating* – tofro Sep 18 '18 at 18:32
It's not a matter of what you call it. It's that pthreads explicitly has *no such operation* at all. – R.. GitHub STOP HELPING ICE Sep 18 '18 at 18:41
@R. Just like with processes. Still, the command is called `kill`, even there, and always has been. – tofro Sep 18 '18 at 18:47
The `kill` function (and likewise the shell command) cannot kill threads. It sends a signal to a process which may (depending on whether it's ignored, caught, or blocked) result in termination of the process. Threads are not processes. – R.. GitHub STOP HELPING ICE Sep 18 '18 at 18:49
That's what I'm saying. And, still, it's called `kill`. Even if it can't kill processes. Even if it can be used to *wake up* processes, which is sort of the opposite. I think you are nitpicking on terminology. – tofro Sep 18 '18 at 18:52
No, I am not nitpicking terminology. There is absolutely no interface which targets a thread and causes it to terminate involuntarily. The `kill` function/command does not do this. It can only target processes. – R.. GitHub STOP HELPING ICE Sep 18 '18 at 18:56
The posix thread functions that *terminate a thread* do this by sending a signal to it (just like `kill` does when sending a signal to a process). Most pthread libs do the same thing using non-terminating signals, so need co-operation from the thread in order to end the thread and not the whole process. So what? Still looks like `kill` to me. But arguing whether you can "kill" a thread or not doesn't really answer the question. – tofro Sep 18 '18 at 19:08
**There is no POSIX thread function which terminates a thread.** – R.. GitHub STOP HELPING ICE Sep 18 '18 at 21:30
@R.. `pthread_cancel` sends a [special] signal to the target thread, which intercepts it, does a bit of cleanup, and terminates. One can mask off signals via `pthread_sigmask`, which is a wrapper around `sigprocmask` [to prevent accidental masking of the signal that `pthread_cancel` uses] and it works on a per-_thread_ basis. `kill(2)` can be given a thread id (from `gettid`) instead of a process id (from `getpid`) – Craig Estey Sep 18 '18 at 23:17
@JeffGarrett: It sends a signal to a thread, resulting in one of (1) nothing, (2) execution of a signal handler in that thread, or (3) termination of the whole process. It can never result in termination of the target thread. – R.. GitHub STOP HELPING ICE Sep 19 '18 at 00:00
@CraigEstey: If you pass a tid to the `kill` function/syscall/command, it will deliver a signal to the process the thread belongs to, possibly resulting in termination of the process. It cannot cause termination of the thread whose tid was passed without causing termination of the entire process. You can test this if you don't believe me. There are **a lot of misconceptions** about this topic. – R.. GitHub STOP HELPING ICE Sep 19 '18 at 00:01
@CraigEstey: `pthread_sigmask` can be used to defer signals sent specifically to the thread (via `pthread_kill` and the underlying `tkill` or `tgkill` syscall, not the `kill` syscall) and to ensure that signals sent to the process rather than a specific thread get handled in a different thread, or to defer them if all threads have the signal masked. It has nothing to do with whether a thread can be killed (it can't). – R.. GitHub STOP HELPING ICE Sep 19 '18 at 00:05
@CraigEstey: `pthread_cancel` can be used to *cooperatively* request the cancellation of a thread. Cancellation will be acted upon only at functions which are cancellation points, and only if cancellation is enabled (it is by default, but usually isn't safe to use unless you've used `pthread_cleanup_push`, as mentioned in my answer). – R.. GitHub STOP HELPING ICE Sep 19 '18 at 00:08
@R.. `pthread_cancel` can force cancellation anytime. See manpage for `pthread_setcanceltype` [and `PTHREAD_CANCEL_ASYNCHRONOUS`]. An example: https://pastebin.com/jHs04Tjk It isn't necessarily wise to do this [without the callbacks]. In products I've done, I prefer an "escalation" strategy (e.g.): (1) request stop via flag in a task block, (2) Send thread a signal (with custom signal handler) that wakes up thread (`EINTR`) and thread looks at its stop flag, (3) `pthread_cancel`, (4) other stuff ... – Craig Estey Sep 19 '18 at 03:03
@CraigEstey: Enabling or disabling cancellation, or setting it to asynchronous (which can only be done for pure computation), can happen only from the thread itself; you cannot enable cancelability for another thread. This is what *cooperatively* means in my last comment above. – R.. GitHub STOP HELPING ICE Sep 19 '18 at 04:47

score 2 · Answer 3 · answered Sep 18 '18 at 18:16

There is no such thing as killing a thread.

The poorly-named pthread_kill function is a threads analogue of the poorly-named kill function, which sends a signal to a process. The name kill historically made sense in that the default action of many signals is to kill the process. But this default action of killing the process does not depend on whether the signal was sent to the process or a particular thread - either way, the process terminates.

The only time pthread_kill is useful is when you want to invoke a signal handler on another thread. Unless you are certain that the signal handler could not have interrupted any function that is not async-signal-safe, the signal handler is limited to calling functions which are async-signal-safe, and thereby cannot even act to end the thread's lifetime (pthread_exit is not async-signal-safe).

If you're okay with the thread eventually terminating as a result of the call, pthread_cancel is the right way to end a thread stuck in a blocking operation. In order to use it safely, though, you need to make heavy use of pthread_cleanup_push and pthread_cleanup_pop.
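As a rough sketch of that pattern, the worker below registers a cleanup handler before blocking in `poll()`, which POSIX lists as a cancellation point. The buffer, the `ready` semaphore and the function names are illustrative assumptions, not part of the answer.

```c
#define _POSIX_C_SOURCE 200809L
#include <poll.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdlib.h>

static sem_t ready;                 /* posted once the handler is registered */
static int cleanup_ran;             /* read only after pthread_join() */

static void release_buffer(void *p)
{
    free(p);                        /* runs on cancellation (or on pop(1)) */
    cleanup_ran = 1;
}

static void *cancel_worker(void *arg)
{
    (void)arg;
    char *buf = malloc(4096);       /* stand-in for any owned resource */
    if (buf == NULL)
        return NULL;

    pthread_cleanup_push(release_buffer, buf);
    sem_post(&ready);
    /* With no descriptors and no timeout, poll() blocks until cancelled. */
    while (poll(NULL, 0, -1) < 0)
        ;                           /* interrupted by a signal: block again */
    pthread_cleanup_pop(1);         /* normal-exit path runs the handler too */
    return NULL;
}

/* Returns 0 if the worker was cancelled and its cleanup handler ran. */
int cancel_demo(void)
{
    pthread_t tid;
    void *result = NULL;

    if (sem_init(&ready, 0, 0) != 0)
        return -1;
    if (pthread_create(&tid, NULL, cancel_worker, NULL) != 0)
        return -1;

    sem_wait(&ready);               /* worker now owns buf and the handler */
    pthread_cancel(tid);            /* request; acted on inside poll() */
    pthread_join(tid, &result);
    sem_destroy(&ready);

    return (result == PTHREAD_CANCELED && cleanup_ran) ? 0 : -1;
}
```

Every resource acquired while cancellation is possible needs its own push/pop pair, which is why the answer calls for heavy use of these two macros.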

If you don't want the thread to terminate, signals are your only option. You have two choices:

1. Install a signal handler (can be a no-op) using sigaction without SA_RESTART, so that it causes EINTR. Since there are inherent race conditions in this approach (if you send the signal just before the blocking syscall is entered, rather than once it's blocked, the signal won't do anything) you need to repeatedly send the signal, with exponential back-off so as not to starve the target of execution time, until the target confirms via some other synchronization mechanism (a POSIX semaphore works well) that it got the message.
2. Install a signal handler that will longjmp. In order to do this safely you need to control the context from which it can happen; the easiest way to do this is to keep it blocked in the signal mask normally, only unmasking it when the jmp_buf is valid around a blocking call. The blocking function you call needs to be async-signal-safe, and it needs to not be one which allocates or frees resources (like open or close) since you will lose knowledge of whether it completed when you handle the signal. Of course the jmp_buf, or a pointer to it, needs to be a thread-local object (_Thread_local/__thread) in order for this to work at all.
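The first choice (a no-op handler installed without SA_RESTART, repeated sends with back-off, and a semaphore as confirmation) might be sketched as below. The use of SIGUSR1, the 1 ms starting delay, the cap on the back-off and all function names are assumptions made for this illustration; for brevity the worker exits after confirming, although the same mechanism works when it carries on.

```c
#define _POSIX_C_SOURCE 200809L
#include <poll.h>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <stdatomic.h>
#include <time.h>

static atomic_int wake_requested;   /* the "message" for the worker */
static sem_t woke;                  /* worker confirms receipt here */

static void on_wake(int sig) { (void)sig; }  /* no-op: only forces EINTR */

static void *eintr_worker(void *arg)
{
    (void)arg;
    while (!atomic_load(&wake_requested))
        poll(NULL, 0, -1);          /* fails with EINTR when on_wake runs */
    sem_post(&woke);                /* confirm that the message arrived */
    return NULL;
}

/* Returns 0 once the worker has confirmed the wake-up and been joined. */
int eintr_demo(void)
{
    struct sigaction sa = { 0 };
    sa.sa_handler = on_wake;        /* sa_flags stays 0: no SA_RESTART */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0 || sem_init(&woke, 0, 0) != 0)
        return -1;

    pthread_t tid;
    if (pthread_create(&tid, NULL, eintr_worker, NULL) != 0)
        return -1;

    atomic_store(&wake_requested, 1);
    long delay_ns = 1000000L;       /* start at 1 ms, then back off */
    for (;;) {
        pthread_kill(tid, SIGUSR1); /* may land just before poll(): resend */

        struct timespec deadline;
        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_nsec += delay_ns;
        if (deadline.tv_nsec >= 1000000000L) {
            deadline.tv_sec += 1;
            deadline.tv_nsec -= 1000000000L;
        }
        if (sem_timedwait(&woke, &deadline) == 0)
            break;                  /* worker confirmed */
        if (delay_ns < 500000000L)
            delay_ns *= 2;          /* exponential back-off, capped */
    }

    pthread_join(tid, NULL);
    sem_destroy(&woke);
    return 0;
}
```

The resend loop is what covers the race described in the first choice: a signal that arrives between the flag check and the `poll()` call interrupts nothing, so only a later send actually wakes the thread.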

Someone's upset about me pointing out that there's no such thing as killing a thread (in the context of POSIX threads, for which the question is tagged). — R.. GitHub STOP HELPING ICE, Sep 18 '18 at 18:34
