0

This is a really long question due to the code snippets and the detailed explanations. TL;DR, are there issues with the macros shown below, is this a reasonable solution, and if not then what is the most reasonable way to solve the issues presented below?

I am currently writing a C library which deals with POSIX threads, and must be able to handle thread cancellation cleanly. In particular, the library functions may be called from threads that were set to be cancellable (either PTHREAD_CANCEL_DEFFERED or PTHREAD_CANCEL_ASYNCHRONOUS canceltype) by the user.

Currently the library functions that interface with the user all begin with a call to pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate), and at each return point, I make sure that a call to pthread_setcancelstate(oldstate, &dummy) is made to restore whatever cancellation settings the thread had previously.

This basically prevents the thread from being cancelled while in the library code, thus ensuring that the global state remains consistent and resources were properly managed before returning.

This method unfortunately has a few drawbacks:

  1. One must be sure to restore the cancelstate at every return point. This makes it somewhat hard to manage if the function has nontrivial control flow with multiple return points. Forgetting to do so may lead to threads that don't get cancelled even after return from the library.

  2. We only really need to prevent cancellations at points where resources are being allocated or global state is inconsistent. A library function may in turn call other internal library functions that are cancel-safe, and ideally cancellations could occur at such points.

Here is a sample illustration of the issues:

#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <pthread.h>

static void do_some_long_computation(char *buffer, size_t len)
{
    (void)buffer; (void)len;
    /* This is really, really long! */
}

int mylib_function(size_t len)
{
        char *buffer;
        int oldstate, oldstate2;

        pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate);

        buffer = malloc(len);

        if (buffer == NULL) {
                pthread_setcancelstate(oldstate, &oldstate2);
                return -1;
        }

        do_some_long_computation(buffer, len);

        fd = open("results.txt", O_WRONLY);

        if (fd < 0) {
                free(buffer);
                pthread_setcancelstate(oldstate, &oldstate2);
                return -1;
        }

        write(fd, buffer, len); /* Normally also do error-check */
        close(fd);

        free(buffer);

        pthread_setcancelstate(oldstate, &oldstate2);

        return 0;
}

Here it is not so bad because there are only 3 return points. One could possibly even restructure the control flow in such a way as to force all paths to reach a single return point, perhaps with the goto cleanup pattern. But the second issue is still left unresolved. And imagine having to do that for many library functions.

The second issue may be resolved by wrapping each resource allocation with calls to pthread_setcancelstate that will only disable cancellations during resource allocation. While cancellations are disabled, we also push a cleanup handler (with pthread_cleanup_push). One could also move all resource allocations together (opening the file before doing the long computation).

While solving the second issue, it is still somewhat hard to maintain because each resource allocation needs to be wrapped under these pthread_setcancelstate and pthread_cleanup_[push|pop] calls. Also it might not always be possible to put all resource allocations together, for instance if they depend on the results of the computation. Moreover, the control flow needs to be changed because one cannot return between a pthread_cleanup_push and pthread_cleanup_pop pair (which would be the case if malloc returns NULL for example).

In order to solve both issues, I came up with another possible method that involves dirty hacks with macros. The idea is to simulate something like a critical section block in other languages, to insert a block of code in a "cancel-safe" scope.

This is what the library code would look like (compile with -c -Wall -Wextra -pedantic):

#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <pthread.h>

#include "cancelsafe.h"

static void do_some_long_computation(char *buffer, size_t len)
{
    (void)buffer; (void)len;
    /* This is really, really long! */
}

static void free_wrapper(void *arg)
{
        free(*(void **)arg);
}

static void close_wrapper(void *arg)
{
        close(*(int *)arg);
}

int mylib_function(size_t len)
{
        char *buffer;
        int fd;
        int rc;

        rc = 0;
        CANCELSAFE_INIT();

        CANCELSAFE_PUSH(free_wrapper, buffer) {
                buffer = malloc(len);

                if (buffer == NULL) {
                        rc = -1;
                        CANCELSAFE_BREAK(buffer);
                }
        }

        do_some_long_computation(buffer, len);

        CANCELSAFE_PUSH(close_wrapper, fd) {
                fd = open("results.txt", O_WRONLY);

                if (fd < 0) {
                        rc = -1;
                        CANCELSAFE_BREAK(fd);
                }
        }

        write(fd, buffer, len);

        CANCELSAFE_POP(fd, 1); /* close fd */
        CANCELSAFE_POP(buffer, 1); /* free buffer */

        CANCELSAFE_END();

        return rc;
}

This resolves both issues to some extent. The cancelstate settings and cleanup push/pop calls are implicit in the macros, so the programmer only has to specify the sections of code that need to be cancel-safe and what cleanup handlers to push. The rest is done behind the scenes, and the compiler will make sure each CANCELSAFE_PUSH is paired with a CANCELSAFE_POP.

The implementation of the macros is as follows:

#define CANCELSAFE_INIT() \
        do {\
                int CANCELSAFE_global_stop = 0

#define CANCELSAFE_PUSH(cleanup, ident) \
                do {\
                        int CANCELSAFE_oldstate_##ident, CANCELSAFE_oldstate2_##ident;\
                        int CANCELSAFE_stop_##ident;\
                        \
                        if (CANCELSAFE_global_stop)\
                                break;\
                        \
                        pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &CANCELSAFE_oldstate_##ident);\
                        pthread_cleanup_push(cleanup, &ident);\
                        for (CANCELSAFE_stop_##ident = 0; CANCELSAFE_stop_##ident == 0 && CANCELSAFE_global_stop == 0; CANCELSAFE_stop_##ident = 1, pthread_setcancelstate(CANCELSAFE_oldstate_##ident, &CANCELSAFE_oldstate2_##ident))

#define CANCELSAFE_BREAK(ident) \
                                do {\
                                        CANCELSAFE_global_stop = 1;\
                                        pthread_setcancelstate(CANCELSAFE_oldstate_##ident, &CANCELSAFE_oldstate2_##ident);\
                                        goto CANCELSAFE_POP_LABEL_##ident;\
                                } while (0)

#define CANCELSAFE_POP(ident, execute) \
CANCELSAFE_POP_LABEL_##ident:\
                        pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &CANCELSAFE_oldstate_##ident);\
                        pthread_cleanup_pop(execute);\
                        pthread_setcancelstate(CANCELSAFE_oldstate_##ident, &CANCELSAFE_oldstate2_##ident);\
                } while (0)

#define CANCELSAFE_END() \
        } while (0)

This combines several macro tricks that I have encountered before.

The do { } while (0) pattern is used to have a multiline function-like macro (with semicolon required).

The CANCELSAFE_PUSH and CANCELSAFE_POP macros are forced to come in pairs by the use of the same trick as the pthread_cleanup_push and pthread_cleanup_pop using unmatched { and } braces respectively (here it is unmatched do { and } while (0) instead).

The usage of the for loops is somewhat inspired by this question. The idea is that we want to call the pthread_setcancelstate function after the macro body to restore cancellations after the CANCELSAFE_PUSH block. I use a stop flag that is set to 1 at the second loop iteration.

The ident is the name of the variable that will be released (this needs to be a valid identifier). The cleanup_wrappers will be given its address, which will always be valid in a cleanup handler scope according to this answer. This is done because the value of the variable is not yet initialized at the point of cleanup push (and also doesn't work if the variable is not of pointer type).

The ident is also used to avoid name collisions in the temporary variables and labels by appending it as a suffix with the ## concatenation macro, giving them unique names.

The CANCELSAFE_BREAK macro is used to jump out of the cancelsafe block and right into the corresponding CANCELSAFE_POP_LABEL. This is inspired by the goto cleanup pattern, as mentioned here. It also sets the global stop flag.

The global stop is used to avoid cases were there might be two PUSH/POP pairs in the same scope level. This seems like an unlikely situation, but if this happens then the content of the macros is basically skipped when the global stop flag is set to 1. The CANCELSAFE_INIT and CANCELSAFE_END macros aren't crucial, they just avoid the need to declare the global stop flag ourselves. These could be skipped if the programmer always does all the pushes and then all the pops consecutively.

After expanding the macros, we obtain the following code for the mylib_function:

int mylib_function(size_t len)
{
        char *buffer;
        int fd;
        int rc;

        rc = 0;
        do {
                int CANCELSAFE_global_stop = 0;

                do {
                        int CANCELSAFE_oldstate_buffer, CANCELSAFE_oldstate2_buffer;
                        int CANCELSAFE_stop_buffer;

                        if (CANCELSAFE_global_stop)
                                break;

                        pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &CANCELSAFE_oldstate_buffer);
                        pthread_cleanup_push(free_wrapper, &buffer);
                        for (CANCELSAFE_stop_buffer = 0; CANCELSAFE_stop_buffer == 0 && CANCELSAFE_global_stop == 0; CANCELSAFE_stop_buffer = 1, pthread_setcancelstate(CANCELSAFE_oldstate_buffer, &CANCELSAFE_oldstate2_buffer)) {
                                buffer = malloc(len);

                                if (buffer == NULL) {
                                        rc = -1;
                                        do {
                                                CANCELSAFE_global_stop = 1;
                                                pthread_setcancelstate(CANCELSAFE_oldstate_buffer, &CANCELSAFE_oldstate2_buffer);
                                                goto CANCELSAFE_POP_LABEL_buffer;
                                        } while (0);
                                }
                        }

                        do_some_long_computation(buffer, len);

                        do {
                                int CANCELSAFE_oldstate_fd, CANCELSAFE_oldstate2_fd;
                                int CANCELSAFE_stop_fd;

                                if (CANCELSAFE_global_stop)
                                        break;

                                pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &CANCELSAFE_oldstate_fd);
                                pthread_cleanup_push(close_wrapper, &fd);
                                for (CANCELSAFE_stop_fd = 0; CANCELSAFE_stop_fd == 0 && CANCELSAFE_global_stop == 0; CANCELSAFE_stop_fd = 1, pthread_setcancelstate(CANCELSAFE_oldstate_fd, &CANCELSTATE_oldstate2_fd)) {
                                        fd = open("results.txt", O_WRONLY);

                                        if (fd < 0) {
                                                rc = -1;
                                                do {
                                                        CANCELSAFE_global_stop = 1;
                                                        pthread_setcancelstate(CANCELSAFE_oldstate_fd, &CANCELSAFE_oldstate2_fd);
                                                        goto CANCELSAFE_POP_LABEL_fd;
                                                } while (0);
                                        }
                                }

                                write(fd, buffer, len);

CANCELSAFE_POP_LABEL_fd:
                                pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &CANCELSAFE_oldstate_fd);
                                pthread_cleanup_pop(1);
                                pthread_setcancelstate(CANCELSAFE_oldstate_fd, &CANCELSAFE_oldstate2_fd);
                        } while (0);

CANCELSAFE_POP_LABEL_buffer:
                        pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &CANCELSAFE_oldstate_buffer);
                        pthread_cleanup_pop(1);
                        pthread_setcancelstate(CANCELSAFE_oldstate_buffer, &CANCELSAFE_oldstate2_buffer);
                } while (0);
        } while (0);

        return rc;
}

Now, this set of macros is horrendous to look at and it is somewhat tricky to understand how they work exactly. On the other hand, this is a one-time task, and once written, they can be left and the rest of the project can benefit from their nice benefits.

I would like to know if there are any issues with the macros that I may have overlooked, and whether there could be a better way to implement similar functionality. Also, which of the solutions proposed do you think would be the most reasonable? Are there other ideas that could work better to resolve these issues (or perhaps, are they really non-issues)?

Tob Ernack
  • 241
  • 2
  • 12
  • Seems to better suite a code review, right? Or are you facing something which actually "*does not work*"? – alk Jul 29 '17 at 07:13
  • I am not certain which stackexchange site is most appropriate for this question. I would be glad to have it migrated wherever appropriate. – Tob Ernack Jul 29 '17 at 07:16
  • 1
    I do not personally like macros like this for many reasons. In this case. It is much safer to use the inline functions. A bit more writing - a lot less debugging :). – 0___________ Jul 29 '17 at 07:38
  • You don't say which operating systems you need to target, and which toolchains. Some offer additional facilities to deal with thread cancellation, perhaps offering a cleaner approach. – Florian Weimer Jul 29 '17 at 07:54
  • Since I am trying to make the code as portable as possible, I am targetting any system that can support pthreads and pthread_cancel (in particular Linux and OS X), and standard C compiler (I am using gcc, but would strongly prefer to avoid gcc-specific magic). – Tob Ernack Jul 29 '17 at 09:57

1 Answers1

0

Unless you use asynchronous cancellation (which is always very problematic), you do not have to disable cancellation around malloc and free (and many other POSIX functions). Synchronous cancellation only happens at cancellation points, and these functions are not.

You are abusing the POSIX cancellation handling facilities to implement a scope exit hook. In general, if you find yourself doing things like this in C, you should seriously consider using C++ instead. This will give you a much more polished version of the feature, with ample documentation, and programmers will already have experience with it.

Florian Weimer
  • 32,022
  • 3
  • 48
  • 92