4

I apologise in advance for what will be a bit of a code dump, I've trimmed as much unimportant code as possible:

// Global vars / mutex stuff
extern char **environ;
pthread_mutex_t env_mutex = PTHREAD_MUTEX_INITIALIZER;

int
putenv_r(char *string)
{
    int len;
    int key_len = 0;
    int i;

    sigset_t block;
    sigset_t old;

    sigfillset(&block);
    pthread_sigmask(SIG_BLOCK, &block, &old);

    // This function is thread-safe
    len = strlen(string);

    for (int i=0; i < len; i++) {
        if (string[i] == '=') {
            key_len = i; // Thanks Klas for pointing this out.
            break;
        }
    }
    // Need a string like key=value
    if (key_len == 0) {
        errno = EINVAL; // putenv doesn't normally return this err code
        return -1;
    }

    // We're moving into environ territory so start locking stuff up.
    pthread_mutex_lock(&env_mutex);

    for (i = 0; environ[i] != NULL; i++) {
        if (strncmp(string, environ[i], key_len) == 0) {
            // Pointer assignment, so if string changes so does the env. 
            // This behaviour is POSIX conformant, instead of making a copy.
            environ[i] = string;
            pthread_mutex_unlock(&env_mutex);
            return(0);
        }
    }

    // If we get here, the env var didn't already exist, so we add it.
    // Note that malloc isn't async-signal safe. This is why we block signals.
    environ[i] = malloc(sizeof(char *));
    environ[i] = string;
    environ[i+1] = NULL;
    // This ^ is possibly incorrect, do I need to grow environ some how?

    pthread_mutex_unlock(&env_mutex);
    pthread_sigmask(SIG_SETMASK, &old, NULL);

    return(0);
}

As the title says, I'm trying to code a thread safe, async-signal safe reentrant version of putenv. The code works in that it sets the environment variable like putenv would, but I do have a few concerns:

  1. My method for making it async-signal safe feels a bit ham-handed, just blocking all signals (except SIGKILL/SIGSTOP of course). Or is this the most appropriate way to go about it.
  2. Is the location of my signal blocking too conservative? I know strlen isn't guaranteed to be async-signal safe, meaning that my signal blocking has to occur beforehand, but perhaps I'm mistaken.
  3. I'm fairly sure that it is thread safe, considering that all the functions are thread-safe and that I lock interactions with environ, but I'd love to be proven otherwise.
  4. I'm really not too sure about whether it's reentrant or not. While not guaranteed, I imagine that if I tick the other two boxes it'll most likely be reentrant?

I found another solution to this question here, in which they just set up the appropriate signal blocking and mutex locking (sick rhymes) and then call putenv normally. Is this valid? If so, it's obviously far simpler than my approach.

Sorry about the large block of code, I hope I've established a MCVE. I'm missing a bit of error checking in my code for brevity's sake. Thanks!


Here is the rest of the code, including a main, if you wish to test the code yourself:

#include <string.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>
#include <signal.h>

// Prototypes
static void thread_init(void);
int putenv_r(char *string);

int
main(int argc, char *argv[]) {

    int ret = putenv_r("mykey=myval");
    printf("%d: mykey = %s\n", ret, getenv("mykey"));

    return 0;
}
Daniel Porteous
  • 5,536
  • 3
  • 25
  • 44
  • Off-topic, but... since C has zero based arrays, it should be `key_len = i;` not `key_len = i - 1;` – Klas Lindbäck Dec 21 '16 at 08:12
  • @KlasLindbäck I use `i - 1` because I want to `strncmp` the *key* part of *key=value*. `i` at the time the loop breaks is at the position of the `=` sign, so I need to subtract 1 from the index. – Daniel Porteous Dec 21 '16 at 08:26
  • 1
    `strncmp` wants the number of characters, not the index of the last character. If you have `foo=bar` then you get `i = 3` and `key_len = 2` and your `strncmp` will only compare the 2 first characters... – Klas Lindbäck Dec 21 '16 at 09:41
  • @KlasLindbäck Ah of course, thanks for pointing that out. Like you said, off topic, but a good error to fix up! – Daniel Porteous Dec 21 '16 at 11:26
  • What does `malloc()` being async-signal-unsafe have to do with blocking signals? I'm missing something. – Andrew Henle Dec 21 '16 at 11:57
  • I might be getting that confused with reentrancy. Isn't it true that if `malloc` is interrupted (by a signal perhaps, hence my confusion) then it cannot be resumed safely because it could mess up the stack?. – Daniel Porteous Dec 21 '16 at 12:01
  • @DanielPorteous If `malloc()` does get interrupted by a signal, that will be invisible to the caller. – Andrew Henle Dec 21 '16 at 12:49
  • Right. I think I was getting confused with not being allowed to call `malloc()` from within a signal handler, in which case not being async-signal safe is a problem. – Daniel Porteous Dec 21 '16 at 12:50

1 Answers1

3

This code is a problem:

// If we get here, the env var didn't already exist, so we add it.
// Note that malloc isn't async-signal safe. This is why we block signals.
environ[i] = malloc(sizeof(char *));
environ[i] = string;

It creates a char * on the heap, assigns the address of that char * to environ[i], then overwrites that value with the address contained in string. That's not going to work. It doesn't guarantee that environ is NULL-terminated afterwards.

Because char **environ is a pointer to an array of pointers. The last pointer in the array is NULL - that's how code can tell it's reached the end of the list of environment variables.

Something like this should work better:

unsigned int envCount;

for ( envCount = 0; environ[ envCount ]; envCount++ )
{
    /* empty loop */;
}

/* since environ[ envCount ] is NULL, the environ array
   of pointers has envCount + 1 elements in it */
envCount++;

/* grow the environ array by one pointer */
char ** newEnviron = realloc( environ, ( envCount + 1 ) * sizeof( char * ) );

/* add the new envval */
newEnviron[ envCount - 1 ] = newEnvval;

/* NULL-terminate the array of pointers */
newEnviron[ envCount ] = NULL;

environ = newEnviron;

Note that there's no error checking, and it assumes the original environ array was obtained via a call to malloc() or similar. If that assumption is wrong, the behavior is undefined.

Andrew Henle
  • 32,625
  • 3
  • 24
  • 56
  • I'm just using `environ` as it is supplied naturally following a call to `execve`, I'm not sure if that entails `malloc`. Thanks for pointing this out though, I had a vibe that the way I was adding a new element to `environ` was insufficient. – Daniel Porteous Dec 21 '16 at 12:04
  • I will update my question to reflect these suggestions, perhaps bypassing the `newEnviron` steps and just working directly with `environ`. I feel the code still has a ways to go however. – Daniel Porteous Dec 21 '16 at 12:11
  • @DanielPorteous If you work directly on `environ` itself, the code won't be thread-safe. Strictly speaking, even `environ = newEnviron;` isn't necessarily thread-safe as updating a pointer value is not necessarily an atomic operation. – Andrew Henle Dec 21 '16 at 12:17
  • Even though I'm using a mutex to control access to environ? As long as I make sure all my threads are respecting these mutexes as well, it will be thread safe yes? If not, if it's not too much trouble could you explain why it wouldn't be? – Daniel Porteous Dec 21 '16 at 12:21
  • @DanielPorteous Yes, even though you're using a mutex. Because your mutex only protects *your* update code, not anything that reads the environment nor any code that updates the environment through other means. Because of the potential for other threads to read the environment, the array of pointers the `environ` points to must be valid at all times. – Andrew Henle Dec 21 '16 at 12:47
  • I see your point. Even if I did happen to be able to confirm all the threads won't misbehave, it's certainly much more robust to minimise the thread-unsafe time. I'll update the code accordingly, thanks. I'll leave the question open for a while longer of course to see if anyone else notices anything, but soon I'll mark your answer as correct, you've been incredibly helpful. – Daniel Porteous Dec 21 '16 at 12:52
  • Update: Seems like environ is indeed not created by malloc by default, the realloc is causing a big "invalid pointer" error. Might have to revert to editing `environ` itself despite the thread safety risks. Not sure how to increase the size of `environ` though. – Daniel Porteous Dec 21 '16 at 13:04
  • @DanielPorteous If you're running glibc/Linux, the source code for `putenv()` can be found at https://fossies.org/dox/glibc-2.24/putenv_8c_source.html and the actual environment implementation at https://fossies.org/dox/glibc-2.24/setenv_8c_source.html – Andrew Henle Dec 21 '16 at 14:23