0

I have a worker thread doing a polling activity (it's targeting libcurl but that is not relevant).

Many actions from the application can start a worker thread but if it is already running, no new thread should be created. A new request from the application is simply joined with any other polling already ongoing.

In summary, a single worker thread that exits when there is nothing left to do.

The code is:

pthread_t thread = NULL;
struct timespec sleep_val = {0, 20000000};
void *worker_thread(void *threadid)
{
    do {
        poll();
        nanosleep(&sleep_val, NULL);
    } while (no_of_handles_running > 0);
    pthread_exit(NULL);
    thread = NULL;
}

void start_worker_thread_if_needed() {
    if (thread == NULL) {
        pthread_create(&thread, NULL, worker_thread, NULL);
    }
}

My doubt is about thread safety. The function start_worker_thread_if_needed() can be called at any time any number of times.

So, what if start_worker_thread_if_needed() is called exactly when we exit the while loop and therefor interrupts the worker thread. If that happens the condition if (thread == NULL) will be FALSE since the worker thread is interrupted and pthread_exit + thread = NULL has not yet completed.

So now start_worker_thread_if_needed() will exit without creating a new thread but as soon as control is returned to the old worker thread it will continue to the pthread_exit line and the worker thread is destroyed.

The issue will be that the request that was just made triggering start_worker_thread_if_needed() will be lost and the polling will not start until the next call to start_worker_thread_if_needed().

Edit: Here is a suggestion with mutex but I still have the same doubt. What happens if the main thread interrupts just after we exit the while loop and before the worker can take the mutex. Then the main thread does not create a new thread and then the worker exit.

void *worker_thread(void *threadid)
{
    do {
        poll();
        nanosleep(&sleep_val, NULL);
    } while (no_of_handles_running > 0);
    pthread_mutex_lock(&my_mutex);
    thread = NULL;
    pthread_mutex_unlock(&my_mutex);
    pthread_exit(NULL);
}

void start_worker_thread_if_needed() {
    pthread_mutex_lock(&my_mutex);
    if (thread == NULL) {
        pthread_create(&thread, NULL, worker_thread, NULL);
    }
    pthread_mutex_unlock(&my_mutex);
}

I think I have a flaw in how I have set this up and it would be highly appreciated if someone could help out with the proper solution.

user1816142
  • 1,199
  • 2
  • 9
  • 16
  • What are you expecting to see? – HoKy22 Feb 17 '16 at 21:45
  • This is what mutexes are for. Also, you can't do anything after you call `pthread_exit` since the thread has exited, so it won't do the `thread = NULL;` operation. – David Schwartz Feb 17 '16 at 21:46
  • My expectation was another version of the code above solving the problem I tried to describe. I though about using mutexes but I did not see how it would fix the problem. – user1816142 Feb 17 '16 at 21:48
  • Yes your doubts are well founded - what you have is called a race condition. Since the `thread` variable is being accessed by multiple threads you need to properly protect access to it. Most common method is with a mutex. See [`pthread_mutex_lock` and `pthread_mutex_unlock`](http://linux.die.net/man/3/pthread_mutex_lock) for more details. – kaylum Feb 17 '16 at 21:50
  • Ignoring the `pthread_exit` problem pointed out by David, and ignoring visibility issues: Imagine if the worker thread sees `no_of_handles_running == 0`, and so it exits the loop, and then the main thread sees there's a worker thread running so it doesn't start one, and then the worker thread exits without seeing the new work. – user253751 Feb 17 '16 at 21:53
  • @DavidSchwartz, great catch with the = NULL after thread_exit. – user1816142 Feb 17 '16 at 21:56
  • @immibis, Exactly, this is the problem I am trying to solve. Main thread think there is already a worker since it was interrupted before setting thread=NULL and therefore does not start a new one. This is what I am trying to solve. – user1816142 Feb 17 '16 at 21:56
  • @user1816142 Using a mutex can fix the problem because it ensures only one thread will access the `thread` variable at any given time. The parts of the code that access a shared resource is referred to as the "critical section". All critical sections need to be protected by locking the mutex preceding the CS and unlocking after the CS. – kaylum Feb 17 '16 at 21:56
  • @user1816142 That's exactly the problem a mutex would solve. – David Schwartz Feb 17 '16 at 21:58
  • @DavidSchwartz, Thanks for trying to help guys. Maybe I am just tired, I added an example above with a mutex but I still fear I might have the same problem. – user1816142 Feb 17 '16 at 22:08
  • Your `pthread_mutex_unlock` is in the wrong spot in `start_worker_thread_if_needed`. It needs to go after the `if` block (ie, you must always unlock the mutex). Other than that it should work fine. Oh, assuming you have correctly initialised `my_mutex` of course (that is not shown). – kaylum Feb 17 '16 at 22:11
  • Why exit the thread? why not have the thread sleep while there's nothing to do? I would consider using a blocking pipe IO to send messages to the thread. The thread will simply block on an IO read until it receives a task through the pipe. You ca look into [this library](https://github.com/boazsegev/c-server-tools/blob/master/lib/libasync.c) as an example for using pipes for thread communication. – Myst Feb 17 '16 at 22:16
  • @kaylum, sorry about the unlock, that was a silly mistake. The init is not shown, I just did this as a quick example. I still have the feeling that if that if we exit the loop but get interrupted before the worker thread can take the lock, then the main thread will take the lock, see that there is already a worker thread running and do nothing. Then the worker can continue but now it is outside the while loop, it sets the worker to NULL and exits. However the previous call from the main thread is lost right since no worker was started and the old one had already left the while loop :-) – user1816142 Feb 17 '16 at 22:27
  • It sounds like you could be right. But it is difficult to recommend a full solution for you because your code is incomplete. In particular, it appears that `no_of_handles_running` is also a shared variable that can be accessed by multiple threads which would mean it also needs to be protected. Also, as asked above already..why do you exit the worker thread at all? Why not just let it wake up periodically to do its polling (or better still, signal it to wake up when there is work to be done)? – kaylum Feb 17 '16 at 22:33
  • no_of_handles_running is safe it is only used inside the poll function which is only called by the worker thread so that part is ok. For the frequency I need it to be super responsive when the is work to do so I do not want a longer sleep. On the other hand jobs might be infrequent so I do not want a thread doing 20 msec polling if not needed. – user1816142 Feb 17 '16 at 22:38
  • Ok but there must be something inside the poll function which is shared with other threads. Because other threads need to queue up work for the poll function. Is that not the case? Also, for responsiveness you should not be using polling at all then. Your code should be event driven - that is, the worker thread is woken up exactly when there is something to do. – kaylum Feb 17 '16 at 22:41
  • The signal part of your comment is interesting. This is for a cross platform project. Is there such a mechanism built into posix? – user1816142 Feb 17 '16 at 22:41
  • Yes the active handles value is provided by libcurl. I ask for it inside the poll function. – user1816142 Feb 17 '16 at 22:43
  • Indeed libcurl can be used with libev but my code currently does not support that version. – user1816142 Feb 17 '16 at 22:44
  • Again, it is difficult to tell you what to do without the full context of your requirements. There are many different ways to implement event driven logic. For example, if you can monitor file descriptors then [`poll`](http://linux.die.net/man/2/poll) or [`select`](http://linux.die.net/man/2/select) can be used. If your requirements are broader then another option is condition variables - see [`pthread_cond_wait`](http://linux.die.net/man/3/pthread_cond_wait) and [`pthread_cond_signal`](http://linux.die.net/man/3/pthread_cond_signal). – kaylum Feb 17 '16 at 22:47
  • Thank you so much for your suggestions. I will read up on condition variables first thing tomorrow morning (it is midnight here). – user1816142 Feb 17 '16 at 22:53
  • @kaylum. Many thanks for the pointer on condition variables. I think it did exactly what I needed. Posted final code below. – user1816142 Feb 18 '16 at 10:34

1 Answers1

0

Thanks to for all the help in the comments above. Kaylums suggestion to use condition variables seems to be a good approach to not risk ending up with the corner case I was concerned about.

The final code (with logs removed) is the one below and it appears to work just fine.

static int no_of_handles_running;
static int RUN_THREAD = 0;
pthread_t thread = NULL;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cv  = PTHREAD_COND_INITIALIZER;;

void *worker_thread(void *threadid)
{
    RUN_THREAD = 1;
    do {
        poll();  //Will update no_of_handles_running
        if (no_of_handles_running == 0) {
            pthread_mutex_lock(&mutex);
            pthread_cond_wait(&cv, &mutex);
            pthread_mutex_unlock(&mutex);
        }
    } while (RUN_THREAD);
    thread = NULL;
    pthread_exit(NULL);

}
void start_worker_thread() {
    if (thread == NULL) {
        int rc = pthread_create(&thread, NULL, worker_thread, NULL);
        if (rc){
            return;
        } 
    }
    pthread_mutex_lock(&mutex);
    pthread_cond_signal(&cv);
    pthread_mutex_unlock(&mutex);
}

void stop_worker_thread() {
    RUN_THREAD = 0;
    pthread_mutex_lock(&mutex);
    pthread_cond_signal(&cv);
    pthread_mutex_unlock(&mutex);
}
user1816142
  • 1,199
  • 2
  • 9
  • 16