Control the order of the threads ending

Question

I have 4 Threads (Thread_A - Thread_D). I want them to end in the order A, B, C, D. It must be solved with semaphores.

Whats wrong with my code? In most cases it's fine, but sometimes it's in the wrong order.

#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>
#include <semaphore.h>

void* thread_A (void *arg);
void* thread_B (void *arg);
void* thread_C (void *arg);
void* thread_D (void *arg);

typedef struct sem_package {
    sem_t* sem1;
    sem_t* sem2;
} sem_package;

void* thread_A (void *arg) {
    sem_t* sem = (sem_t*) arg;

    int random_sleep = (int) (500 + ((random()) / RAND_MAX) * 2000);
    struct timespec sleep_time;
    sleep_time.tv_sec = random_sleep / 1000;
    sleep_time.tv_nsec = (random_sleep % 1000) * 1000000;

    nanosleep(&sleep_time, NULL);

    sem_post(sem);  //unblock B
    printf("\nA\n");
    return NULL;
}

void* thread_B (void *arg) {
    sem_package* pack = (sem_package*) arg;

    int random_sleep = (int) (500 + ((random()) / RAND_MAX) * 2000);
    struct timespec sleep_time;
    sleep_time.tv_sec = random_sleep / 1000;
    sleep_time.tv_nsec = (random_sleep % 1000) * 1000000;

    nanosleep(&sleep_time, NULL);

    sem_wait(pack->sem1);   //wait for A
    sem_post(pack->sem2);   //unblock C
    printf("\nB\n");
    return NULL;    
}

void* thread_C (void *arg) {
    sem_package* pack = (sem_package*) arg;

    int random_sleep = (int) (500 + ((random()) / RAND_MAX) * 2000);
    struct timespec sleep_time;
    sleep_time.tv_sec = random_sleep / 1000;
    sleep_time.tv_nsec = (random_sleep % 1000) * 1000000;

    nanosleep(&sleep_time, NULL);

    sem_wait(pack->sem2);   //wait for B
    sem_post(pack->sem1);   //unblock D
    printf("\nC\n");
    return NULL;
}

void* thread_D (void *arg) {
    sem_t* sem = (sem_t*) arg;

    int random_sleep = (int) (500 + ((random()) / RAND_MAX) * 2000);
    struct timespec sleep_time;
    sleep_time.tv_sec = random_sleep / 1000;
    sleep_time.tv_nsec = (random_sleep % 1000) * 1000000;

    nanosleep(&sleep_time, NULL);

    sem_wait(sem);  //wait for C
    printf("\nD\n");

    return NULL;
}

int main () {
    srandom((unsigned int) time(NULL)); 

    pthread_t threadA, threadB, threadC, threadD;

    sem_t sem_A_B;
    sem_t sem_C_D;
    sem_t sem_B_C;

    sem_init(&sem_A_B,0,0);
    sem_init(&sem_C_D,0,0);
    sem_init(&sem_B_C,0,0);

    struct sem_package pack1;
    pack1.sem1=&sem_A_B;
    pack1.sem2=&sem_B_C;

    struct sem_package pack2;
    pack2.sem1=&sem_C_D;
    pack2.sem2=&sem_B_C;

    pthread_create(&threadA,NULL,thread_A,&sem_A_B);
    pthread_create(&threadB,NULL,thread_B,&pack1);
    pthread_create(&threadC,NULL,thread_C,&pack2);
    pthread_create(&threadD,NULL,thread_D,&sem_C_D);

    long m;
    pthread_join(threadD, (void **) &m);

    sem_destroy(&sem_A_B);
    sem_destroy(&sem_B_C);
    sem_destroy(&sem_C_D);

    pthread_detach(threadA);
    pthread_detach(threadB);
    pthread_detach(threadC);

    return 0;
}

@n.m. because of the printf in the functions. Yea its true , there could happen an sem_post and right after it could loose his turn and another thread could start working because of the post — Muco, May 07 '16 at 19:09
One, important trick with mutiple threads is to design so that such things don't matter. @n.m. has alredy pointed out how difficult it is to ensure such things. Best to not bother trying. — Martin James, May 07 '16 at 19:11
`printf` is not synchronized in any way and output can appear in any order. — n. m. could be an AI, May 07 '16 at 19:11
@MartinJames of course you are right, but i have to do it because of the school.. — Muco, May 07 '16 at 19:14
what can i use to check in which order they ended? Is printf before post enough? — Muco, May 07 '16 at 19:16
This is tangential to the issue of the order in which the threads finish, but one thing that's wrong is that you don't add the thread name to the `struct sem_package` so that you can have just a single thread function that gets all the information it needs from the arguments passed in. You'd have a little cleanup work to do, but it would reduce your code quite dramatically. — Jonathan Leffler, May 07 '16 at 19:19
@Muco : you could try pointing `pthread_create` to a proxy function that actually calls the real thread function saved in a `struct` passed into it as a parameter. This way, you could set up a debugger to break when leaving that proxy function. However, the execution order may be different when running under a debugger... — Daniel Kamil Kozar, May 07 '16 at 19:19
Printing plus `fflush()` after `sem_wait()` and before `sem_post()` would give you a decent chance. — Jonathan Leffler, May 07 '16 at 19:20
If `main` waits for thread A to finish with `pthread_join`, and then posts the semaphore for B, then you would know that A was completely finished before B. — user3386109, May 07 '16 at 19:24
@JonathanLeffler did exactly what you said, now it's always in the same order. I don't understand why, but it works. — Muco, May 07 '16 at 19:42
It works because (1) the threads can't print until they've received the 'go' signal from their predecessor, if they have a predecessor, and (2) only one thread can be printing at a time because they print when they have permission to go and before they grant the next thread permission to go. I think you could drop the `fflush()`; they're all using the same stream, one at a time in sequence, so the output should be deterministically interleaved in the correct order even without the `fflush()`. — Jonathan Leffler, May 07 '16 at 21:46
After a bit of checking, the `fflush()` calls seem to be necessary. — Jonathan Leffler, May 07 '16 at 22:01

Jonathan Leffler · Answer 1 · 2016-05-08T17:37:50.727

In the comments, I noted:

This is tangential to the issue of the order in which the threads finish, but one thing that's wrong is that you don't add the thread name to the struct sem_package so that you can have just a single thread function that gets all the information it needs from the arguments passed in. You'd have a little cleanup work to do, but it would reduce your code quite dramatically.

I also suggested:

Printing plus fflush() after sem_wait() and before sem_post() would give you a decent chance [of getting the thread termination messages printed in the correct order].

To which Muco commented:

[I] did exactly what you said, now it's always in the same order. I don't understand why, but it works.

And I responded:

It works because (1) the threads can't print until they've received the 'go' signal from their predecessor, if they have a predecessor, and (2) only one thread can be printing at a time because they print when they have permission to go and before they grant the next thread permission to go.

I mentioned the possibility of not using fflush(), but that seems to be a bad idea (experimenting with GCC 6.1.0 on Mac OS X 10.11.4). I have not yet worked out why the fflush() calls are necessary (but see section on 'Always check for errors' below).

Here's some code that implements one thread function that serves for all. It uses a C99 feature (designated initializers) to initialize the time structure. The #pragma at the top suppresses the warnings about sem_init() and sem_destroy() being deprecated on Mac OS X (see Why are sem_init(), sem_getvalue(), sem_destroy() deprecated on Mac OS X — and what replaces them? for the details).

/* Pragma needed on Mac OS X to suppress warnings about sem_init() and sem_destroy() */
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>
#include <semaphore.h>

static void *thread_function(void *arg);

typedef struct sem_package
{
    sem_t *sem1;
    sem_t *sem2;
    char  *name;
} sem_package;

static
void *thread_function(void *arg)
{
    sem_package *pack = (sem_package *)arg;

    int random_sleep = (int)(500 + ((random()) / RAND_MAX) * 2000);
    struct timespec sleep_time = { .tv_sec  = random_sleep / 1000,
                                   .tv_nsec = (random_sleep % 1000) * 1000000
                                 };

    nanosleep(&sleep_time, NULL);

    if (pack->sem1)
        sem_wait(pack->sem1);   // wait for predecessor
    printf("\n%s\n", pack->name);
    fflush(stdout);
    if (pack->sem2)
        sem_post(pack->sem2);   // unblock successor
    return NULL;
}

int main(void)
{
    srandom((unsigned int)time(NULL));

    pthread_t threadA, threadB, threadC, threadD;

    sem_t sem_A_B;
    sem_t sem_C_D;
    sem_t sem_B_C;

    sem_init(&sem_A_B, 0, 0);
    sem_init(&sem_C_D, 0, 0);
    sem_init(&sem_B_C, 0, 0);

    struct sem_package pack1 = { NULL,     &sem_A_B, "A" };
    struct sem_package pack2 = { &sem_A_B, &sem_B_C, "B" };
    struct sem_package pack3 = { &sem_B_C, &sem_C_D, "C" };
    struct sem_package pack4 = { &sem_C_D, NULL,     "D" };

    pthread_create(&threadA, NULL, thread_function, &pack1);
    pthread_create(&threadB, NULL, thread_function, &pack2);
    pthread_create(&threadC, NULL, thread_function, &pack3);
    pthread_create(&threadD, NULL, thread_function, &pack4);

    void *vp;
    pthread_join(threadD, &vp);

    sem_destroy(&sem_A_B);
    sem_destroy(&sem_B_C);
    sem_destroy(&sem_C_D);

    pthread_detach(threadA);
    pthread_detach(threadB);
    pthread_detach(threadC);

    printf("\nAll done\n\n");
    return 0;
}

Sample output:

A

B

C

D

All done

Always check for errors

To try and work out why I was seeing erratic behaviour without fflush(), I added relatively comprehensive error checking on the system calls. The initial run was a sober reminder of why it is important to check the return values from system calls:

$ ./pthread-37
pthread-37: sem_init(): error (78) Function not implemented
$

Frankly, I'd rather the system didn't provide the entry point that says "I'll pretend it's here but it isn't really here". This is an odd meaning for 'deprecated' too; normally, that means "it is present (and works) but it may go missing in the future". However, there is at least a solid explanation for why the semaphores didn't seem to be enforcing order — they don't exist, so they can't enforce the order.

All this applies to the Mac. When I tried it in an Ubuntu 14.04 LTS VM with GCC 4.8.4, then the code works correctly — even with the error checking and without the fflush() calls. This is the sane behaviour.

Object lessons:

Check return values from system calls.
Mac OS X does not implement sem_init().

You moved `sem_post` after `printf`. WIth that change, `fflush` should no longer be necessary, — Employed Russian, May 07 '16 at 23:22
@EmployedRussian: I agree, but…When I run the code without the `fflush()` on a Mac, I get the output `A B D C All done` (spread over multiple lines). That may indicate a flaw in the Mac OS X implementation of pthreads, but it is empirically what I got on the first (re)run without `fflush()`. As I said, I don't understand why the `fflush()` was necessary, but it seems to be necessary. — Jonathan Leffler, May 07 '16 at 23:30
@JonathanLeffler Yesterday i only tried with flush, but now i also tried without the flush(). It also works without flush() for me on ubuntu. Also thanks for the code below. — Muco, May 08 '16 at 07:55
@EmployedRussian: See my update — Mac OS X has a dummy entry point for `sem_init()` which simply returns 'not implemented' as an error. Not very helpful. — Jonathan Leffler, May 08 '16 at 17:41

score 0 · Answer 2 · answered May 07 '16 at 21:33

I want them to end in the order A, B, C, D.

You have to be a bit more precise on what you mean by that. Consider the following possible operation interleaving (time increases from top to bottom):

Thread A                      Thread B
sem_post(sem); //unblock B
                              sem_wait(pack->sem1);   //wait for A
                              sem_post(pack->sem2);   //unblock C
                              printf("\nB\n");
                              return NULL;    
printf("\nA\n");
return NULL;

If above interleaving were to take place, you will have B printed before A and thread B "ending" (reaching return) before thread A.

Now consider this other interleaving:

Thread A                      Thread B
sem_post(sem); //unblock B
                              sem_wait(pack->sem1);   //wait for A
                              sem_post(pack->sem2);   //unblock C
                              printf("\nB\n");
printf("\nA\n");
return NULL;
                              return NULL;

Here you still have B printed before A, but now thread A "ends" before thread B does. You can also have A printed before B with thread A ending before or after thread B (I leave the interleaving as an exercise to you).

So what's the answer?

What you are probably interested is not the "thread A ends before thread B", but "thread A ends its useful work before thread B". If we define useful work as printing the A or B, then to fix your problem you have to move the printf calls before you wake up another thread.

Control the order of the threads ending

2 Answers2

Always check for errors