0

What this program is trying to accomplish:

This program is supposed to synchronize several threads of "visitors" and "cars". Visitors wander around for a random amount of time until they decide to go on a car ride. If they are first in line for a car ride and a car is available they take a ride, else they must wait till they are first in line or a car comes back. If there are no visitors in line the cars wait in order until a visitor wants to go on a ride.

More background info:

I remade my thread synchronization program using condition variables as suggested in the accepted answer here. I know I am on the right track but my program still deadlocks for some reason and for the life of me I cannot figure out why. I don't know how you can help me unless I give you the code, so here it is:

Problems:

1.) deadlock after a short bit

2.) sometimes a visitor is first in line for a car, but never gets in a car.

The Solution:

There were just too many bugs in my code... and I think as I would fix one I was often (inadvertently) introducing another one. I just kept removing features of the program until I had eliminated all the bugs, then built features back in one by one in a manner that would not deadlock my program. Thank you all for your suggestions.

Community
  • 1
  • 1
ubiquibacon
  • 10,451
  • 28
  • 109
  • 179
  • 1
    Walls of code like this usually don't get much attention. Try reducing it to a minimal subset that still exhibits the problem. – Marcelo Cantos Dec 07 '10 at 21:36
  • Please describe what this program is trying to accomplish. What procedure is it modeling? What rules must it follow? What invariants must not be violated? What do the start and final states look like? – Jeremy W. Sherman Dec 07 '10 at 21:46
  • what's the backtrace from the deadlock? that's often helpful – Havoc P Dec 07 '10 at 22:15

3 Answers3

5

Firstly, xscott is correct that you're using mutexes incorrectly. It doesn't matter if it appears to work for a short while, it's still wrong, and is probably only appearing to work due to sheer chance.

Rather than go through your code line-by-line, I think the best approach is to build up the design from first principles. I would describe the basic algorithm that I think you're after like this:

visitor {
    sleep
    join end of visitor queue
    wait until at head of visitor queue
    wait until there is a car free
    remove car from car queue
    remove self from visitor queue
    occupy car
    wait until not in car anymore
}

car {
    join end of car queue
    wait until occupied
    sleep
    eject visitor from car
}

Note that I don't explicitly mark the wakeup points - just the waits. This is the best approach - figure out where you need to wait for something to change state, then you just need to put the wakeups (signalling the condition variable) whenever that state changes.

The next step would be to identify the major shared data that needs to be protected by mutexes. I see:

- The visitor queue;
- The car queue;
- The status of each car.

So the most granular approach would be to have one mutex for the visitor queue (we can use your v_mutex), one for the car queue (c_mutex) and one for each car (sc[CARS]). Alternatively, you could use c_mutex to protect the car queue and the status of every car; or just use v_mutex to protect everything. But for the sake of learning, we'll go with the more complicated approach.

The next step is to identify the waiting points where condition variables will be useful. For wait until at head of visitor queue, your per-visitor condition variables (v_cond) will be fine; for wait until there is a car free it will be easiest to add another condition variable (v_car_cond). For wait until occupied the per-car condition variables c_cond are appropriate. For wait until not in car anymore either v_cond or c_cond could be used, since there's a 1-to-1 nexus between cars and visitors at that point. There is no need for v_cond2.

We can now re-write the pseudo-code above in terms of pthreads primitives. In most cases this is very close to what you already had - so you were definitely on the right track. First the visitor:

    /* join end of visitor queue */
    pthread_mutex_lock(&v_mutex);
    v_line[v_id] = set_visitor_place_in_line();
    printf("Visitor %d is %d in line for a ride\n", v_id, v_line[v_id]);
    pthread_mutex_unlock(&v_mutex);

    /* wait until first in line */
    pthread_mutex_lock(&v_mutex);
    while (v_line[v_id] != 1) {
        pthread_cond_wait(&v_cond[v_id], &v_mutex);
    }
    pthread_mutex_unlock(&v_mutex);

    /* wait until there is a car free */
    pthread_mutex_lock(&c_mutex);
    while ((car = get_next_car()) == CARS) {
        pthread_cond_wait(&v_car_cond, &c_mutex);
    }
    pthread_mutex_unlock(&c_mutex);

    /* remove car from car queue */
    pthread_mutex_lock(&c_mutex);
    move_car_line();
    /* NOTE: We do not need to signal v_car_cond here, because only the first
     * visitor in line can be waiting there, and we are still the first visitor
     * in line at this point. */
    pthread_mutex_unlock(&c_mutex);

    /* remove self from visitor queue */
    pthread_mutex_lock(&v_mutex);
    move_passenger_line();
    next_v = get_next_passenger();
    if (next_v < VISITORS)
        pthread_cond_signal(&v_cond[next_v]);
    pthread_mutex_unlock(&v_mutex);

    /* occupy car */
    pthread_mutex_lock(&sc[car]);
    c_state[car] = v_id;
    pthread_cond_signal(&c_cond[car]);
    pthread_mutex_unlock(&sc[car]);

    /* wait until not in car anymore */
    pthread_mutex_lock(&sc[car]);
    while(c_state[car] == v_id) {
        pthread_cond_wait(&v_cond[v_id], &sc[car]);
    }
    pthread_mutex_unlock(&sc[car]);

Second, the car:

    /* join end of car queue */
    pthread_mutex_lock(&c_mutex);
    c_line[c_id] = set_car_place_in_line();
    if (c_line[c_id] == 1)
        pthread_cond_signal(&v_car_cond);
    pthread_mutex_unlock(&c_mutex);

    /* wait until occupied */
    pthread_mutex_lock(&sc[c_id]);
    while ((v_id = c_state[c_id]) == VISITORS) {
        pthread_cond_wait(&c_cond[c_id], &sc[c_id]);
    }
    pthread_mutex_unlock(&sc[c_id]);

    /* visitor is on car ride for random amount of time */
    sleep(rand()%5);

    /* eject visitor from car */
    pthread_mutex_lock(&sc[c_id]);
    c_state[c_id] = VISITORS;
    pthread_cond_signal(&v_cond[v_id]);
    pthread_mutex_unlock(&sc[c_id]);

The above can be simplified - wherever there is a pthread_mutex_unlock() followed immediately by a pthread_mutex_lock() of the same mutex, the unlock/lock pair can be removed.

PS:

You shouldn't worry about your cars joining the car queue in the "wrong order" - they're going to get out of order as they trundle around the park anyhow. If you really care about this, put the cars into the queue in the main thread before starting any of the car threads, and change the car code so that it re-adds itself to the queue at the end of its main loop.


The overall code with this approach, leaving out your global variables and helper functions which are fine, looks like this:

pthread_mutex_t c_mutex = PTHREAD_MUTEX_INITIALIZER; /* mutex protecting c_line and cars_out */
pthread_mutex_t v_mutex = PTHREAD_MUTEX_INITIALIZER; /* mutex protecting v_line */
pthread_cond_t c_cond[CARS]; /* condition variables for waiting for c_state[i] to change from VISITORS to another value */
pthread_cond_t v_cond[VISITORS]; /* condition variables for visitor waiting to be first in line or ejected from a car */
pthread_cond_t v_car_cond = PTHREAD_COND_INITIALIZER; /* Condition variable for a visitor first in line, but waiting for a car. */
pthread_mutex_t sc[CARS]; /* one mutex per car, sc[i] protects c_state[i] */

int main(){
    int i;
    void *status;
    pthread_t c[CARS];
    pthread_t v[VISITORS];

    srand(time(NULL));

    printf("Jurassic Park is open, cars are being prepped for passengers\n");

    for (i = 0; i < CARS; i++){
        pthread_mutex_init(&sc[i], NULL);
        pthread_cond_init(&c_cond[i], NULL);
    }

    for (i = 0; i < VISITORS; i++){
        pthread_cond_init(&v_cond[i], NULL);
    }

    for (i = 0; i < CARS; i++){
        c_state[i] = VISITORS;
        pthread_create(&c[i], NULL, car, (void *)i);
    }

    for (i = 0; i < VISITORS; i++){
        pthread_create(&v[i], NULL, visitor, (void *)i);
    }

    for (i = 0; i < VISITORS; i++){
        pthread_join(v[i], &status);
    }

    museum_empty++;

    printf("All visitors have left Jurassic Park\n");

    for (i = 0; i < CARS; i++){
        pthread_mutex_lock(&sc[i]);
        c_state[i] = -1;
        pthread_cond_signal(&c_cond[i]);
        pthread_mutex_unlock(&sc[i]);
    }

    for (i = 0; i < CARS; i++){
        pthread_join(c[i], &status);
    }

    printf("Jurrasic Park is closed, all cars have been parked\n");

    pthread_exit(NULL);

    return 0;
}

void *car(void *i)
{
    int c_id = (int) i;
    int v_id;

    while (museum_empty != 1) {

        /* join end of car queue */
        pthread_mutex_lock(&c_mutex);
        c_line[c_id] = set_car_place_in_line();
        if (c_line[c_id] == 1)
            pthread_cond_signal(&v_car_cond);
        printf("Car %d is ready for a passenger and is %d in line                        %d of %d cars are out\n", c_id, c_line[c_id], cars_out, CARS);
        pthread_mutex_unlock(&c_mutex);

        /* wait until occupied */
        pthread_mutex_lock(&sc[c_id]);
        while ((v_id = c_state[c_id]) == VISITORS) {
            pthread_cond_wait(&c_cond[c_id], &sc[c_id]);
        }
        pthread_mutex_unlock(&sc[c_id]);

        if(museum_empty == 1){
            break;
        }

        pthread_mutex_lock(&c_mutex);
        cars_out++;
        printf("Visitor %d is riding in car %d                                           %d of %d cars are out --\n", v_id, c_id, cars_out, CARS);
        pthread_mutex_unlock(&c_mutex);

        /* visitor is on car ride for random amount of time */
        sleep(rand()%5);

        printf("Visitor %d is  done riding in car %d\n", v_id, c_id);

        /* eject visitor from car */
        pthread_mutex_lock(&sc[c_id]);
        c_state[c_id] = VISITORS;
        pthread_cond_signal(&v_cond[v_id]);
        pthread_mutex_unlock(&sc[c_id]);

        pthread_mutex_lock(&c_mutex);
        cars_out--;
        pthread_mutex_unlock(&c_mutex);
    }

    return NULL;
}

void *visitor(void *i)
{
    int v_id = (int) i;
    int next_v;
    int j = 0;
    int car;

    while (j < RIDES) {
        if (j == 0) {
            printf("Visitor %d entered museum and is wandering around\n", v_id);
        } else {
            printf("Visitor %d got back from ride and is wandering around\n", v_id);
        }

        sleep(rand()%3); /* visitor is wandering */

        /* join end of visitor queue */
        pthread_mutex_lock(&v_mutex);
        v_line[v_id] = set_visitor_place_in_line();
        printf("Visitor %d is %d in line for a ride\n", v_id, v_line[v_id]);

        /* wait until first in line */
        while (v_line[v_id] != 1) {
            pthread_cond_wait(&v_cond[v_id], &v_mutex);
        }
        pthread_mutex_unlock(&v_mutex);

        /* wait until there is a car free */
        pthread_mutex_lock(&c_mutex);
        while ((car = get_next_car()) == CARS) {
            pthread_cond_wait(&v_car_cond, &c_mutex);
        }

        /* remove car from car queue */
        move_car_line();
        pthread_mutex_unlock(&c_mutex);

        /* remove self from visitor queue */
        pthread_mutex_lock(&v_mutex);
        move_passenger_line();
        next_v = get_next_passenger();
        if (next_v < VISITORS)
            pthread_cond_signal(&v_cond[next_v]);
        pthread_mutex_unlock(&v_mutex);

        /* occupy car */
        pthread_mutex_lock(&sc[car]);
        c_state[car] = v_id;
        pthread_cond_signal(&c_cond[car]);

        /* wait until not in car anymore */
        /* NOTE: This test must be against v_id and *not* VISITORS, because the car could have been
         * subsequently occupied by another visitor before we are woken. */
        while(c_state[car] == v_id) {
            pthread_cond_wait(&v_cond[v_id], &sc[car]);
        }
        pthread_mutex_unlock(&sc[car]);

        j++;
    }

    printf("Visitor %d leaving museum.\n", v_id);
    return NULL;
}

I hope this is helpful...

caf
  • 233,326
  • 40
  • 323
  • 462
  • I didn't use all of your suggestions, but I used some of them, changed a few other things, and finally got it working right. Thanks for taking the time to go through my code and give such a detailed answer. – ubiquibacon Dec 10 '10 at 03:32
4

You've got a lot of code, so it's unlikely that anyone is going to find all of the bugs for you. However, a few comments:

Mutexes are not semaphores. Several of your for loops in main() are unlocking a mutex that you haven't locked yet. This is almost certainly an error. Conceptually, mutexes could be implemented with semaphores, and you could implement a semaphore with a mutex and condvar, but you're unlocking an unlocked mutex, and this is incorrect.

Each thread should lock a mutex, do some work, and then unlock it. A thread should not unlock a mutex which has been locked by another thread, or preemptively unlock a mutex that it didn't lock. If this works at all, it's a quirk in the implementation you are currently using and not portable to other implementations.

Your second for loop in main locks the same mutex twice in a row. Are you getting past this point in the code? Since you're looping, you're locking it more than you're unlocking it. I wouldn't be surprised if your code stops here. (Sometimes mutexes can be recursive, but pthread mutexes are not by default.)

pthread_cond_signal() is really just an optimization over pthread_cond_broadcast(). Use broadcast until you get your race conditions worked out.

You should initialize all of your mutexes and condvars at the top of main before you start your threads. It's possible that you don't have an error here, but it won't hurt, and it could help.

You might do better if you reduce everything down to a single mutex and a single condvar for the short term. It looks like you're trying to do fine grained locking with everything, but unless you're really careful about the order of your locks, you're going to get race conditions and deadlock.

Really, there is only one pattern/template you should use with mutexes and condvars:

pthread_mutex_lock(...);

// optionally wait for something to be true
while (!some_condition) {
    pthread_cond_wait(...);
}

// make changes for how things should now be
shared_variable = new_value;

// optionally notify the rest of the world of your change
pthread_cond_broadcast(...);

pthread_mutex_unlock(...);

If you have a single mutex and condvar, you should try to make all of synchronization blocks look like that. You can omit the while(...)/wait stuff if you don't need to wait, and you can omit the broadcast if no other threads care about the change you've made, but if your code doesn't roughly look like that for each synchronization block, it's probably a bug.

xscott
  • 2,350
  • 16
  • 18
  • @xscott I appreciate your input, but I addressed every comment you made in the comments of my code, specifically saying why I did thing the way I did. You said "Mutexes are not semaphores" but a mutex can be used as a semaphore. I explained in my code why I locked the loops in my main method the way I did. You also said "A thread should not unlock a mutex which has been locked by another thread" but a thread unlocking a mutex that has BLOCKED another thread is totally valid, and in that case the mutex in question is a semaphore. My code runs, if you run it you will see how far it get. – ubiquibacon Dec 07 '10 at 22:20
  • @typoknig, feel free to ignore my suggestions, but your comments don't explain why you're using mutexes incorrectly, and you apparently do have a problem or two. I think your most likely problem is that you believe pthread mutexes behave like semaphores. I'd suggest that you write some small test programs to check your assumptions, and be sure to suggest the return results on your pthread calls. – xscott Dec 07 '10 at 22:20
  • @xscott, when using pthreads there is no semaphore, there is only mutex. The way you use a mutex can mimic a semaphore. The very first comment in my main method explains why I used mutexes the way I did when creating my threads. Obviously I am using mutexes incorrectly or else my program would not be locking up, but I have not used them in correctly in any of the places you have mentioned. Also if you check the my previous question (linked to in this question) you will see that I did already write a small program to test a small version of this structure. – ubiquibacon Dec 07 '10 at 22:27
  • @typoknig, it's pretty obvious that this is a homework problem and that you're still learning. This makes it a little bit curious to me why you are so sure of yourself. If you look at the man page for pthread_mutex_unlock(), you'll see that there is an error return code for unlocking a mutex which is locked by a different thread. You are using them incorrectly. However, since you aren't very receptive to help, I don't really care if you fail. – xscott Dec 07 '10 at 22:39
  • @xscott it was a homework problem yesterday, but my professor gave us a choice of several problems. Since I was hitting the wall on this one I did another one and turned it in. Still, I have so much time into this that I want to know what is not working correctly. Since we are online I think you are reading me incorrectly, so I will not take offense at your comment about me failing. I am plenty receptive to help, but you have not give me anything to be receptive to. The program runs for several loops before locking up, so obviously the problem is not in the main method. – ubiquibacon Dec 07 '10 at 22:50
  • @xscott using `signal` vs `broadcast` is irrelevant in this case. Initializing mutexs at the top of my code is also irrelevant as it can be done just fine in the main method (I have done it both ways, no change). I have also tried (yesterday) using only one mutex for the entire program and still had a deadlock. That covers all of your suggestions. I would not post a big nasty block of code for others to look at unless I believed I had exhausted all possibilities on my end. Also check my other question and run the WORKING code I have posted there, you will see that I am using your pattern. – ubiquibacon Dec 07 '10 at 22:55
  • I'm going to have to second xscott's suggestion. In your working code in your other question you lock first, then have a matching unlock. In this code, your car() function immediately unlocks the mutex. If it was locked, it was locked or a reason. That may not be the problem, but there are so many lock and unlocks with different mutexes that it's difficult to keep track. Clean up the code. Make it simpler. – SaulBack Dec 07 '10 at 23:15
  • @typoknig, sorry for my snarky comment about failing, but it's really tough to read your tone in these messages. Anyways, I think eventually you'll see that my comments are not irrelevant, but until then, hopefully someone else will give you the answer you're trying to find. Cheers. – xscott Dec 07 '10 at 23:16
  • To expand on xscott's answer, attempting to unlock a mutex that is not locked or was locked by a different thread results in either an error (if it's an error-checking mutex) or **undefined behavior**. In no case is it ever a valid way to unlock the mutex! – R.. GitHub STOP HELPING ICE Dec 07 '10 at 23:24
1

I think you're more comfortable with semaphores. Here's an implementation of semaphores in terms of mutexes and condvars:

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t condvar;
    unsigned long count;
} semaphore_t;

void semaphore_init (semaphore_t* sem, unsigned long count) {
    pthread_mutex_init(&sem->mutex, 0);
    pthread_cond_init(&sem->condvar, 0);
    pthread_mutex_lock(&sem->mutex);
    sem->count = count;
    pthread_mutex_unlock(&sem->mutex);
}

void semaphore_incr (semaphore_t* sem) {
    pthread_mutex_lock(&sem->mutex);
    sem->count++;
    pthread_cond_broadcast(&sem->condvar);
    pthread_mutex_unlock(&sem->mutex);
}

void semaphore_decr (semaphore_t* sem) {
    pthread_mutex_lock(&sem->mutex);
    while (sem->count == 0) {
        pthread_cond_wait(&sem->condvar, &sem->mutex);
    }
    sem->count--;
    pthread_mutex_unlock(&sem->mutex);
}

Maybe if you implement your solution in terms of semaphores, you'll get the results you're after.

xscott
  • 2,350
  • 16
  • 18