0

I'm writing a naive implementation of a single producer multiple consumer buffer with pthreads and condition variables, using C++ list as a buffer. I'm not too much worried about how fast my code runs, I just want to get rid of the errors.

The producer thread reads a string from a file and insert it into the end of a buffer while each one of the consumers reads from the beginning and put it on a different matrix. So, essencialy, I have a FIFO queue which has a max size and the first element can only be erased when all the consumers have already read it.

Here's the important part of the three functions of my code:

PRODUCER:

void *feedBuffer(void *threadproducer){
//some declarations...
while(!file->eof())
{   
    pthread_mutex_lock(&mutex);

    while(*buffer_current_size == buffer_max_size) { // full
        // wait until some elements are consumed
        pthread_cond_wait(&can_produce, &mutex);
    }

    pthread_mutex_lock(&lock_buffer); 
        *file >> temp.word;
        buffer->push_back(temp);
        (*buffer_current_size)++;
    pthread_mutex_unlock(&lock_buffer);

    pthread_cond_broadcast(&can_consume);
    pthread_mutex_unlock(&mutex);
}
file->close();
pthread_cond_broadcast(&can_consume);

pthread_mutex_lock(&lock_buffer);
    buffer_current_size->store(-1); //END OF READ SIGNAL
pthread_mutex_unlock(&lock_buffer);
pthread_exit(NULL);
}

BUFFER CONTROLLER AND WORKER THREADS CALLER:

void *main_consumer(void *threadconsumer) //consumer caller and buffer controll
{  
//some declarations...
for(int j=0; j<NUMTHREADS; j++)
{  
    pthread_create(&threads[j],&attr,worker,(void *) &workerargs[j]);
}

//BUFFER CONTROLLER
pthread_mutex_lock(&lock_buffer);

while(*buffer_current_size!=-1){ //WHILE READ HASN'T ENDED

    pthread_mutex_unlock(&lock_buffer); //UNLOCK AND LOCK AGAIN TO LET OTHER THREADS HOLD THE LOCK FOR A WHILE
    pthread_mutex_lock(&lock_buffer);

    it=buffer->begin(); //GET FIRST ELEMENT OF THE BUFFER
    if(it->cnt == NUMTHREADS){
        buffer->pop_front(); //DELETE FIRST ELEMENT
        (*buffer_current_size)--; //DECREASE SIZE

        pthread_cond_signal(&can_produce); //PRODUCER CAN PRODUCE
    }
}
pthread_mutex_unlock(&lock_buffer);
for(int i=0; i<NUMTHREADS; i++)
{
    pthread_join(threads[i],NULL);
}
}

WORKER:

void *worker(void *threadwoker)
{
//some declarations...
pthread_mutex_lock(&lock_buffer); //LOCK TO GET BEGIN
it=buffer->begin();
while(!(*buffer_current_size==-1 && it==args->buffer->end())) {
    pthread_mutex_unlock(&lock_buffer);
    //insert into matrix...

    pthread_mutex_lock(&lock_buffer); //UNIFIED LOCK FOR IT AND CNT, SOLVING ISSUE
        (it->cnt)++;
        it++;
    pthread_mutex_unlock(&lock_buffer);

    pthread_mutex_lock(&mutex);
    while (*buffer_current_size==0) { //WAIT IF BUFFER EMPTY

        pthread_cond_wait(&can_consume, &mutex);
    }
    pthread_mutex_unlock(&mutex);


    pthread_mutex_lock(&lock_buffer); //LOCKING FOR WHILE ARGUMENTS
}
pthread_mutex_unlock(&lock_buffer);
pthread_exit(NULL);
}

As you can see, I used an int counter on each element of the buffer to check if all the worker threads have already read it. When this condition becomes true, the buffer controller erases the first element from the queue. All bounded by locks, to guarantee data integrity.

The problem is, this code doesn't work, I either get a seg fault or mutex error. Can anyone enlighten with any ideas why?

  • Try with a simpler version without the limit on buffer size (get rit of can_produce). Once that works you can add that section – marom Aug 03 '15 at 05:54
  • Without the can_produce condition variable it works just fine on a test database. But, since I'm dealing with big amounts of data (3.5gb plain text files), I can't have an unlimited buffer. – Victor Jorge Aug 03 '15 at 07:26

1 Answers1

1

Firstly, it is not clear exactly what data structures are being protected by each mutex. I suggest that for the initial implementation at least, you simplify down to one mutex protecting all of the shared state - that is the buffer size counter, the buffer itself, and the counter in the work items.

As for specific issues:

  • the Producer should re-test the condition after pthread_cond_wait() (it should be a while () loop rather than an if () statement);
  • when the Producer finishes, it accesses *buffer_current_size without a lock held and doesn't signal any waiting consumers;
  • the Buffer Controller accesses *buffer_current_size without a lock held;
  • the Worker accesses buffer->begin() and buffer->end() without a lock held;
  • the Worker accesses *buffer_current_size without a lock held;
  • the Worker calls pthread_mutex_trylock(&mutex) without checking the result, which means it could access shared state and unlock the mutex without having it locked;
  • the Worker needs to re-check the condition it's waiting for after calling pthread_cond_wait();
  • the Worker accesses the iterator it without a lock held, which is problematic because of other threads modifying the underlying ::list- since this thread has already incremented the counter, the Buffer Controller could have already deleted the item that the iterator points to, which means you can't increment the iterator.
caf
  • 233,326
  • 40
  • 323
  • 462
  • Ty for your detailed answer, it helped me a lot! I exchanged the if's for while loops, and implemented buffer_current_size as an atomic_int, so that I don't need a lock to access it. Also, I added a lock in all buffer operations. About the it on the worker, since it's a FIFO ::list, and I only increment the iterator, I think it's impossible to get to an erased element. But my code still doesn't work.. =/ – Victor Jorge Aug 03 '15 at 22:35
  • @VictorJorge: Atomic accesses aren't appropriate for the buffer size, since you want to signal changes to it with a condition variable. It's not just modifying the variable itself that needs to be atomic - the sequence *"check if buffer_size is what I need, if not wait on the condition variable"* needs to be atomic with respect to modifications to the variable. Otherwise, you can miss wakeups. – caf Aug 04 '15 at 00:47
  • @VictorJorge: The way you end up at an erased element is if the element gets erased *while `it` refers to it*, in between incrementing `it->cnt` and advancing the iterator with `it++`. – caf Aug 04 '15 at 00:49
  • I updated the code from the initial post to a newer version with the enhancements you suggested. And I've found out that the buffer->pop_front() is what is leading my code to a seg fault, but I still don't know why. When I get rid of that line, the code works pefectly fine. If you don't mind taking another look, I'd appreciate it. – Victor Jorge Aug 04 '15 at 07:34
  • @VictorJorge: You should still simplify down to one mutex - at the moment `mutex` protects `buffer_current_size` but the Buffer Controller doesn't lock that mutex before modifying that variable. What stops the Buffer Controller running off the end of the list if the Producer doesn't set `*buffer_current_size` to `-1` quickly enough? – caf Aug 05 '15 at 12:03