
After reading up on C++0x's atomics, in combination with lock-free queues, I decided to have a go at playing with them.

The idea was to write a single-producer, multiple-consumer queue with optimistic locking. The messages do not need to be consumed; skipping is perfectly fine, as long as a consumer that does read either gets the latest version or knows that its read was bad.

In the code below the strategy I had in mind fails: the data gets corrupted because it is written out of order. Any pointers on why this happens, and how to fix it, would be greatly appreciated.

Compilation on Linux: g++ -std=c++0x -o code code.cpp -lpthread

Thanks, Dennis

//
// This features 2 threads in which the first writes to a structure
// and the second tries to read from that with an optimistic
// locking strategy. The data is equal to the versioning so we can 
// see if the data is corrupt or not.
//
// @since: 2011-10-28
// @author: Dennis Fleurbaaij <mail@dennisfleurbaaij.com>
//

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <atomic>
#include <sched.h>
#include <assert.h>
#include <iostream>
#include <xmmintrin.h>

struct structure_t
{
    std::atomic<unsigned int> id;
    unsigned int data_a;
    unsigned int data_b;

    char _pad[ 64 - 12 ];
};

#define NUM_STRUCTURES 2
struct structure_t structures[NUM_STRUCTURES];

std::atomic<size_t> current_version_index;

volatile bool start = false;
volatile bool run = true;
size_t const iter_count = 10000000;

/**
 * Write thread
 */
void* writer(void*)
{
    while(!start)
        sched_yield();

    size_t i;
    for( i=0 ; i<iter_count ; ++i )
    {
        size_t index = current_version_index.load(std::memory_order_relaxed);
        size_t next_index = ( index + 1 ) & ( NUM_STRUCTURES - 1 );

        structures[next_index].data_a = i;
        structures[next_index].data_b = i;

        structures[next_index].id.store(i, std::memory_order_release);

        current_version_index.store(next_index);

        //std::cout << "Queued - id: " << i << ", index: " << next_index << std::endl;
        //sleep(1);
    }

    run = false;
    return NULL;
}

/**
 * Read thread
 */
void* reader(void*)
{
    while(!start)
        sched_yield();

    unsigned int prev_id=0;

    while(run)
    {
        size_t index = current_version_index.load(std::memory_order_relaxed);
        unsigned int id = structures[index].id.load(std::memory_order_acquire);

        if( id > prev_id )
        {
            unsigned int data_a = structures[index].data_a;
            unsigned int data_b = structures[index].data_b;

            // Re-read the data and check optimistic lock. This should be read after 
            // the lines above and should not be optimized away.
            //
            // This is what fails after a while:
            // Error in data. Index: 0, id: 24097, id2: 24097, data_a: 24099, data_b: 24099
            unsigned int id2 = structures[index].id.load(std::memory_order_acquire);
            if( id2 > id )
            {
                continue;
            }

            if( id != id2 ||
                id != data_a ||
                id != data_b )
            {
                std::cerr << "Error in data. Index: " << index << ", id: " << id 
                          << ", id2: " << id2 << ", data_a: " << data_a << ", data_b: " << data_b << std::endl;

                exit(EXIT_FAILURE);
            }

            //std::cout << "Read. Index: " << index << ", id: " << id 
            //              << ", data_a: " << data_a << ", data_b: " << data_b << std::endl;

            prev_id = id;
        }

        _mm_pause();
    }
    return NULL;
}

/**
 * Main
 */
int main (int argc, char *argv[])
{
    assert( sizeof(structure_t) == 64 );

    pthread_t write_thread, read_thread;
    pthread_create(&write_thread, NULL, writer, (void*)NULL);
    pthread_create(&read_thread, NULL, reader, (void*)NULL);

    sleep(1);

    start = true;

    void *status;
    pthread_join(read_thread, &status);
    pthread_join(write_thread, &status);
}
  • I am not sure that this does not belong on Code Review rather than here. At any rate, you should state in plain English the strategy that you are aiming to implement, so that it can be compared with what is implemented. – David Rodríguez - dribeas Oct 28 '11 at 08:43
  • I hoped I made that clear with the text preceding it. Can you tell me what you're looking for in terms of design besides what I wrote? "The idea was to write a single producer, multiple consumer queue with optimistic locking." Maybe an addition is in place: the messages do not need to be consumed; skipping is perfectly fine, as long as a consumer that does read either gets the latest version or knows that its read was bad. – DennisFleurbaaij Oct 28 '11 at 08:58
  • I'm not up to speed with c++11 atomics yet; possibly relevant questions: the padding of the `struct` suggests that you want the struct to be aligned? How are you making sure that it is aligned? Does it matter for correctness? – sehe Oct 28 '11 at 09:16
  • @sehe: That looks like trying to get the object to take exactly one cache line to avoid *false sharing*, and you are right in that he should use `alignas` to actually enforce it (i.e. as it is, it guarantees that `structures[0].id` and `structures[1].id` are in different cache lines, but not that `structures[0].data_b` and `structures[1].id` are in different cache lines, possibly causing *false sharing*). – David Rodríguez - dribeas Oct 28 '11 at 10:41
  • @sehe & David: Indeed, it's there to cache-align the struct to prevent false sharing. It could have been removed from the example. Thanks for the tip on `alignas`; I didn't see that mentioned anywhere else when reading about false sharing (and I admit I was wondering how exactly that worked). – DennisFleurbaaij Oct 28 '11 at 11:20
  • @sehe: Correctness should not be influenced by the padding. – DennisFleurbaaij Oct 28 '11 at 11:22
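For reference, the `alignas` enforcement mentioned in the comments above could look roughly like the sketch below. This is only an illustration assuming 64-byte cache lines; it is not part of the posted code, and compiler support for `alignas` may vary.

    // Hypothetical variant of structure_t: alignas(64) forces every array
    // element onto its own (assumed 64-byte) cache line, so no manual
    // padding arithmetic is needed and false sharing is still avoided.
    struct alignas(64) structure_t
    {
        std::atomic<unsigned int> id;
        unsigned int data_a;
        unsigned int data_b;
    };

    static_assert(sizeof(structure_t) == 64, "one cache line per element");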

1 Answer


Perhaps this is the bug:

    structures[next_index].data_a = i;
    structures[next_index].data_b = i;

    // **** The writer may be interrupted (preempted) here for a long time ****
    // Meanwhile the reader can already see the new data but still the old id
    // (it doesn't matter how many times it re-reads the id).

    structures[next_index].id.store(i, std::memory_order_release); // too late!

(The reader may already have picked up current_version_index in an earlier iteration, so this race condition is actually possible.)
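One common way to close this window, sketched below rather than taken from the answer, is a seqlock-style scheme: the writer bumps a per-slot sequence counter to an odd value before touching the payload and back to an even value afterwards, so a reader can tell that a write was in flight or happened underneath it. The names slot_t, write_slot and read_slot are made up for illustration; strictly speaking the payload fields should themselves be relaxed atomics to avoid a formal data race, but they are kept plain here to mirror the original struct.

    #include <atomic>

    struct slot_t
    {
        std::atomic<unsigned int> seq; // even: stable, odd: write in progress
        unsigned int data_a;
        unsigned int data_b;
    };

    // Single writer: mark the slot dirty before touching the payload,
    // publish with a release store afterwards.
    void write_slot(slot_t& s, unsigned int value)
    {
        unsigned int v = s.seq.load(std::memory_order_relaxed);
        s.seq.store(v + 1, std::memory_order_relaxed);        // now odd
        std::atomic_thread_fence(std::memory_order_release);  // keep the odd store ahead of the data stores

        s.data_a = value;
        s.data_b = value;

        s.seq.store(v + 2, std::memory_order_release);        // even again: publish
    }

    // Reader: give up (or retry) whenever the counter is odd, or has changed
    // while the payload was being copied out.
    bool read_slot(slot_t& s, unsigned int& out_a, unsigned int& out_b)
    {
        unsigned int v1 = s.seq.load(std::memory_order_acquire);
        if (v1 & 1)
            return false;                                     // write in progress, skip

        out_a = s.data_a;
        out_b = s.data_b;

        std::atomic_thread_fence(std::memory_order_acquire);  // keep the data reads before the re-check
        unsigned int v2 = s.seq.load(std::memory_order_relaxed);
        return v1 == v2;                                      // false: torn read, skip or retry
    }

In the original code this would mean giving each element of structures its own counter and checking it for both oddness and equality on the reader side, instead of comparing id values across slots.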

user396672
  • You're absolutely right! I thought of this as well, but expected it would produce the inverse of what I saw (a higher id), which is obviously wrong. Thanks again! – DennisFleurbaaij Oct 28 '11 at 19:17