So I am trying to implement writer and reader methods in such a way where I always want the reader to read the value that was last written by the writer, and thus have reads ordered after writes. For this specific example, I am trying to use a very simple fixed size queue, that has two pointers, _enqPtr
and _deqPtr
. These are both std::atomic<size_t>
that represent the next index an element is enqueued or dequeued at. My full code I am using is below
class Queue {
public:
Queue(size_t size): _enqPtr(0),
_deqPtr(0),
_size(size),
_data(new std::atomic<long>[_size])
{
for (int i = 0; i < _size; ++i)
{
_data[i].store(0);
}
}
bool enqueue(long val) {
size_t place = _enqPtr.fetch_add(1, std::memory_order_relaxed);
if (place >= _size)
return false;
_data[place].store(val, std::memory_order_release); // DESIRED TO BE BEFORE
return true;
}
bool dequeue(long& val) {
size_t place = _deqPtr.fetch_add(1, std::memory_order_relaxed);
size_t enq = _enqPtr.load();
if (place >= _size)
return false;
if (place >= enq) {
// went too far
_deqPtr.store(enq - 1);
return false;
}
val = _data[place].load(std::memory_order_acquire); // DESIRED TO BE AFTER
assert(val != 0); // keeps catching this, despite acquire / release.
return true;
}
private:
std::atomic<size_t> _enqPtr;
std::atomic<size_t> _deqPtr;
size_t _size;
std::atomic<long>* _data;
};
The fetch_add
calls facilitate that every thread that calls the enqueue
or dequeue
gets it's own index. However, the issue i am running into is it seems that , one thread calls enqueue
and calls fetch_add
, but before it can store it's value, another thread calls dequeue
and completes the call, and thus reads a null or non-value (in this case zero).
The threading code i am using to test this is below:
class QueueTester {
public:
QueueTester(size_t queueSize, size_t threadCount):
_threadCount(threadCount),
_threads(new std::thread[_threadCount]),
_queue(queueSize)
{}
void run() {
for (int i = 0; i < _threadCount; ++i)
{
if (i % 2 == 0) {
_threads[i] = std::thread([this]{
long k = 1;
while (_queue.enqueue(k)) {
k = k == 0 ? 1 : k + 1;
}
});
} else {
_threads[i] = std::thread([this]{
long k = 1;
while (_queue.dequeue(k));
});
}
}
for (int i = 0; i < _threadCount; ++i)
{
_threads[i].join();
}
}
private:
size_t _threadCount;
std::thread* _threads;
Queue _queue;
};
What do i need to adjust in terms of memory ordering and fences such that val = _data[place].load(std::memory_order_acquire);
will always load after / from _data[place].store(val, std::memory_order_release);
?