c++11 atomic decrement of uint64_t cycles at 0

Question

I'm trying to program a simple parallel processing system which should generate N objects to put in a container in a multi-threaded environment.

To tell the threads when to stop generating the object I created a simple reverse counter that starts at N at run down at 0 with each thread decrementing it in parallel. The counter is supposed to be a uint64_t and I wanted to try the atomic support of C++11. The code looks like this

//Class member
std::atomic<uint_fast64_t> counter

//Parallel function
while(counter-- > 0)
{
do something
}

It compiles correctly and executes but it enters an infinite loop because once the counter reaches 0, it is further decremented but it jumps back to the highest integer available thus never stopping.

Changing the type to int64 instead of uint64 solves the issue but I would like to understand why do I need this workaround at all.

My working hypothesis at the moment is that the decrement is done anyway even when the condition is false so when the first thread check the counter at 0 it decrements it anyway and the operation subtraction operation doesn't really care about the encoding of the integers but performs a simple bitwise operation (I forgot which one exactly but I remember additions and subtractions are done by simple bitwise xor and shifts) which in the next iteration is interpreted as the max uint value. Do you think this explanation is reasonable?

Apart from switching from uint to int one option is to switch the operation from a decrement to an increment but can you think of a different algorithm to keep the decrement in this context?

Edit1

Another possible solution I thought of, even though not particularly elegant is, knowing how many threads are actually launched in parallel, to stop the counter at N_Threads with a start value of Tot+NThreads

//In calling function
counter = Tot+NThreads

//Parallel function
while(counter-- > NThreads)
{
do something
}

Of course it does that, you told it to. If counter is zero, it decrements it and quits. All other threads are now furiously decrementing the now extremely high counter with no hope of making serious headway. — harold, Mar 15 '15 at 18:55
@Moberg: Consider the case of two threads. Both enter the while loop because `counter == 1`. Then, once both are in the loop, they both decrement `counter`. And now `counter == -1`. — Bill Lynch, Mar 15 '15 at 19:53
As Bill Lynch said, if you do the variable evaluation and decrement in two different places you cannot guarantee there was noone else doing the same in the meantime — Triskeldeian, Mar 15 '15 at 21:07
`while(counter-- > NThreads)` is no more performant, and significantly less idiomatic, than `while(counter++ < NThreads)`. — Casey, Mar 16 '15 at 02:56

emvee · Answer 1 · 2015-03-15T19:07:39.263

3

The atomicity only guarantees that all threads see a consistent value of the atomic value. Typically an operation like -- is a read-modify-write operation. The atomic only guarantees that no other thread modifies the counter whilst another thread is busy modifying it.

To clarify: atomic prevents a data race, nothing else.

Suppose two threads, T1 and T2 and the following R, M, W sequence: now thread T2's result has been overwritten by T1's result, i.e. no consistent value of the counter.

T1:Read T2:Read T2:Modify T1:Modify T2:Write T1:Write

So, in your problem, the code executes counter-- which means that the -- will always be done, irrespective of what its value was. So if the value was already zero, it will now be -1 or, when using unsigned data types, the maximum value for the unsigned type.

edited Mar 15 '15 at 19:07

answered Mar 15 '15 at 18:57

emvee

4,371
23
23

`operator--` on an `std::atomic` is supposed to be atomic though, isn't it? The reference I'm reading says so anyway. That doesn't help though, suffers from exactly the same problem. – harold Mar 15 '15 at 19:04
But `atomic` is not the guarantee you need. You need to "only decrement when >0". With _just_ the `atomic` guarantee you can't do that. As explained. – emvee Mar 15 '15 at 19:06
Yea I see now, I read your answer wrong and thought you were saying that timeline was the problem, instead of the thing not-happening.. – harold Mar 15 '15 at 19:10
But that guarantee is actually enough, right? I mean, you could do `counter-- < very_large_value` (with the large value small enough that it can't be reached by the threads over-decrementing the counter) – harold Mar 15 '15 at 19:13
No, because the decrement is done _unconditionally_. Only after the decrement the value is inspected and the thread's decision to `do something` is made. – emvee Mar 15 '15 at 19:16
Doesn't matter that it's done unconditionally. It will still be more than `very_large_value`. – harold Mar 15 '15 at 19:17
The atomicity only promises that _one_ thread will see the "zero-after-decrement" and thus only _that_ thread will decide not to do anything. All other threads keep on going `doing something` because they see a non-zero value after their decrement has executed. – emvee Mar 15 '15 at 19:20
The other threads will see a value larger than `very_large_value` and quit. – harold Mar 15 '15 at 19:22
It would only postpone the problem because the problem is _in the construction of the code_: it doesn't matter if it was `> value` or `<= value` or the actual number in `value`. Doing it like this is just not multi-thread safe. – emvee Mar 15 '15 at 19:24
The only way to break that is by having billions of billions of threads, because each of them can only do one spurious decrement. That's not reasonable in real life. – harold Mar 15 '15 at 19:26
the `very_large_value` would work but it is a much cryptic syntax than just inverting the counting and go from 0 to the value of elements one wants to create – Triskeldeian Mar 15 '15 at 19:28
harold comment makes me thinking about another possible solution, which I included in the edited question – Triskeldeian Mar 15 '15 at 19:33

Bill Lynch · Answer 2 · 2015-03-15T21:56:45.093

Your Hypothesis

It's more or less correct. (unsigned) 0 - 1 is going to be the largest unsigned integer. And your atomic decrement will always occur, even if the condition is false.

How do we fix this?

I believe that you are looking for something like this actually:

std::atomic<uint_fast64_t> counter;
while (true) {
    uint_fast64_t cur = counter;
    if (cur == 0)
        break;
    if (counter.compare_exchange_strong(cur, cur - 1) == false)
        continue;

    ... // Perform work
}

First, we test if the current value of the counter is 0. If it is, we're done working and we should just quit.

If it was greater than 0, then we need to decrement the counter. And that's a compare and swap operation. So if the value hasn't changed in the amount of time it takes to get to that second atomic operation, we perform the decrement and then do some work. If we were preempted, then we'll just try again.

Yes, this is actually very interesting. I didn't noticed the compare function. I just started using the atomic header. Thanks. If I had enough reputation I would vote it up :) — Triskeldeian, Mar 15 '15 at 21:10
@Triskeldeian: And just in case you don't believe that this is valid, here's the (more or less) equivalent code in the linux kernel: http://lxr.free-electrons.com/source/include/linux/atomic.h#L97 — Bill Lynch, Mar 15 '15 at 21:31

c++11 atomic decrement of uint64_t cycles at 0

2 Answers2

Your Hypothesis

How do we fix this?