How to use std::atomic<> effectively for non-primitive types?

Question

The definitions for std::atomic<> seem to show its obvious usefulness for primitive or perhaps POD-types.

When would you actually use it for classes?

When should you avoid using it for classes?

bames53 · Accepted Answer · 2015-11-03T16:43:15.177

35

The operations std::atomic makes available on any trivially copyable type are pretty basic. You can construct and destroy atomic<T>, you can ask if the type is_lock_free(), you can load and store copies of T, and you can exchange values of T in various ways. If that's sufficient for your purpose then you might be better off doing that than holding an explicit lock.

If those operations aren't sufficient, for example if you need to atomically perform a sequence of operations directly on the value, or if the object is large enough that copying is expensive, then you would probably want to hold an explicit lock instead. You manage that lock yourself to achieve your more complex goals or to avoid all the copies that using atomic<T> would involve.

// non-POD type that maintains an invariant a==b without any care for
// thread safety.
struct T { int b; };
struct S : private T {
    // b lives in the base class, so it is initialized through T
    S(int n) : T{n}, a{n} {}
    void increment() { a++; b++; }
private:
    int a;
};

std::atomic<S> a{{5}}; // global variable

// how a thread might update the global variable without losing any
// other thread's updates.
S s = a.load();
S new_s = s; // S has no default constructor, so start from a copy
do {
    new_s = s;
    new_s.increment(); // whatever modifications you want
} while (!a.compare_exchange_strong(s, new_s));

As you can see, this basically gets a copy of the value, modifies the copy, then tries to copy the modified value back, repeating as necessary. The modifications you make to the copy can be as complex as you like, not simply limited to single member functions.

edited Nov 03 '15 at 16:43

answered Dec 14 '12 at 21:22

bames53

1

+1 for a specific use case - so making S atomic is effectively like putting mutex locks on all methods of S? const and non-const methods? – kfmfe04 Dec 16 '12 at 18:22
4

@kfmfe04: You need to call a.load() to get your S, and after that you are unguarded and each method call is not guarded. All you're getting is load/store into 'a'. – VoidStar Dec 18 '14 at 22:38
2

@kfmfe04 It's not like a mutex on each individual method. For example you can call multiple methods and apply the results as a single atomic transaction. What's going on is that you get a local, non-shared copy, you modify the local copy however you like, and then you attempt to copy the modified data back into the shared variable. – bames53 Dec 19 '14 at 06:13
@pocketbroadcast had the following comment: "I think this example is not doing what you intended! It possibly increments outdated states of the s.a and s.b since no load is done during the loop. Thus two independent threads incrementing on global std::atomic a{{5}}; (lets call them a1 for thread 1 and a2 for thread 2) could possibly lead to a1 == 6, and an endless loop in t2, though the intended behaviour would be a == 7 after both threads executed the loop." – bames53 Nov 03 '15 at 15:34
2

The loop _does_ load the value: `compare_exchange_strong` either succeeds in updating the atomic, or it replaces the value in the 'expected' argument with the newly observed value so that you know what to expect next time around. So if the exchange fails, the new value is loaded into `s`, then the loop copies the new value, makes its change again which may have a different result from the previous iteration, and attempts to store the new value. – bames53 Nov 03 '15 at 15:39
1

@pocketbroadcast Thanks for the `!` correction. The `memcpy` requirement is already referred to (in the first sentence as '[trivially copiable](http://en.cppreference.com/w/cpp/types/is_trivially_copyable)'). I think it's better to leave discussion of the specifics of `compare_exchange` to more complete documentation, since there's more to it than just the load. For example [here's](http://stackoverflow.com/questions/21879331/is-stdatomic-compare-exchange-weak-thread-unsafe-by-design) some discussion of an issue with `compare_exchange` that even some experts took time to fully grasp. – bames53 Nov 03 '15 at 17:12

Pete Becker · Answer 2 · 2012-12-14T20:46:45.217

14

It works for primitive and POD types. The type must be memcpy-able, so more general classes are out.

edited Dec 14 '12 at 20:46

answered Dec 14 '12 at 20:17

Pete Becker

13

All that's required seems to be that the type is trivially copyable. POD types are stricter than that, so many non-POD types can be used with `atomic`. – bames53 Dec 14 '12 at 20:41
1

The copy constructor of your type must be `noexcept` because the `std::atomic` constructor given an initial value is `noexcept` but is passed the initial value by-value. – Raedwald Dec 05 '17 at 23:01

Johan Lundberg · Answer 3 · 2012-12-14T20:49:54.173

7

The standard says that

Specializations and instantiations of the atomic template shall have a deleted copy constructor, a deleted copy assignment operator, and a constexpr value constructor.

I'm not sure whether that is strictly the same as the answer by Pete Becker. I read it as saying that you are free to specialize atomic for your own class (not only for memcpy-able classes).

edited Dec 14 '12 at 20:49

answered Dec 14 '12 at 20:34

Johan Lundberg

2

I think you meant to quote paragraph 1 instead: "There is a generic class template `atomic`. The type of the template argument T shall be trivially copyable (3.9)," because paragraph 3, the paragraph you quote, doesn't say the same thing or even specify any requirements on the types you may use with the generic class template `atomic`. – bames53 Dec 14 '12 at 20:38
@bames53 actually, the question is unclear if it's about the generic `atomic` class or the interface it provides. You could still re-use the interface by providing your own specialization. – KillianDS Dec 14 '12 at 20:44
Hm. No, I did not copy the wrong paragraph. I didn't interpret that first paragraph as clearly being demanded also for specializations. – Johan Lundberg Dec 14 '12 at 20:44
@JohanLundberg - no, it's not the same. That's a constraint on implementations, not on the types that the template is instantiated with. – Pete Becker Dec 14 '12 at 20:48
@PeteBecker, are you claiming that I cannot specialize on a non-memcpyable class? – Johan Lundberg Dec 14 '12 at 20:51
2

The implementation of `atomic` in the standard library requires that `T` be `memcpy`able; that's how `std::atomic` copies values. The reason for that is to avoid calling out into user code through an assignment operator, since that could lead to deadlock. – Pete Becker Dec 14 '12 at 20:54
What do you mean by 'in the standard library'? Is that a yes on my question then? If I specialized it would not be using the version from the standard library anymore... – Johan Lundberg Dec 14 '12 at 20:57

pocketbroadcast · Answer 4 · 2015-11-03T16:03:33.267

I'd prefer std::mutex for this kind of scenario. Nevertheless, I tried a poor man's benchmark comparing a version using std::atomic with one using std::mutex in a single-threaded (and thus perfectly synchronized) environment.

#include <atomic>
#include <chrono>
#include <iostream>
#include <mutex>

std::mutex mux;
int i = 0;
int j = 0;

// Mutex version: both increments happen under one lock.
void a() {
    std::lock_guard<std::mutex> lock(mux);
    i++;
    j++;
}

struct S {
    int k = 0;
    int l = 0;

    void doSomething() {
        k++;
        l++;
    }
};

std::atomic<S> s;

// Atomic version: copy, modify the copy, retry until the exchange succeeds.
void b() {
    S tmp = s.load();
    S new_s;
    do {
        new_s = tmp;
        //new_s.doSomething(); // whatever modifications you want
        new_s.k++;
        new_s.l++;
    } while (!s.compare_exchange_strong(tmp, new_s));
}

int main() {

    std::chrono::high_resolution_clock clock;

    auto t1 = clock.now();
    for (int cnt = 0; cnt < 1000000; cnt++)
        a();
    auto diff1 = clock.now() - t1;

    auto t2 = clock.now();
    for (int cnt = 0; cnt < 1000000; cnt++)
        b();
    auto diff2 = clock.now() - t2;

    auto total = diff1.count() + diff2.count();
    auto frac1 = (double)diff1.count() / total;
    auto frac2 = (double)diff2.count() / total;

    // Share of the total run time spent in each version.
    std::cout << "mutex: " << frac1 << ", atomic: " << frac2 << '\n';
}

On my system the version using std::mutex was faster than the std::atomic approach. I think this is caused by the additional copying of the values. Further, if used in a multithreaded environment, the busy looping can affect performance too.

Summing up: yes, it is possible to use std::atomic with various POD types, but in most cases std::mutex is the weapon of choice, as it is intuitively easier to understand what is going on, and it is therefore not as prone to bugs as the std::atomic version presented here.

atomic gives you memory ordering though: http://www.cplusplus.com/reference/atomic/memory_order/ — Andrew, Feb 18 '17 at 04:50
