
The whole point of using std::atomic and not mutexes is to get:

  1. higher performance for multithread code (no contention between readers);
  2. smaller performance degradation under heavy contention (a retry on a failed RMW is less drastic than losing the rest of a time slice because the thread holding the mutex is ready to run but not running);
  3. the ability to communicate with signal handlers.

When the atomicity of the operation is "emulated" with a table of mutexes:

  1. The performance will at best be as good as that of a user mutex, for the case where exactly one modifying operation is needed; when multiple operations are used in sequence, multiple lock/unlock operations must occur, making the code less efficient.
  2. Performance will be no more predictable than with an explicit user mutex.
  3. Such "emulated" atomicity cannot be used in code that interrupts other code (e.g. a signal handler): if the interrupted code holds the lock, the handler deadlocks.

So why was such poor emulation of atomic CPU operations found worthwhile? What's the use case of the non-lock-free fallback mechanism in std::atomic?

Peter Cordes
curiousguy
  • Re, "The whole point of `std::atomic`..." That's not the point at all. `std::atomic` (or std::anythingelse, for that matter) does not enable you to achieve anything that you couldn't have achieved long before it was invented. The whole point though is that now, there's a _standard way_ to get the behavior you need without having to know the details of any particular platform or operating system. You say, _the_ CPU, but which CPU are you talking about? C++ is implemented on _many_ different CPUs, and what one CPU can do "atomically" may need to be implemented with locks on a different CPU. – Solomon Slow Dec 15 '19 at 02:40
  • @SolomonSlow So the point of `std::atomic` is to support... exactly what I wrote? – curiousguy Dec 15 '19 at 02:44
  • *"emulated" with* a *global mutex:* Real implementations use a hash-table of spinlocks, not a *single* mutex like your phrasing might imply. [Where is the lock for a std::atomic?](//stackoverflow.com/q/50298358). But yes, read performance doesn't scale with multiple readers of the *same* object, only of separate objects. And good point; it's not safe to use in a signal handler so there are qualitative differences. – Peter Cordes Dec 15 '19 at 05:25
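A hypothetical sketch of the lock-table scheme the comments describe: a small array of locks indexed by hashing the object's address. Real implementations (e.g. GCC's libatomic) use spinlocks and different sizing; all names here are invented for illustration.

```cpp
#include <cstdint>
#include <mutex>

// Fixed-size table of locks; the size is an arbitrary choice here.
static std::mutex lock_table[64];

// Map an object's address to one lock in the table.
static std::mutex& lock_for(const void* addr) {
    auto h = reinterpret_cast<std::uintptr_t>(addr);
    return lock_table[(h >> 4) % 64];  // drop low bits, index into the table
}

template <typename T>
T emulated_load(const T* obj) {
    std::lock_guard<std::mutex> guard(lock_for(obj));
    return *obj;  // a plain load, made "atomic" only by the lock
}

template <typename T>
void emulated_store(T* obj, const T& val) {
    std::lock_guard<std::mutex> guard(lock_for(obj));
    *obj = val;
}
```

Because distinct objects usually hash to distinct locks, contention is lower than with one global mutex, but two readers of the *same* object still serialize, and a signal handler calling `emulated_load` can deadlock against the thread it interrupted.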

1 Answer


Sometimes you have to write code that can work on multiple platforms, and atomic operations might be supported without locks on some platforms but not on others. Using std::atomic gives you the best of both worlds -- optimum performance where the platform can support it and sane behavior where it can't. A side benefit is cleaner semantics and less risk of inadvertently holding the lock for more or less time than intended.

David Schwartz
  • @PeterCordes: There are many tasks that will be possible on implementations that are aware of and use an execution environment's supported atomic operations, but would be unachievable using any kind of "emulation". Among other things, if an environment's ABI specifies a convention for performing atomic 64-bit updates and two C implementations both follow it, operations performed in code processed by one implementation will be atomic with respect to those performed in code processed by the other. If one or both implementations "emulate" atomic operations, however, that won't work. – supercat Nov 01 '21 at 22:14