When it comes to implementing a CAS loop with std::atomic, cppreference (in this link) gives the following example for push:
template&lt;typename T&gt;
class stack
{
    std::atomic&lt;node&lt;T&gt;*&gt; head;
public:
    void push(const T&amp; data)
    {
        node&lt;T&gt;* new_node = new node&lt;T&gt;(data);
        new_node-&gt;next = head.load(std::memory_order_relaxed);
        while (!head.compare_exchange_weak(new_node-&gt;next, new_node,
                                           std::memory_order_release,
                                           std::memory_order_relaxed /* Eh? */));
    }
};
Now, I don't understand why std::memory_order_relaxed is used for the failure case. As far as I understand, compare_exchange_weak (the same goes for -strong, but I'll use the weak version for convenience) acts as a load operation on failure, which means it may load a value stored by a successful CAS in another thread with std::memory_order_release, and thus it should use std::memory_order_acquire to be synchronized-with that store instead...?
while (!head.compare_exchange_weak(new_node-&gt;next, new_node,
                                   std::memory_order_release,
                                   std::memory_order_acquire /* There you go! */));
What if, hypothetically, the relaxed load on failure keeps observing one of the old values, so the CAS fails again and again and the loop spins for extra time?
The following rough picture shows where my brain is stuck. Shouldn't a store from T2 be visible to T1 (by having a synchronizes-with relation between them)?
So to sum up my question:

- Why std::memory_order_relaxed at failure, instead of std::memory_order_acquire?
- What makes std::memory_order_relaxed sufficient?
- Does std::memory_order_relaxed at failure mean (potentially) more looping?
- Likewise, does std::memory_order_acquire at failure mean (potentially) less looping (setting aside its performance cost)?