
The current C++0x draft states in sections 29.3.9 and 29.3.10 (pages 1111-1112) that, in the following example:

// Thread 1
r1 = y.load(memory_order_relaxed);
x.store(1, memory_order_relaxed);

// Thread 2
r2 = x.load(memory_order_relaxed);
y.store(1, memory_order_relaxed);

The outcome r1 = r2 = 1 is possible, since the operations in each thread are relaxed and access unrelated addresses. My question is about the possible outcomes of the following, similar example:

// Thread 1
r1 = y.load(memory_order_acquire);
x.store(1, memory_order_release);

// Thread 2
r2 = x.load(memory_order_acquire);
y.store(1, memory_order_release);

I think that in this case the outcome r1 = r2 = 1 is not possible. If it were possible, the load of y would have read the value written by the store to y, so that store would synchronize-with (and thus happen-before) the load. Similarly for x, the store to x would happen-before the load of x. But the load of y is sequenced before (and thus also happens-before) the store to x, and likewise the load of x is sequenced before the store to y. This creates a cyclic happens-before relation, which I think is not allowed.

janneb
Giovanni Funchal
  • I changed the title, as the issue per se has nothing to do with speculative stores. For speculative stores, see http://stackoverflow.com/questions/2001913/c0x-memory-model-and-speculative-loads-stores – janneb May 26 '10 at 12:21
  • Store speculation is the keyword here because the outcome `r1=r2=1` requires the stores to be reordered ("speculated") before both reads. Your title is too vague. – Giovanni Funchal May 26 '10 at 12:43
  • Speculative store in the context of the C++0x working papers refers to compiler speculation, see the question I linked to in my previous comment. Your question has to do with reordering that the hardware does (depending on the shared memory consistency model the hardware architecture implements), and how C++0x provides facilities to constrain this memory re-ordering by issuing various memory barrier instructions. Thus I feel that the title I provided is more appropriate than the original one; but hey, it's your question so feel free to change it to whatever you wish. – janneb May 26 '10 at 12:53

1 Answer


If we take time (or the instruction sequence, if you like) to flow downward, just like reading code, then my understanding is that

  • An acquire operation acts as a one-way fence: other memory accesses may move downwards past it, but not upwards past it
  • A release operation acts as a one-way fence in the other direction: other memory accesses may move upwards past it, but not downwards past it

In other words, if you have code like

acquire
// other stuff
release

then memory accesses may move from outside the acquire/release pair to the inside, but not the other way around (and they may not skip the acquire/release pair completely either).
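A familiar instance of this acquire / other stuff / release shape is a lock. The following is only a minimal sketch under my own assumptions: `SpinLock` and `count_with_spinlock` are hypothetical names made up for illustration, and a real program would normally use `std::mutex` instead.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical minimal spinlock, for illustration only.
class SpinLock {
public:
    void lock() {
        // Acquire: accesses inside the critical section may not move above this.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            // Busy-wait until another thread clears the flag.
        }
    }

    void unlock() {
        // Release: accesses inside the critical section may not move below this.
        flag_.clear(std::memory_order_release);
    }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Hypothetical helper: two threads each increment a plain int `per_thread` times.
int count_with_spinlock(int per_thread) {
    assert(per_thread >= 0);
    SpinLock lock;
    int counter = 0;  // Deliberately non-atomic; protected only by the lock.

    auto work = [&] {
        for (int i = 0; i < per_thread; ++i) {
            lock.lock();
            ++counter;  // The "other stuff" between acquire and release.
            lock.unlock();
        }
    };

    std::thread a(work);
    std::thread b(work);
    a.join();
    b.join();
    return counter;
}
```

Because the increment cannot leak out of the acquire/release pair, `count_with_spinlock(10000)` should return 20000 even though `counter` is a plain `int`.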

With the relaxed consistency semantics in your first example in the question, the hardware can reorder memory accesses such that the stores enter the memory system before the loads, thus allowing r1=r2=1. With the acquire/release semantics in the second example, that reordering is prevented, and thus r1=r2=1 is not possible.
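To experiment with both variants, something like the following harness could be used. It is a rough sketch, not a proof: `both_read_one` and `count_both_one` are hypothetical helper names, and the memory orders are passed as parameters so the same code covers the relaxed and the acquire/release example from the question.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical harness: runs the two-thread example once with the given
// memory orders and reports whether r1 == r2 == 1 was observed.
bool both_read_one(std::memory_order load_order, std::memory_order store_order) {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int r1 = 0;
    int r2 = 0;

    std::thread t1([&] {
        r1 = y.load(load_order);
        x.store(1, store_order);
    });
    std::thread t2([&] {
        r2 = x.load(load_order);
        y.store(1, store_order);
    });
    t1.join();  // After join(), r1 and r2 are safe to read from this thread.
    t2.join();
    return r1 == 1 && r2 == 1;
}

// Counts how often r1 == r2 == 1 shows up over `trials` independent runs.
int count_both_one(int trials, std::memory_order load_order,
                   std::memory_order store_order) {
    assert(trials >= 0);
    int hits = 0;
    for (int i = 0; i < trials; ++i) {
        if (both_read_one(load_order, store_order)) {
            ++hits;
        }
    }
    return hits;
}
```

With `std::memory_order_acquire` loads and `std::memory_order_release` stores, the count should always be 0. With `std::memory_order_relaxed` for both, a non-zero count is permitted, but a count of 0 proves nothing: thread start-up costs and the target hardware may make the reordering very rare or unobservable in practice.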

janneb
  • I'm not sure I understand your answer. – Giovanni Funchal May 26 '10 at 13:04
  • Hmm, does my clarification help? If not, what specifically do you not understand? – janneb May 26 '10 at 13:56
  • Take my second example and replace the releases with relaxed, keeping the acquires as they are. Is r1=r2=1 possible? Now start again from the original version of the second example, but this time replace the acquires with relaxed and keep the releases as they are. Is r1=r2=1 possible? – Giovanni Funchal May 26 '10 at 14:10
  • Hmm, I think in the second example, only one of the operations for each thread needs to be restricted in order to prevent r1=r2=1. In a more complicated example you probably want both, because you have some stuff between the acquire/release pair that is not allowed to "leak" out. More generally however, anything beyond the default sequential consistency semantics is best left to experts who are developing high-performance locking mechanisms or lock-free algorithms, that can then be used by mere mortals. – janneb May 26 '10 at 18:19
  • I'm not a mere mortal :-) Thanks for the clarifications. – Giovanni Funchal May 27 '10 at 07:45