I was looking at the implementation for some parts of the .NET TPL's "Dataflow" library out of curiosity and I came across the following snippet:
private void GetHeadTailPositions(out Segment head, out Segment tail,
out int headLow, out int tailHigh)
{
head = _head;
tail = _tail;
headLow = head.Low;
tailHigh = tail.High;
SpinWait spin = new SpinWait();
//we loop until the observed values are stable and sensible.
//This ensures that any update order by other methods can be tolerated.
while (
//if head and tail changed, retry
head != _head || tail != _tail
//if low and high pointers, retry
|| headLow != head.Low || tailHigh != tail.High
//if head jumps ahead of tail because of concurrent grow and dequeue, retry
|| head._index > tail._index)
{
spin.SpinOnce();
head = _head;
tail = _tail;
headLow = head.Low;
tailHigh = tail.High;
}
}
(Viewable here: https://github.com/dotnet/corefx/blob/master/src/System.Threading.Tasks.Dataflow/src/Internal/ConcurrentQueue.cs#L345)
From what I understand of thread-safety, this operation is prone to a data race. I am going to explain my understanding, and then what I perceive to be the 'error'. Of course, I expect it is more likely an error in my mental model than in the library, and I'm hoping someone here can point out where I'm going wrong.
...
All the given fields (head
, tail
, head.Low
and tail.High
) are volatile. In my understanding this gives two guarantees:
- Each time all four fields are read, they must be read in-order
- The compiler may not elide any of the reads, and the CLR/JIT must take measures to prevent 'caching' of the values
From what I read of the given method, the following happens:
- An initial read of the internal state of the
ConcurrentQueue
is performed (that ishead
,tail
,head.Low
andtail.High
). - A single busy-wait spin is performed
- The method then reads the internal state again and checks for any changes
- If the state has changed, go to step 2, and repeat
- Return the read state once it is considered 'stable'
Now assuming that is all correct, my "problem" is thus: The reading of state above is not atomic. I see nothing preventing a read of half-written state (e.g. a writer thread has updated head
but not yet tail
).
Now I'm somewhat aware that half-written state in a buffer like this isn't the end of the world- after all the head
and tail
pointers are completely OK to be updated/read independently, usually in CAS/spin loops.
But then I don't really see what the point of spinning once and then reading again is. Are you really going to 'catch' a change-in-progress in the time it takes to do a single spin? What is it trying to 'guard' against? Put in other words: If the whole read of state is meant to be atomic, I don't think the method does anything to help that, and if not, what exactly is the method doing?