Well, because the buffer elements are read/written by different threads, and you don't always have a happens-before relation between a write and a subsequent read. So if the buffer elements were not atomic, you would have a data race.
In case you are interested, you can take a look at my implementation of the chase-lev-deque: https://github.com/mpoeter/xenium/blob/master/xenium/chase_work_stealing_deque.hpp
Update
The problem is that the indexes can wrap around. Suppose a thread calling steal might get suspended after reading top
and bottom
, but before it can read the item from the buffer. If in the meantime the indexes wrap around, the item could be overwritten by some push operation. Therefore the load operation in steal would not have a happens-before relation with that store.
The standard defines a data race as follows:
The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens-before the other.
Since the described example does not provide happens-before relation between the read and write operations on the buffer, this would be a data race if the buffer is not atomic. However, atomics can never participate in a data race, so by making buffer atomic, we completely prevent any data races caused by unsynchronized access (even if such operations are relaxed).
Note that this is only necessary to prevent data races for operations on buffer. The actual synchronization between a push and a steal operation happens via the operations on bottom.