Simple C mutual exclusion implementation in a dual core AMP system

Question

I am working with the Zynq-7000 SoC, developing a dual-core (CPU0, CPU1) application. I want to use the shared on-chip memory (OCM) with the cache disabled for bidirectional data exchange between the cores. My idea is to set up the data sharing in the following manner:

#include <stdint.h>

typedef struct
{
  uint8_t mstatus;    /* 0 = buffer free, non-zero = data pending */
  uint8_t mdata[10];  /* message payload */
} mailbox;

mailbox core0mbox;
mailbox core1mbox;

Structure mailbox holds a buffer for storing the data (mdata) and its status (mstatus). The status may be equal to 0 or 1 (in general - a zero value indicates that the data has been processed by the receiver and new data may be written into the buffer; a non-zero value indicates that the data has not been processed by the receiver yet). There are two mailboxes - core0mbox (stores the data received by core 0 from core 1) and core1mbox (stores the data received by core 1 from core 0), both stored in the OCM.

When core 0 wants to send data, it polls the status flag core1mbox.mstatus:

- if it is equal to zero, the core fills the buffer of core1mbox with data and then sets the core1mbox flag to 1
- if it has a non-zero value, the core cannot send the data

When core 1 wants to send data, it polls the status flag core0mbox.mstatus:

- if it is equal to zero, the core fills the buffer of core0mbox with data and then sets the core0mbox flag to 1
- if it has a non-zero value, the core cannot send the data

Core 0 periodically polls core0mbox.mstatus. If it has a non-zero value, core 0 processes the data and, once finished, sets core0mbox.mstatus to 0.

Core 1 periodically polls core1mbox.mstatus. If it has a non-zero value, core 1 processes the data and, once finished, sets core1mbox.mstatus to 0.
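The send and receive steps described above can be sketched roughly as follows. This is only an illustration of the scheme: the function names `mbox_try_send` and `mbox_try_receive`, the `MBOX_DATA_LEN` constant, and the `volatile` qualifiers are assumptions added here, and placing the mailboxes in the OCM (for example through a linker section) is left out. The sketch does not by itself guarantee the order in which the stores become visible to the other core.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MBOX_DATA_LEN 10u  /* same size as mdata in the question */

typedef struct
{
  volatile uint8_t mstatus;               /* 0 = free, non-zero = data pending */
  volatile uint8_t mdata[MBOX_DATA_LEN];  /* payload */
} mailbox;

/* Sender side: dst is the mailbox owned by the OTHER core.
 * Returns false if the previous message has not been processed yet. */
static bool mbox_try_send(mailbox *dst, const uint8_t *src)
{
  if (dst->mstatus != 0)
    return false;                 /* receiver is still busy with old data */

  for (size_t i = 0; i < MBOX_DATA_LEN; i++)
    dst->mdata[i] = src[i];       /* fill the buffer first ... */

  dst->mstatus = 1;               /* ... then mark it as pending */
  return true;
}

/* Receiver side: own is the mailbox owned by THIS core.
 * Returns false if no message is pending. */
static bool mbox_try_receive(mailbox *own, uint8_t *out)
{
  if (own->mstatus == 0)
    return false;                 /* nothing to process */

  for (size_t i = 0; i < MBOX_DATA_LEN; i++)
    out[i] = own->mdata[i];       /* "process" the data by copying it out */

  own->mstatus = 0;               /* hand the buffer back to the sender */
  return true;
}
```

With this layout, core 0 would call `mbox_try_send(&core1mbox, buf)` to send and `mbox_try_receive(&core0mbox, buf)` in its polling loop, and core 1 would do the mirror image.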

My question is: could this scheme lead to unpredictable behavior of the system (e.g. data corruption) caused by the concurrent accesses? I know that this scheme would not work if the status flag could take more values (because of non-atomic read/write operations) or if there were more cores in the system, but it seems to work well for the situation described.

It depends on if operations on `uint8_t` are atomic. Usually on `int` they are, but I don't know about `uint8_t` on your system. — yyny, Nov 16 '16 at 15:32
I believe that this implementation would work even if the operations on `uint8_t` weren't atomic (because we only care if the status is equal to zero or it is not). — Steven, Nov 16 '16 at 16:40
Keep in mind that modern hardware is allowed to re-order memory accesses under circumstances... How do you ensure that the other core does not see `mstatus == 1` until after all the data has been written? You need at least some memory barriers, if not a full mutex approach. — twalberg, Nov 16 '16 at 16:55
@Steven It's not that simple. Some processors might write the variable bit by bit, clear it before writing, store garbage in it temporarily, use bitshifts, logic operators, etc. Furthermore, your compiler (and sometimes your processor, as @twalberg noted) may reorder memory accesses to increase performance, however this can be avoided using a simple `volatile` on the `mstatus`. However, as I explained, This is useless if operations on `uint8_t` aren't atomic. — yyny, Nov 18 '16 at 17:02
@Steven Now, on most processors common operations on `uint8_t` (At least read and write) _are_ atomic, and your code would work as expected (given you define `mstatus` `volatile`), so you should be alright. Otherwise, you'll have to use C11, or if that's not possible something platform specific, like linux's `futex` or GCC's `__atomic`. — yyny, Nov 18 '16 at 17:12

score 0 · Answer 1 · answered Jul 23 '17 at 16:09

Mutexes are not only about controlling the data that is directly affected but also about guaranteeing that the threads see each other's other data consistently. C before C11 does not have the definitions and tools to even speak about this, and you would be completely dependent on architecture-specific behavior.

What could happen in your setting, for example, is that because of some architectural quirk, changes to the status field become visible to the other thread while changes to the mdata fields have not yet been transferred (think of cache hierarchies, word alignment, ...).

So, no, you shouldn't do this unless you know exactly what data consistency model is implemented on your platform.

And the minimum requirement to make this work on any platform is to qualify your variables as volatile, because otherwise an optimizing compiler might pretend that no changes to any data can occur from outside the current function.
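For illustration, one possible way to express the required ordering with C11 tools is a release store on the status flag paired with an acquire load on the other side. This is a sketch, not something taken from the question: the names `atomic_mailbox`, `ambox_try_send`, `ambox_try_receive` and `AMBOX_DATA_LEN` are hypothetical, and it assumes a C11 toolchain with `<stdatomic.h>`. C11 defines its memory model for threads of one program, so whether it carries over to two separately built AMP images is also an assumption to check against your platform.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define AMBOX_DATA_LEN 10u

typedef struct
{
  _Atomic uint8_t mstatus;         /* 0 = free, non-zero = data pending */
  uint8_t mdata[AMBOX_DATA_LEN];   /* plain data, ordered by the flag */
} atomic_mailbox;

/* Sender: write the payload, then publish it with a release store. */
static bool ambox_try_send(atomic_mailbox *dst, const uint8_t *src)
{
  /* Acquire pairs with the receiver's release below, so the payload is
   * not overwritten while the receiver may still be reading it. */
  if (atomic_load_explicit(&dst->mstatus, memory_order_acquire) != 0)
    return false;

  for (size_t i = 0; i < AMBOX_DATA_LEN; i++)
    dst->mdata[i] = src[i];

  /* Release: the payload writes above are ordered before this store. */
  atomic_store_explicit(&dst->mstatus, 1, memory_order_release);
  return true;
}

/* Receiver: an acquire load of the flag orders the payload reads after it. */
static bool ambox_try_receive(atomic_mailbox *own, uint8_t *out)
{
  if (atomic_load_explicit(&own->mstatus, memory_order_acquire) == 0)
    return false;

  for (size_t i = 0; i < AMBOX_DATA_LEN; i++)
    out[i] = own->mdata[i];

  /* Release: the reads above are ordered before the buffer is handed back. */
  atomic_store_explicit(&own->mstatus, 0, memory_order_release);
  return true;
}
```

Only atomic loads and stores of the flag are used here, no read-modify-write operations, which keeps the sketch close to the original single-writer-per-state scheme.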
