I'm using an ARM Cortex-M7 microcontroller (specifically the STM32F767ZG) to communicate with external devices using 4 USARTs (configured as asynchronous transmitters/receivers, with DMA handling the transfers). While testing the (bare-metal) code, I noticed a data corruption issue, possibly related to the way the ARM core and/or the compiler handles variables in cache and RAM. See the following test code:
volatile char buffer[3];
// USART & DMA initialization code
// ...
buffer[0] = 0x11; //
buffer[1] = 0x22; // Buffer initial values
buffer[2] = 0x33; //
// Some other code
// ...
buffer[0] = 0xAA; //
buffer[1] = 0xBB; // Buffer updated values
buffer[2] = 0xCC; //
// DMA stream starts here
// ...
When I run the above code, the USART outputs the following:
0x11 (OLD value of buffer[0])
0x22 (OLD value of buffer[1])
0xCC (NEW value of buffer[2])
I suspect this is related to how the ARM core and/or the compiler handles variables and their storage in cache and RAM. It seems that the contents of buffer[] take some time to reach the actual RAM and, as a result, DMA picks up the old values. Note that, for the first two bytes, the USART Tx register is immediately free (due to the USART's internal buffering), so the first two bytes (buffer[0] and buffer[1]) are read almost instantly by DMA. For the third byte, there is a 1-byte transmission delay (which, at 9600 bps, is just over 1 ms), so in this case the MCU has plenty of time to update the RAM, hence the new value of buffer[2] is read by DMA.
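The "just over 1 ms" figure above can be checked with a small calculation. The sketch below assumes 8N1 framing (1 start bit, 8 data bits, 1 stop bit, so 10 bits per byte on the wire); the framing is not stated above, and `frame_time_us` is a hypothetical helper name used only for illustration.

```c
#include <stdio.h>

/* Time needed to shift one UART frame out, in microseconds.
 * bits_per_frame = start + data + parity + stop bits.
 * Hypothetical helper, not part of any vendor library. */
static double frame_time_us(unsigned baud, unsigned bits_per_frame)
{
    return 1e6 * (double)bits_per_frame / (double)baud;
}

/* Convenience wrapper that prints the result for a given setup. */
static void print_frame_time(unsigned baud, unsigned bits_per_frame)
{
    printf("%u baud, %u bits/frame: %.1f us per byte\n",
           baud, bits_per_frame, frame_time_us(baud, bits_per_frame));
}
```

Calling `print_frame_time(9600u, 10u)` under the 8N1 assumption gives roughly 1042 us per byte, which is several orders of magnitude longer than the 1 us delay discussed below.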
The corruption can be eliminated by adding a very small delay of just 1 microsecond before starting the DMA stream, like this:
...
Delay_us(1);
// DMA stream starts here
// ...
In this case, the USART sends the following (expected) data:
0xAA (NEW value of buffer[0])
0xBB (NEW value of buffer[1])
0xCC (NEW value of buffer[2])
In fact, the above delay can be fine-tuned (in the nanosecond range), so that only the first byte is old, and the next two bytes are new (i.e., USART sends 0x11, 0xBB, 0xCC).
My question is: how can I be absolutely sure that the actual RAM contents (to be read by DMA) reflect the buffer values I set in code? Adding a delay before initiating the DMA stream seems like a very crude and unreliable solution. Is there a definite way (a technique in C, or even an assembly instruction) to flush the MCU cache and transfer its contents to RAM, so that the buffer data in RAM is not corrupted?
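A minimal sketch of what such a flush might look like is shown below. It assumes the data cache is enabled, that the buffer lives in cacheable RAM, and that the CMSIS-Core header for the device is included before this code (none of which is confirmed above). `SCB_CleanDCache_by_Addr` is the CMSIS-Core cache-clean call for the Cortex-M7; `flush_for_dma`, `line_floor`, and `line_span` are hypothetical helper names. The helpers widen the range to whole 32-byte cache lines, since the cache is cleaned line by line.

```c
#include <stddef.h>
#include <stdint.h>

#define DCACHE_LINE_SIZE 32u /* Cortex-M7 data cache line size, in bytes */

/* Round an address down to the start of its cache line. */
static uintptr_t line_floor(uintptr_t addr)
{
    return addr & ~(uintptr_t)(DCACHE_LINE_SIZE - 1u);
}

/* Byte count covering every cache line touched by [addr, addr + len). */
static size_t line_span(uintptr_t addr, size_t len)
{
    uintptr_t end = (addr + len + DCACHE_LINE_SIZE - 1u)
                    & ~(uintptr_t)(DCACHE_LINE_SIZE - 1u);
    return (size_t)(end - line_floor(addr));
}

/* Hypothetical helper: write any cached copy of buf back to RAM
 * before a DMA stream reads it. */
static void flush_for_dma(const volatile void *buf, size_t len)
{
    uintptr_t start = line_floor((uintptr_t)buf);
    size_t span = line_span((uintptr_t)buf, len);
#if defined(__DCACHE_PRESENT) && (__DCACHE_PRESENT == 1U)
    /* Assumption: the CMSIS-Core device header is already included,
     * so this cache maintenance function is available. */
    SCB_CleanDCache_by_Addr((uint32_t *)start, (int32_t)span);
#else
    /* No data cache known at compile time (e.g. a host build): nothing to do. */
    (void)start;
    (void)span;
#endif
}
```

In the test code above, the call would be `flush_for_dma(buffer, sizeof buffer);` placed right after the buffer is updated and before the DMA stream starts. Whether this addresses the corruption depends on the cache actually being the cause.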