2

I'm currently working on a project using the Zynq-7000 SoC. We have a custom DMA IP in PL to provide faster transactions between peripherals and main memory. The peripherals are generally serial devices such as UART. The data received by the serial device is transferred immediately to the main memory by DMA. What I try to do is to reach the data stored at a pre-determined location of the memory. Before reading the data, I invalidate the related cache lines using a function provided by xil_cache.h library as below.

Xil_DCacheInvalidateRange(INTPTR adr, u32 len);

The problem here is that this function flushes the related cache lines before invalidating them. Due to flushing, the stored data is overwritten. Hence, every time I fetch the corrupted bytes. The process has been explained in library documentation as below.

If the address to be invalidated is not cache-line aligned, the following choices are available:

  1. Invalidate the cache line when required and do not bother much for the side effects. Though it sounds good, it can result in hard-to-debug issues. The problem is, if some other variables are allocated in the same cache line and had been recently updated (in cache), the invalidation would result in loss of data.
  2. Flush the cache line first. This will ensure that if any other variable presents in the same cache line and updated recently are flushed out to memory. Then it can safely be invalidated. Again it sounds good, but this can result in issues. For example, when the invalidation happens in a typical ISR (after a DMA transfer has updated the memory), then flushing the cache line means, losing data that were updated recently before the ISR got invoked.

As you can guess that I cannot always allocate a memory region that has a cache-line aligned address. Therefore, I follow a different way to solve the problem so that I calculate the cache-line aligned address which is located in memory right before my buffer. Then I call the invalidation method with that address. Note that the Zynq's L2 Cache is an 8-way set-associative 512KB cache with a fixed 32-byte line size. This is why I mask the last 5 bits of the given memory address. (Check the section 3.4: L2 Cache in Zynq's documentation)

INTPTR invalidationStartAddress = INTPTR(uint32_t(dev2memBuffer) - (uint32_t(dev2memBuffer) & 0x1F));
Xil_DCacheInvalidateRange(invalidationStartAddress, BUFFER_LENGTH);

This way I can solve the problem but I'm not sure if I'm violating any of the resources that are placed before the resource allocated for DMA.(I would like to add that the referred resource is allocated at heap using the dynamic allocation operator new.) Is there a way to overcome this issue, or am I overthinking it? I believe that this problem could be solved better if there was a function to invalidate the related cache lines without flushing them.

EDIT: Invalidating resources that are not residing inside the allocated area violates the reliability of variables placed close to the referred resource. So, the first solution is not applicable. My second solution is to allocate a buffer that is 32-byte bigger than the required one and crop its unaligned part. But, this one also can cause the same problem as its last part*(parts = 32-byte blocks)* is not guaranteed to have 32 bytes. Hence, it might corrupt the resources placed next to it. The library documentation states that:

Whenever possible, the addresses must be cache-line aligned. Please note that not just the start address, even the end address must be cache-line aligned. If that is taken care of, this will always work.

SOLUTION: As I stated in the last edit, the only way to overcome the problem was to allocate a memory region with a Cache-Aligned address and length. I'm not able to determine the start address of the allocated area, hence I've decided to allocate a space that is two Cache-Blocks bigger than the requested one and crop the unaligned parts. The unalignment can occur at the first or the last block. In order not to violate the destruction of the resources, I saved the originally allocated address carefully and used the Cache-Aligned one in all of the operations.

I believe that there are better solutions to the problem and I keep the question open.

Caglayan DOKME
  • 948
  • 8
  • 21

1 Answers1

1

Your solution is correct. There is no way to flush a subset of a cache line.

Normally this behavior is transparent to programs but it becomes visible in multithreaded code and when sharing memory with hardware accelerators.

Jamey Hicks
  • 2,340
  • 1
  • 14
  • 20