
I'm confused by the implementation of dcache_inval_poc(start, end): https://github.com/torvalds/linux/blob/v5.15/arch/arm64/mm/cache.S#L134. There is no sanity check on the "end" address, so what happens if the range (start, end) passed down from an upper layer, such as dma_sync_single_for_cpu/dma_sync_single_for_device, is larger than the L1 data cache? For example: dcache_inval_poc(start, start+256KB), when the L1 D-cache size is 32KB.

After going through the source code of dcache_inval_poc(start, end) (https://github.com/torvalds/linux/blob/v5.15/arch/arm64/mm/cache.S#L152), I tried to convert the loop to pseudo-code in C as follows:

    x0_kaddr = start;

    /* walk the range one cache line at a time */
    while (x0_kaddr < end) {
        dc_civac(x0_kaddr);
        x0_kaddr += cache_line_size;
    }

If "end - start" is larger than the L1 D-cache size, the loop will still run; however, the "x0_kaddr" address will no longer be present in the D-cache.

windxlnx

1 Answer


Your confusion comes from the fact that you are thinking of cache lines as somehow mapped on top of a memory range. But the function invalidates a range by virtual address, in terms of the available mapped memory.

As long as the start and end parameters are valid virtual addresses of general memory, that's fine.


The memory range does not have to be cached as a whole: only some of the data in the given range might be cached, or none at all.

Say there is a 2MB buffer in physical DDR memory that is mapped and can be accessed through virtual addresses.
Say L1 is 32KB.
Then at most 32KB of the 2MB buffer can be cached in L1 (or none at all). You don't know which part, if any, is in the cache.
For that reason you run a loop over the virtual addresses of the whole 2MB buffer. If a cache_line_size block of data is in the cache, that cache line is invalidated. If the data is not in the cache and only in DDR memory, the operation is basically a nop.
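To make the hit/miss behaviour concrete, here is a minimal sketch in C of a toy direct-mapped cache. It is only a software model, not the kernel code or real hardware: the names `toy_cache`, `toy_touch`, `toy_inval_line`, and `toy_inval_range` are hypothetical, and the 64-byte line size and 32KB capacity are assumptions chosen to match the example above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_SIZE 64u   /* assumed cache line size in bytes */
#define NUM_LINES 512u  /* 512 lines * 64 B = 32 KB toy cache */

/* One slot of a toy direct-mapped cache. */
struct toy_line {
    bool valid;
    uintptr_t tag;      /* line-aligned address currently held in this slot */
};

struct toy_cache {
    struct toy_line lines[NUM_LINES];
};

/* Pretend the CPU accessed addr, pulling its line into the cache. */
static void toy_touch(struct toy_cache *c, uintptr_t addr)
{
    uintptr_t line = addr & ~(uintptr_t)(LINE_SIZE - 1);
    struct toy_line *slot = &c->lines[(line / LINE_SIZE) % NUM_LINES];

    slot->valid = true;
    slot->tag = line;
}

/*
 * Model of "invalidate by address": drop the line if it is cached,
 * do nothing otherwise. Returns true only when a line was dropped.
 */
static bool toy_inval_line(struct toy_cache *c, uintptr_t addr)
{
    uintptr_t line = addr & ~(uintptr_t)(LINE_SIZE - 1);
    struct toy_line *slot = &c->lines[(line / LINE_SIZE) % NUM_LINES];

    if (slot->valid && slot->tag == line) {
        slot->valid = false;
        return true;
    }
    return false;       /* miss: nothing to invalidate */
}

/* Walk [start, end) line by line; return how many lines were really cached. */
static size_t toy_inval_range(struct toy_cache *c, uintptr_t start, uintptr_t end)
{
    size_t hits = 0;

    for (uintptr_t a = start & ~(uintptr_t)(LINE_SIZE - 1); a < end; a += LINE_SIZE)
        hits += toy_inval_line(c, a);
    return hits;
}
```

In this model, touching three addresses inside a 2MB range and then calling `toy_inval_range` over the whole range visits 32768 line addresses but only drops the three lines that were actually cached; every other iteration falls through the miss path.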

It's good practice to provide start and end addresses aligned to cache_line_size, because the low address bits within a cache line are ignored by the cache maintenance operation, and a loop that steps from an unaligned start might miss some data at the buffer tail.
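The alignment point can be checked with a little arithmetic. The following is a small sketch, not taken from the kernel: the helper names are hypothetical, and `line_size` is assumed to be a power of two. It compares how many lines a naive loop from an unaligned start visits with how many lines the range really overlaps.

```c
#include <stddef.h>
#include <stdint.h>

/* Round addr down to a line boundary (line_size must be a power of two). */
static uintptr_t line_align_down(uintptr_t addr, uintptr_t line_size)
{
    return addr & ~(line_size - 1);
}

/* Round addr up to a line boundary (line_size must be a power of two). */
static uintptr_t line_align_up(uintptr_t addr, uintptr_t line_size)
{
    return (addr + line_size - 1) & ~(line_size - 1);
}

/* Lines visited by a naive loop that starts at start and steps by line_size. */
static size_t lines_visited_naive(uintptr_t start, uintptr_t end, uintptr_t line_size)
{
    size_t n = 0;

    for (uintptr_t a = start; a < end; a += line_size)
        n++;
    return n;
}

/* Distinct cache lines that overlap the byte range [start, end). */
static size_t lines_covering(uintptr_t start, uintptr_t end, uintptr_t line_size)
{
    if (end <= start)
        return 0;
    return (line_align_up(end, line_size) - line_align_down(start, line_size)) / line_size;
}
```

With 64-byte lines, the range [0x20, 0x90) overlaps three lines (0x00, 0x40, 0x80), but a naive loop visits only 0x20 and 0x60 and never reaches the line at 0x80. Aligning start down and end up before looping avoids that gap.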

PS: if you want to operate directly on cache lines, there are other operations for that. They take way and set parameters to address cache lines directly.
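For illustration, here is a hedged sketch of how a set/way operand could be packed. It follows my reading of the operand layout described for `DC ISW` / `DC CISW` in the Arm architecture manual (way in the top bits of the low 32-bit word, set just above the line-offset bits, level in bits [3:1]); please verify against the manual before relying on it. The function names are hypothetical, and the cache geometry is passed in as parameters rather than read from the hardware ID registers.

```c
#include <stdint.h>

/* ceil(log2(n)) for 1 <= n <= 2^31. */
static unsigned ceil_log2(uint32_t n)
{
    unsigned bits = 0;

    while ((UINT32_C(1) << bits) < n)
        bits++;
    return bits;
}

/*
 * Pack a set/way operand (assumed layout, see the note above):
 *   way   -> bits [31 : 32 - A], with A = ceil(log2(associativity))
 *   set   -> bits starting at L,  with L = log2(line size in bytes)
 *   level -> bits [3:1], where 0 means L1, 1 means L2, and so on
 */
static uint64_t setway_operand(uint32_t way, uint32_t set, uint32_t level,
                               uint32_t assoc, uint32_t line_bytes)
{
    unsigned a = ceil_log2(assoc);
    unsigned l = ceil_log2(line_bytes);
    uint64_t op = ((uint64_t)set << l) | ((uint64_t)(level & 0x7) << 1);

    if (a > 0)          /* a direct-mapped cache has no way bits */
        op |= (uint64_t)way << (32 - a);
    return op;
}
```

For a 4-way cache with 64-byte lines, way 1 / set 2 at L1 packs to 0x40000080 under this layout. Note that set/way operations act on one specific cache and are generally meant for power-up and power-down sequences, not for DMA buffer maintenance, which is why the DMA path works by virtual address instead.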

user3124812
  • Thank you for your detailed reply. I used dma_sync_single_for_cpu in my driver and found that it ends up in dcache_inval_poc(start, end) on my arm64 platform. The 2nd argument of dma_sync_single_for_cpu is `dma_addr_t addr`, which is equal to the physical address on my platform because the IOMMU is disabled. So why do the [comments](https://github.com/torvalds/linux/blob/v5.15/arch/arm64/mm/cache.S#L149) and you say "Invalidate range by **virtual** address"? I think both start and end are **physical** addresses here. Can you please clarify? – windxlnx Jan 06 '23 at 03:06
  • No, the addresses are virtual. See the description of the `dc ivac` instruction (line 163), for example: https://developer.arm.com/documentation/ddi0601/2020-12/AArch64-Instructions/DC-IVAC--Data-or-unified-Cache-line-Invalidate-by-VA-to-PoC. It is `Invalidate by VA (Virtual Address)`. As a matter of fact, caches cannot be enabled without enabling the MMU. Once the MMU is enabled, all memory accesses are virtual. – user3124812 Jan 06 '23 at 03:26
  • Oops, you are right, there is an address translation from PA to VA before going into dcache_inval_poc: https://elixir.bootlin.com/linux/latest/source/arch/arm64/mm/dma-mapping.c#L27. One last question: you mentioned "If data is not in cache and only in DDR memory, that's basically a nop.", which is the key point for me. Can you please point me to any docs that would help me understand this clearly? Again, thank you for your kind reply, appreciated! – windxlnx Jan 06 '23 at 04:38
  • That sort of information is in the chip manuals. Look for the instruction descriptions (like `dc ivac`, `dc civac`) in the "ARMv8 Architecture Reference Manual". – user3124812 Jan 06 '23 at 05:46