I am writing a custom Linux driver that needs to DMA memory between multiple PCIe devices. I have the following situation:
- I'm using dma_alloc_coherent to allocate memory for DeviceA
- I then use DeviceA to fill the memory buffer.
Everything is fine so far, but at this point I would like to DMA the memory to DeviceB, and I'm not sure of the proper way to do it.
For now I am calling dma_map_single for DeviceB using the CPU address returned from the dma_alloc_coherent call made for DeviceA. This seems to work fine on x86_64, but it feels like I'm breaking the rules because:
1. dma_map_single is supposed to be called with memory allocated from kmalloc ("and friends"). Is it a problem to call it with an address returned from another device's dma_alloc_coherent call?
2. If #1 is "ok", then I'm still not sure whether it is necessary to call the dma_sync_* functions, which are needed for memory mapped with dma_map_single. Since the memory was originally allocated with dma_alloc_coherent, it should be uncached memory, so I believe the answer is "dma_sync_* calls are not necessary", but I am not sure.
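To make the sequence concrete, here is a minimal sketch of what I am currently doing. The names `shared_buf`, `shared_buf_setup`, `shared_buf_teardown`, `dev_a` and `dev_b` are hypothetical, and the `#else` branch contains userspace stand-ins (not the real kernel API) that exist only so the snippet compiles outside a kernel tree. The `dma_map_single` call is the step I am unsure about.

```c
#ifdef __KERNEL__
#include <linux/dma-mapping.h>
#include <linux/errno.h>
#include <linux/gfp.h>
#else
/* Userspace stand-ins, NOT the real kernel API: illustration only. */
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
struct device { int unused; };
typedef uint64_t dma_addr_t;
typedef unsigned int gfp_t;
enum dma_data_direction { DMA_BIDIRECTIONAL, DMA_TO_DEVICE, DMA_FROM_DEVICE };
#define GFP_KERNEL 0u
static void *dma_alloc_coherent(struct device *dev, size_t size,
                                dma_addr_t *handle, gfp_t gfp)
{
    void *p = calloc(1, size);
    (void)dev; (void)gfp;
    *handle = (dma_addr_t)(uintptr_t)p;
    return p;
}
static void dma_free_coherent(struct device *dev, size_t size,
                              void *cpu_addr, dma_addr_t handle)
{
    (void)dev; (void)size; (void)handle;
    free(cpu_addr);
}
static dma_addr_t dma_map_single(struct device *dev, void *cpu_addr,
                                 size_t size, enum dma_data_direction dir)
{
    (void)dev; (void)size; (void)dir;
    return (dma_addr_t)(uintptr_t)cpu_addr;
}
static int dma_mapping_error(struct device *dev, dma_addr_t addr)
{
    (void)dev;
    return addr == 0;
}
static void dma_unmap_single(struct device *dev, dma_addr_t addr,
                             size_t size, enum dma_data_direction dir)
{
    (void)dev; (void)addr; (void)size; (void)dir;
}
#endif

/* Hypothetical bookkeeping for one buffer shared by both devices. */
struct shared_buf {
    void *cpu_addr;    /* CPU address from dma_alloc_coherent() */
    dma_addr_t dma_a;  /* bus address as seen by DeviceA */
    dma_addr_t dma_b;  /* bus address as seen by DeviceB */
    size_t size;
};

static int shared_buf_setup(struct device *dev_a, struct device *dev_b,
                            size_t size, struct shared_buf *buf)
{
    buf->size = size;

    /* Step 1: coherent allocation on behalf of DeviceA. */
    buf->cpu_addr = dma_alloc_coherent(dev_a, size, &buf->dma_a, GFP_KERNEL);
    if (!buf->cpu_addr)
        return -ENOMEM;

    /* Step 2 (not shown): DeviceA fills the buffer through buf->dma_a. */

    /* Step 3: the questionable part. The CPU address came from DeviceA's
     * coherent allocation, not from kmalloc(). */
    buf->dma_b = dma_map_single(dev_b, buf->cpu_addr, size, DMA_TO_DEVICE);
    if (dma_mapping_error(dev_b, buf->dma_b)) {
        dma_free_coherent(dev_a, size, buf->cpu_addr, buf->dma_a);
        return -ENOMEM;
    }
    return 0;
}

static void shared_buf_teardown(struct device *dev_a, struct device *dev_b,
                                struct shared_buf *buf)
{
    /* Undo in reverse order: DeviceB's mapping first, then the allocation. */
    dma_unmap_single(dev_b, buf->dma_b, buf->size, DMA_TO_DEVICE);
    dma_free_coherent(dev_a, buf->size, buf->cpu_addr, buf->dma_a);
}
```

Note that there are no dma_sync_* calls anywhere in this sketch, which is exactly what question 2 is about.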
I'm worried that I'm just getting lucky that this works, and that a future kernel update will break my driver, since it is unclear whether I'm following the API rules correctly. My code will eventually have to run on ARM and PPC too, so I need to make sure I'm doing things in a platform-independent manner instead of getting by with some x86_64-specific hack.
I'm using this as a reference: https://www.kernel.org/doc/html/latest/core-api/dma-api.html