8

I am using a hardware interface to send data. It requires me to set up a DMA buffer, which needs to be aligned on 64-bit boundaries. The documentation says:

The DMA engine expects buffers to be aligned on at least 32-bit boundaries (4 bytes). For optimal performance the buffer should be aligned on 64-bit boundaries (8 bytes). The transfer size must be a multiple of 4 bytes.

I create buffers using posix_memalign, as demonstrated in the snippet below.

  posix_memalign((void**)&pPattern, 0x1000, DmaBufferSizeinInt32s * sizeof(int));

pPattern is a pointer to an int, and is the start of my buffer which is DmaBufferSizeinInt32s deep.

Is my buffer aligned on a 64-bit boundary?

JΛYDΞV
Krakkos

3 Answers

8

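Yes, your buffer IS aligned on 64 bits. It's ALSO aligned on a 4 KB boundary (hence the 0x1000, which is 4096 bytes). Any address that is a multiple of 4096 is also a multiple of 8, so the stricter alignment implies the weaker one. If you don't want the 4 KB alignment, pass 8 instead of 0x1000.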
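If you want to convince yourself, a minimal sketch along these lines checks the address directly. The helper names `is_aligned` and `dma_buffer_ok` are made up for illustration, and `dma_buffer_ok` only encodes the two hard rules quoted in the question (4-byte minimum alignment, size a multiple of 4 bytes), not anything specific to your driver.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if ptr is a multiple of alignment.
 * alignment must be a non-zero power of two. */
static int is_aligned(const void *ptr, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)ptr & (alignment - 1)) == 0;
}

/* Hypothetical check of the rules quoted in the question:
 * address on at least a 4-byte boundary, size a multiple of 4 bytes. */
static int dma_buffer_ok(const void *ptr, size_t size_bytes)
{
    return is_aligned(ptr, 4) && (size_bytes % 4) == 0;
}
```

Calling `is_aligned(pPattern, 8)` after your `posix_memalign` call should return 1 whenever the allocation succeeded with an alignment of 8 or any larger power of two.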

Edit: I would also note that usually when writing DMA chains you are writing them through uncached memory or through some kind of non-cache based write queue. If this is the case you want to align your DMA chains to the cache line size as well to prevent a cache write-back overwriting the start or end of your DMA chain.
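As a rough sketch of the cache-line point: if the buffer starts on a cache-line boundary and its size is padded up to a whole number of lines, no other data shares a line with it. The helper `round_up` is illustrative, and the 64-byte line size used in the example below is an assumption; check the real value for your CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds size up to the next multiple of line.
 * line must be a non-zero power of two (e.g. an assumed 64-byte cache line). */
static size_t round_up(size_t size, size_t line)
{
    assert(line != 0 && (line & (line - 1)) == 0);
    return (size + line - 1) & ~(line - 1);
}
```

For example, the 400 x 10 x 4 = 16000 byte buffer from the comments is already a whole number of 64-byte lines, so `round_up(16000, 64)` leaves it unchanged, while a 16001-byte buffer would be padded to 16064.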

Peter Mortensen
Goz
  • Not sure if I need the 4KByte boundary alignment... should I? – Krakkos Oct 30 '09 at 12:20
  • I am basically writing data objects, each of which is 10 x 32-bit words. I want to send a whole number of these objects each time. I'm currently DMA'ing 400 x 320-bit data objects in each DMA transfer. I'm not sure how the size of my buffer (400 x 10 x 32 bits) is related to the alignment, if at all. Should I tweak the size of the buffer? – Krakkos Oct 30 '09 at 12:23
  • I can't answer that question. I don't know what your platform is, for one. Under Windows, memory is allocated in 4K pages. This means you can only set an entire page to uncached at a time, and thus you may well need the 4K alignment. Alas, though, I cannot say for sure without knowing a lot more about your system ... – Goz Oct 30 '09 at 12:37
  • System is RedHat Enterprise Linux kernel 2.6.18.8. Running embedded on a single board computer. – Krakkos Oct 30 '09 at 13:43
  • Running on an x86? If so, I'd guess that Linux also uses 4K pages in the TLB, so the 4K alignment would be there to ensure you definitely aren't in the cache and aren't affecting things that are supposed to be cached. – Goz Oct 30 '09 at 13:52
  • @Krakkos - it seems like you might want to ask a question about pointers to good information or tutorials for performing DMA on Linux systems. It looks like buffer alignment issues might only be the start of what you need to know about what I'd guess is a pretty complex subject. It sure as hell is complex on Windows anyway. – Michael Burr Oct 30 '09 at 17:32
  • @Michael... That's a fair comment... I am stumbling around really. :) Luckily the underlying device driver takes care of all the actual DMA processing, but I have to allocate some memory, which I will use as a data buffer. I then pass a pointer to the buffer to the device driver and it does the DMA transfer. The only stipulation seems to be that it is aligned on a minimum of 64bits for good performance. – Krakkos Nov 03 '09 at 10:45
3

As Goz pointed out, but (imo) a bit less clearly: you're asking for alignment by 0x1000 bytes (the second argument), which is much more than 64 bits.

You could change the call to just:

posix_memalign((void**)&pPattern, 8, DmaBufferSizeinInt32s * sizeof(int));

This might make the call cheaper (less wasted memory), and in any case is clearer, since you ask for something that more closely matches what you actually want.
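One thing worth sketching alongside this: `posix_memalign` reports failure through its return value rather than through `errno`, and the alignment must be a power of two that is also a multiple of `sizeof(void *)`. The wrapper name `alloc_aligned` below is hypothetical; it only illustrates checking the result.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical wrapper: returns 0 on success, or the error code that
 * posix_memalign returned (EINVAL for a bad alignment, ENOMEM if out of
 * memory). *out is set to NULL first so a failed call leaves it NULL
 * on implementations that do not touch it. */
static int alloc_aligned(void **out, size_t alignment, size_t size)
{
    assert(out != NULL);
    *out = NULL;
    return posix_memalign(out, alignment, size);
}
```

With this, `alloc_aligned(&p, 8, n)` is expected to succeed, while an alignment such as 3 is rejected with `EINVAL` because it is not a power of two. Release the buffer with `free` when you are done.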

unwind
  • OK, I think I see now... the middle argument to `posix_memalign` is the alignment. And whilst my value was a multiple of 64 bits, it was actually set to 4096 bytes. – Krakkos Oct 30 '09 at 13:46
0

I don't know your hardware and I don't know how you are getting your pPattern pointer, but this seems risky all around. Most DMA I am familiar with requires physically contiguous RAM. The operating system only provides virtually contiguous RAM to user programs. That means that a memory allocation of 1 MB might be composed of up to 256 unconnected 4K RAM pages.

Much of the time a memory allocation will happen to be made of physically contiguous pieces, which can lead to things working most of the time but not always. You need a kernel device driver to provide safe DMA.
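To make the arithmetic concrete, here is a small sketch that counts how many pages a virtual address range touches. The function name `pages_spanned` is made up, and the 4096-byte page size in the example is an assumption about the platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of pages touched by the byte range [addr, addr + size).
 * size must be non-zero and page_size a non-zero power of two. */
static size_t pages_spanned(uintptr_t addr, size_t size, size_t page_size)
{
    assert(size > 0 && page_size != 0 && (page_size & (page_size - 1)) == 0);
    uintptr_t first = addr / page_size;
    uintptr_t last = (addr + size - 1) / page_size;
    return (size_t)(last - first + 1);
}
```

A page-aligned 1 MB range gives 256 pages with 4K pages. If the start is not page-aligned, the same range touches 257 pages. Each of those pages can map to a different physical frame.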

I wonder about this because if your pPattern pointer is coming from a device driver, then why do you need to align it more?

Zan Lynx