
I am using the NEON memory copy with preload implementation from the ARM website with the Windows Embedded Compact 7 ARM assembler on a Cortex-A8 processor.

I notice that I get datatype misalignment exceptions when I pass that function addresses that are not word-aligned.

For example:

; NEON memory copy with preload
ALIGN
LEAF_ENTRY NEONCopyPLD
    PLD [r1, #0xC0]
    VLDM r1!,{d0-d7} ;datatype misalignment
    VSTM r0!,{d0-d7}
    SUBS r2,r2,#0x40
    MOV R0, #0
    MOV PC, LR
ENTRY_END

size_t size = /* arbitrary */;
size_t offset = 1;
char* src = new char[ size + offset ];
char* dst = new char[ size ];

NEONCopyPLD( dst, src + offset, size );

memcpy( dst, src + offset, size ); /* works perfectly */

Is this expected for the VLDM instruction? The article doesn't mention that this implementation is limited to word-aligned addresses. Is it fixable? If so, how?

PaulH
    For a fix, have a look at http://review.android.git.linaro.org/gitweb?p=platform/bionic.git;a=commitdiff;h=f1dd5e8c215b080bb2f4cf22 – auselen Dec 10 '12 at 16:58

1 Answer


Even if you don't specify an explicit alignment requirement, you still need to align the data on an element boundary (i.e. on a doubleword boundary in this case). There are some exceptions to this rule, but it's probably best not to rely on them unless you have a really good reason to do so.

See the Cortex-A8 technical reference manual (ARM DDI 0344J) for more information.
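
One way to avoid the fault without touching the assembly is to guard the call on the C++ side. The following is only a minimal sketch under stated assumptions: GuardedCopy, IsWordAligned and BulkCopy64 are hypothetical names that do not appear in the question, and BulkCopy64 is a plain memcpy stand-in for NEONCopyPLD so that the sketch builds on any host. It assumes the NEON routine needs both pointers word-aligned and a size that is a multiple of 64 bytes, and it falls back to memcpy otherwise. It sidesteps the exception; it does not make the NEON path handle unaligned data.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the assembly routine so the sketch builds anywhere.
// On the target this call would be replaced by the real NEONCopyPLD.
static void BulkCopy64(void* dst, const void* src, std::size_t size) {
    std::memcpy(dst, src, size);
}

// True when the address is a multiple of 4 (word-aligned).
static bool IsWordAligned(const void* p) {
    return (reinterpret_cast<std::uintptr_t>(p) & 0x3u) == 0;
}

// Copies size bytes from src to dst.
// Returns true if the fast path copied the 64-byte-multiple prefix,
// false if the whole copy went through memcpy.
bool GuardedCopy(void* dst, const void* src, std::size_t size) {
    // Largest multiple of 64 bytes that fits in size.
    const std::size_t bulk = size & ~static_cast<std::size_t>(0x3F);

    if (bulk != 0 && IsWordAligned(dst) && IsWordAligned(src)) {
        BulkCopy64(dst, src, bulk);
        // Copy the remaining tail (fewer than 64 bytes) bytewise.
        std::memcpy(static_cast<char*>(dst) + bulk,
                    static_cast<const char*>(src) + bulk,
                    size - bulk);
        return true;
    }

    // Misaligned pointers or a small buffer: use the general-purpose copy.
    std::memcpy(dst, src, size);
    return false;
}
```

With the buffers from the question, GuardedCopy( dst, src + offset, size ) would take the memcpy branch whenever src + offset is not word-aligned, so the VLDM is never reached with a misaligned address.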

Michael
  • Okay, the table in A3.2.1 says VLDM requires word alignment. But there are other variants (VLD1..4) that allow unaligned access. Can I replace VLDM with one of them and get the behavior I expect from memcpy? – PaulH Dec 10 '12 at 16:19
  • I would expect the element alignment requirement to apply to all those instructions, according to the "4.2.1 NEON data alignment" section in the reference manual. I won't swear on it though. – Michael Dec 10 '12 at 16:29