
I'm using the Eigen library to do some computation on an iPad 2 (i.e., a Cortex-A9). It seems that some operations are vectorized using NEON instructions, while others aren't.

Operations that I've tried that get vectorized: dot products, vector and matrix additions and subtractions.

Operations that don't get vectorized: matrix multiplication.

I'm using these operations inside the same project and the same file, so the compiler options are the same. I'm using `-O3 -mcpu=cortex-a9 -mfpu=neon -mfloat-abi=softfp`.

All matrices that I'm using have Dynamic sizes. Is there anything I'm doing wrong, or is this the expected behaviour?

Thanks.

jabaldonedo
user1906
  • How precisely are you coming to the conclusion that some operations are vectorised and others not? Inspection of the code? Both GCC and Clang emit NEON instructions for floating-point operations on Cortex-A series parts with a NEON-equipped FP unit. Also, are you sure you really want `-mfloat-abi=softfp` for iOS? This is common in Linux-land where people like building software that is compatible with lots of different ARM arch versions - but with a nasty run-time penalty. Apple opts for fat binaries instead. – marko Jun 10 '13 at 13:36
  • I'm using Xcode Instruments to check the assembler code. For a dot product I see a bunch of `vadd` and `vmov`, but not for the matrix multiplication. Also, the dot product results in a big improvement over the OpenCV function (roughly 50%), however the matrix multiplication does not. – user1906 Jun 11 '13 at 07:08

1 Answer


When you use `-mfpu=neon`, gcc/clang will vectorize integer operations, but not floating-point ones, because NEON is not 100% IEEE-compliant (it doesn't support denormal numbers). You have to specify `-ffast-math` to make gcc/clang vectorize floating-point code with NEON. However, you must be careful, as `-ffast-math` can affect the numerical results.
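One way to check this claim on a given toolchain is to compile a small floating-point loop with and without `-ffast-math` and compare the generated assembly. The following is only a minimal sketch: the function name `axpy` is a hypothetical helper chosen for illustration, and whether the loop is actually auto-vectorized depends on the compiler and its version.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical test kernel: y[i] += alpha * x[i] for every element.
// The loop has no cross-iteration dependency, so it is a natural
// candidate for auto-vectorization when the compiler permits it.
void axpy(float alpha, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());  // both vectors must have the same length

    const float* xp = x.data();
    float* yp = y.data();
    const std::size_t n = x.size();

    for (std::size_t i = 0; i < n; ++i) {
        yp[i] += alpha * xp[i];
    }
}
```

Building this file once with the flags from the question and once with `-ffast-math` added, then looking for NEON instructions on `q` registers inside the loop body, shows whether the flag makes a difference for that compiler. This only tests the compiler's auto-vectorizer; it says nothing about code that a library vectorizes by hand.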

Marat Dukhan
  • No, actually Eigen explicitly vectorizes its operations when NEON is properly enabled, regardless of the presence of `-ffast-math`. I guess the problem is with the option `-mfloat-abi=softfp`, which should be removed. – ggael Jun 10 '13 at 14:39
  • `-mfloat-abi=softfp` means "use general-purpose registers for parameter passing, but hardware instructions for FP computations". This shouldn't be a problem, Android programs always use general-purpose registers for parameter passing. – Marat Dukhan Jun 10 '13 at 14:42
  • `-mfloat-abi=softfp` is a nasty compromise that has been adopted on ARM Linux systems, including Android. There are undoubtedly cases where this option is a big performance hit, especially considering that there's a huge pipeline stall when moving data between the NEON and integer units - which has to happen in function prologues and epilogues when arguments are passed in integer registers. One of the reasons it continues to be used is 'black-boxes' of IP supplied by SoC vendors - think OpenGL and user-space libraries for accessing camera hardware - which come built with this ABI convention. – marko Jun 11 '13 at 09:22
  • Incidentally, Apple-supplied `clang`s seem to happily vectorize when optimising without `-ffast-math` - and a search of the documentation isn't showing anything about this option either. I seem to recall this also being the case with the ancient version of GCC that ships with Xcode, but will be deprecated in the next release. – marko Jun 11 '13 at 09:36
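Since the comments disagree on whether NEON is "properly enabled", it can help to check what the preprocessor actually sees before reading assembly. The sketch below is an assumption-laden illustration, not part of the thread: `neon_status` is a hypothetical helper name, `__ARM_NEON` / `__ARM_NEON__` are the macros ARM compilers predefine when NEON is available, and `EIGEN_VECTORIZE_NEON` is only defined if an Eigen header such as `<Eigen/Core>` has been included earlier in the same file (the sketch itself does not include Eigen).

```cpp
#include <string>

// Hypothetical helper: reports which NEON-related macros are visible
// in this translation unit at compile time.
std::string neon_status() {
    std::string status;

    // Predefined by the compiler when NEON code generation is enabled.
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    status += "compiler: NEON enabled";
#else
    status += "compiler: NEON not enabled";
#endif

    // Defined by Eigen's headers when its NEON code paths are active.
    // Without a prior #include <Eigen/Core> this is never defined.
#if defined(EIGEN_VECTORIZE_NEON)
    status += "; Eigen: NEON vectorization on";
#else
    status += "; Eigen: NEON vectorization macro not defined";
#endif

    return status;
}
```

Printing the result of `neon_status()` from the same file that performs the matrix multiplication, with `<Eigen/Core>` included above it, would show whether Eigen's explicit NEON paths are compiled in for that file.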