I am using an ARM Cortex-A9 (Zynq-7000) and I want to enable NEON SIMD, but not have it used for floating-point operations unless I explicitly ask for it.

When I compile with arm-none-eabi-gcc using each of the following FPU options (separately):

  1. -mfpu=vfpv3 -mfloat-abi=softfp
  2. -mfpu=neon-vfpv3 -mfloat-abi=softfp
  3. -mfpu=neon -mfloat-abi=softfp

binaries 1 and 2 are different, but 2 and 3 are the same (vectorization not enabled). I am using -Og for optimization, which does not enable the vectorization options.

How can I make sure that all floating-point operations are done in VFP, not NEON, when I use the option -mfpu=neon-vfpv3?

According to the ARM Architecture Reference Manual, NEON and VFP support similar instructions, which makes it difficult to tell them apart just by checking the disassembly.

Moreover, I am planning to use #pragma GCC ivdep for the loops and functions that I need to vectorize. What would be the appropriate compiler flags to achieve this?

Salinda

2 Answers

The compiler will never use any NEON instructions unless auto-vectorization is enabled or NEON is requested explicitly via intrinsics.

Even though NEON and VFP instructions look similar, each operates in a different mode.

There are a few instructions shared by VFP and NEON on ARMv7 (mostly memory-related), but they shouldn't be of any concern.
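To illustrate the "via intrinsics" route, here is a minimal sketch of explicitly requested NEON code. The function name add_f32 is made up for illustration; the intrinsics (vld1q_f32, vaddq_f32, vst1q_f32) come from arm_neon.h. The scalar fallback is my own addition so the same file still builds when NEON is not enabled.

```c
#include <stddef.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: dst[i] = a[i] + b[i] for n floats.
 * NEON is used only because the intrinsics ask for it explicitly. */
void add_f32(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
#if defined(__ARM_NEON)
    /* Process four floats per iteration in a 128-bit Q register. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(dst + i, vaddq_f32(va, vb));
    }
#endif
    /* Scalar tail (or the whole loop when NEON is unavailable). */
    for (; i < n; ++i) {
        dst[i] = a[i] + b[i];
    }
}
```

Everything outside such functions stays ordinary scalar code, so the rest of the program is unaffected by the NEON usage here.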

Why don't you post the disassemblies?

Jake 'Alquimista' LEE
  • Thank you. However, it is difficult to post the disassemblies online. I checked with https://godbolt.org/ and realized that the -Og option prevents GCC from vectorizing even if -ftree-vectorize is enabled. – Salinda Jun 24 '21 at 05:49

-mfpu=

-mfloat-abi=

  • soft: VFP is not used (floating-point operations become library calls) and the ARM calling convention is used
  • softfp: VFP is used, but with the ARM calling convention (ARM R registers are used to pass parameters to functions)
  • hard: VFP is used and the calling convention is specific to the hardware (along with the ARM R registers, the VFP/NEON S and D registers are used to pass parameters to functions; the S/D registers carry floating-point parameters passed by value)
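One way to double-check which of these settings a translation unit was really built with is to look at the compiler's predefined macros. The sketch below assumes GCC's usual behaviour (__ARM_PCS_VFP for the hard-float ABI, __SOFTFP__ when no FP instructions are generated, __ARM_NEON when NEON is available); the function names are invented for illustration.

```c
/* Hypothetical helper: reports the float ABI this file was compiled with.
 * Assumption: GCC defines __ARM_PCS_VFP for -mfloat-abi=hard and
 * __SOFTFP__ when it emits library calls instead of FP instructions. */
const char *float_abi_name(void)
{
#if !defined(__arm__)
    return "not a 32-bit ARM target";
#elif defined(__ARM_PCS_VFP)
    return "hard";
#elif defined(__SOFTFP__)
    return "soft";
#else
    return "softfp";
#endif
}

/* Returns 1 when the selected -mfpu makes NEON available, else 0. */
int neon_enabled(void)
{
#if defined(__ARM_NEON)
    return 1;
#else
    return 0;
#endif
}
```

The same macros can be listed without any source file by running the compiler with -dM -E on an empty input alongside the -mfpu and -mfloat-abi options of interest.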

Floating-point operations on NEON (SIMD)

  • Unless the option -funsafe-math-optimizations is set, GCC's auto-vectorizer does NOT use NEON for floating-point operations, because NEON does not fully implement IEEE 754 (for example, denormal values are treated as zero).

VFP and NEON instructions in disassembly:

In the case of vmov:

  • VFP uses only vmov.f32 and vmov.f64
  • NEON uses vmov.i64, vmov.i32, and so on.
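For arithmetic, the register names are another clue. A small scalar function like the sketch below (scale_add is a made-up name) can be compiled and disassembled to see which form the compiler picked. The instruction forms in the comments are only illustrations of what to look for, not actual compiler output.

```c
/* Hypothetical test function for inspecting the generated code.
 *
 * What to look for in the disassembly (register numbers will differ):
 *   VFP scalar, single precision: S registers, e.g. vmul.f32 s0, s0, s1
 *   VFP scalar, double precision: D registers with .f64, e.g. vadd.f64 d0, d0, d1
 *   NEON vector, single precision: D or Q registers with .f32,
 *                                  e.g. vmul.f32 q8, q8, q9
 * ARMv7 NEON has no .f64 arithmetic, so .f64 operations are always VFP. */
float scale_add(float a, float b, float c)
{
    return a * b + c;
}
```

If only S-register forms show up for code like this, the floating-point work is being done by VFP.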

Loop Vectorization

  • For loop vectorization, -ftree-vectorize together with the -O2 or -O3 optimization level can be used.

    When -Og is used, loops may not get vectorized automatically.

  • vectorization of loops with neon
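As a rough sketch of the loop from the question, the function below marks a loop with #pragma GCC ivdep. The name saxpy and the flag combination in the comment are my assumptions, not something verified on the Zynq. Since the loop works on floats, -funsafe-math-optimizations would be needed as well for GCC to use NEON here.

```c
#include <stddef.h>

/* Hypothetical example: y[i] += a * x[i].
 *
 * Assumed flags for the file that holds this function:
 *   -O2 -ftree-vectorize -funsafe-math-optimizations
 *   -mfpu=neon-vfpv3 -mfloat-abi=softfp
 * Other files can keep -Og so they stay scalar (VFP only). */
void saxpy(float a, const float *x, float *y, size_t n)
{
    /* Tells GCC to assume there are no loop-carried dependencies,
     * i.e. x and y do not overlap in a way that blocks vectorization. */
#pragma GCC ivdep
    for (size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

Keeping the loops to be vectorized in their own source file with their own flags is one way to limit NEON floating-point code to the places where it is wanted. GCC's -fopt-info-vec option can report which loops were actually vectorized.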

Salinda