ARM Cortex-A9 NEON and VFP

Question

I am using an ARM Cortex-A9 (Zynq-7000) and I want to enable NEON SIMD, but not have it used for floating-point operations unless I explicitly ask for it.

When compiling with arm-none-eabi-gcc using each of the following FPU options separately:

1. -mfpu=vfpv3 -mfloat-abi=softfp
2. -mfpu=neon-vfpv3 -mfloat-abi=softfp
3. -mfpu=neon -mfloat-abi=softfp

Binaries 1 and 2 are different, but 2 and 3 are the same (vectorization is not enabled). I am using -Og for optimization, which does not enable the vectorization options.

How can I make sure that all floating-point operations are done in VFP, not in NEON, when I use the option -mfpu=neon-vfpv3?

According to the ARM Architecture Reference Manual, NEON and VFP support similar instructions, which makes it difficult to tell them apart just by checking the disassembly.

Moreover, I am planning to use #pragma GCC ivdep for the loops and functions that I need to vectorize. What would be the appropriate compiler flags to achieve this?

Jake 'Alquimista' LEE · Accepted Answer · 2021-06-21T12:06:39.213


The compiler will never use any NEON instructions unless auto-vectorization is enabled or they are requested explicitly via intrinsics.
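
As a rough illustration of the intrinsics route, here is a minimal sketch. The function name add_f32 is hypothetical and not from the question; the intrinsics vld1q_f32, vaddq_f32 and vst1q_f32 come from arm_neon.h. The NEON path is guarded by the ACLE macro __ARM_NEON, so the same file should still build (as plain scalar code) when NEON is not enabled.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>   /* NEON intrinsics; only available on ARM targets */
#endif

/* Hypothetical helper: dst[i] = a[i] + b[i] for i in [0, n). */
void add_f32(float *dst, const float *a, const float *b, size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);
    size_t i = 0;
#if defined(__ARM_NEON)
    /* Explicit NEON: four floats per iteration in a 128-bit Q register. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(dst + i, vaddq_f32(va, vb));
    }
#endif
    /* Scalar tail; this is the whole loop when NEON is not enabled. */
    for (; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

With this approach only the code inside the guarded block asks for NEON; the rest of the translation unit is left to whatever the compiler would normally generate at the chosen optimization level.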

Even though NEON and VFP instructions look similar, they operate in different modes.

There are a few instructions shared by VFP and NEON on ARMv7 (mostly memory related), but they shouldn't be of any concern.

Why don't you post the disassemblies?

edited Jun 21 '21 at 12:06

answered Jun 21 '21 at 11:55


Thank you. However, it is difficult to post the disassemblies online. I checked with https://godbolt.org/ and realized that the -Og option prevents GCC from vectorizing even if -ftree-vectorize is enabled. – Salinda Jun 24 '21 at 05:49

score 0 · Answer 2 · answered Apr 07 '22 at 10:48

-mfpu=

In GCC (ARM), when -mcpu=cortex-a9 or -march=armv7-a is set, the options -mfpu=neon-vfpv3 and -mfpu=neon are identical.

See ‘+neon’ at https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html

-mfloat-abi=

soft: VFP is not used (floating-point operations go through library calls) and the base ARM calling convention is used.
softfp: VFP is used, but the base ARM calling convention is kept (ARM R registers are used to pass parameters to functions).
hard: VFP is used and the calling convention is FPU-specific (along with the ARM R registers, the VFP/NEON S and D registers are used to pass parameters to functions; the S/D registers carry floating-point parameters passed by value).
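
One way to check which of these settings a translation unit was actually built with is to look at the compiler's predefined macros. The sketch below assumes the usual GCC/ACLE behaviour (__ARM_PCS_VFP for the hard-float calling convention, __SOFTFP__ for -mfloat-abi=soft, __ARM_NEON when NEON is available); the function name describe_fp_config is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes a short description of the floating-point
   configuration this file was compiled with into buf. */
void describe_fp_config(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

    const char *abi;
#if !defined(__arm__)
    abi = "n/a (not a 32-bit ARM target)";
#elif defined(__ARM_PCS_VFP)
    abi = "hard";    /* FP arguments travel in S/D registers */
#elif defined(__SOFTFP__)
    abi = "soft";    /* no FP instructions, library calls instead */
#else
    abi = "softfp";  /* FP instructions, arguments in R registers */
#endif

#if defined(__ARM_NEON)
    const char *neon = "yes";
#else
    const char *neon = "no";
#endif

    snprintf(buf, len, "float-abi=%s neon=%s", abi, neon);
}
```

Printing the resulting string from a test program built with each flag combination is a quick sanity check before comparing disassemblies.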

Floating-point operations on NEON (SIMD)

Unless the option -funsafe-math-optimizations is set, GCC's auto-vectorizer does NOT use NEON for floating-point operations, because NEON does not fully implement IEEE 754 (in particular, denormal values are treated as zero).

VFP and NEON instructions in disassembly:

In the case of vmov:

VFP uses only vmov.f32 and vmov.f64.
NEON uses vmov.i64, vmov.i32, and so on.

Loop Vectorization

For loop vectorization, -ftree-vectorize can be used together with the -O2 or -O3 optimization level.

When -Og is used, loops may not be vectorized automatically.
vectorization of loops with neon
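
To tie this back to the #pragma GCC ivdep part of the question, here is a minimal sketch of a loop annotated for the vectorizer. The function name saxpy is hypothetical. The pragma only tells GCC to assume there are no loop-carried dependencies; it does not force vectorization, so the optimization flags still matter. A plausible flag set, as an assumption to verify on your toolchain, would be -O2 -ftree-vectorize -mfpu=neon-vfpv3 -mfloat-abi=softfp, plus -funsafe-math-optimizations because the loop works on floats.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: y[i] = a * x[i] + y[i] for i in [0, n). */
void saxpy(float *y, const float *x, float a, size_t n)
{
    assert(y != NULL && x != NULL);

    /* Tell GCC to assume no loop-carried dependencies between iterations.
       This is a hint for the vectorizer, not a command: the loop is only
       vectorized if the optimization flags enable vectorization. */
#pragma GCC ivdep
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Adding -fopt-info-vec to the command line makes GCC report which loops it vectorized, which is usually easier than reading the disassembly.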
