gcc seems to classify fcvtzs d0,d0 as a SIMD instruction, but clang does not. Who is right?

$ cat toto.s
    fcvtzs d0,d0
$ aarch64-linux-gnu-gcc-10 -mcpu=cortex-a53+nosimd -c toto.s
toto.s: Assembler messages:
toto.s:1: Error: selected processor does not support `fcvtzs d0,d0'
$ clang -target aarch64-linux-gnu -mcpu=cortex-a53+nosimd -c toto.s
Peter Cordes
David Monniaux
  • Do you actually have a Cortex-A53 device that supports floating-point but not SIMD? I'd expect that most chips would either support both, or support neither; in the latter case you would want `-mcpu=cortex-a53+nosimd+nofp` and then I assume both assemblers should reject `fcvtzs`. – Nate Eldredge Jun 19 '22 at 15:11
  • I'm trying to benchmark floating-point code with and without SIMD, and `+nosimd` seems to be the way to go to prevent gcc from generating SIMD instructions. – David Monniaux Jun 20 '22 at 09:54
  • Like in James's answer, I don't think there's any official categorization of instructions as "FP" versus "SIMD", so it may be just an arbitrary decision by the compiler authors. But for benchmarking, I would guess that what you want to look at is not the specific instructions, but the auto-vectorization optimization more generally. Which IIRC you can disable with `-fno-tree-vectorize`. – Nate Eldredge Jun 20 '22 at 19:06

1 Answer

You’re far into the arcane classification of instructions here; practically speaking, FP and Advanced SIMD are always available together.

I would read the Arm definition of FCVTZS as supporting GCC’s classification of the SISD form of FCVTZS (reading and writing D registers) as an instruction that requires +simd. The reasoning is the encoding class of the instruction (Scalar single-precision and double-precision) and the shared pseudocode calling CheckFPAdvSIMDEnabled64.

I say the question gets a bit arcane because the architecture pseudocode definition of CheckFPAdvSIMDEnabled64 looks like this!

AArch64.CheckFPAdvSIMDEnabled()
    AArch64.CheckFPEnabled();

One technicality: your error message comes from the assembler, not GCC; until recently these two tools also disagreed with each other.
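To look at what the compiler itself emits, as opposed to what the assembler accepts, a small C file is enough. This is only a sketch: the function names to_i64 and to_i64_array are made up for illustration, and the instruction selection described in the comments is what I would expect from an AArch64 compiler, not something guaranteed.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar double -> signed 64-bit conversion, truncating toward zero.
   Assumption: on AArch64 this is typically lowered to the
   general-register form `fcvtzs x0, d0`, which is a different
   encoding from the `fcvtzs d0, d0` in the question. */
int64_t to_i64(double x)
{
    return (int64_t)x;
}

/* The same conversion over an array (hypothetical helper).
   With auto-vectorization enabled the compiler may pick the
   vector form of the instruction for this loop. */
void to_i64_array(int64_t *dst, const double *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (int64_t)src[i];
}
```

Compiling this with -S and the -mcpu=cortex-a53+nosimd setting from the question, then again without +nosimd, should show which forms of fcvtzs the compiler is willing to generate in each case.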

James Greenhalgh