ARM Floating Point Operations

Question

I have two questions about floating-point operations on ARM Cortex-M4, Cortex-M33 and Cortex-M0 cores with a floating-point co-processor.

Though the FPU is optional, almost all major ARM Cortex-M4 and Cortex-M33 implementations have one built into the core.

A Cortex-M0, by contrast, may have an FPU or math co-processor as a peripheral.

My questions

To use the FPU for floating-point operations, do I have to call functions like __aeabi_fadd given in the link, or will simple mathematical operators like +, -, / and * suffice?
I believe that on a Cortex-M0, which may have an FPU or math co-processor as a peripheral, we will need such functions, as in the case of the RP2040 (Raspberry Pi Pico).
Why do we have separate __aeabi_fsub and __aeabi_frsub? Shouldn't reversing the parameters suffice, or am I missing something?

__aeabi_fsub    2    float float    Return x minus y
__aeabi_frsub   2    float float    Return y minus x
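
To make the two table rows concrete, here is a minimal sketch in plain C of what they describe. The names model_fsub and model_frsub are hypothetical stand-ins used only for illustration; they are not the toolchain's routines, which are provided by the runtime library. The last function shows what ordinary application code would look like, assuming the compiler chooses the instruction or helper call itself.

```c
/* Hypothetical models of the documented semantics only.
   The real __aeabi_fsub / __aeabi_frsub live in the toolchain's runtime. */
float model_fsub(float x, float y)
{
    return x - y;   /* forward: x minus y */
}

float model_frsub(float x, float y)
{
    return y - x;   /* reverse: y minus x, same argument order */
}

/* Ordinary C source: just the operator, no explicit helper call. */
float difference(float a, float b)
{
    return a - b;
}
```

With these models, model_frsub(x, y) gives the same value as model_fsub(y, x), so the difference between the two is only in which argument position holds which operand.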

Cortex M4

Cortex M33

Those functions are the soft-float routines; there can and will be situations where a math operation from C does not map to something the hardware can do. I think forward and reverse may be, or have been, part of the spec and may matter for NaNs and such, but I don't know. It may also help with register allocation and/or flags, etc. — old_timer, Jul 31 '22 at 18:37
It is not built into any ARM core; it is a coprocessor. I had not heard that it was possible for an M0. M4 and M7, sure, possible. The chip vendor chooses what IP they want when they buy from ARM. Newer ones after M7 (M33, etc.) I don't know as well, but that is trivial to look up. — old_timer, Jul 31 '22 at 18:39
Also, I think they are fixed, like single only, not single, double, extended. You read the chip docs; the chip docs tell you what core they bought from ARM. Unfortunately the chip vendor does not always indicate what other options for their core they chose, so you have to do experiments sometimes. But you choose the architecture and floating-point options, or not, when you compile. — old_timer, Jul 31 '22 at 18:43
You can easily do simple experiments with simple function calls to see what the compiler generates with respect to these soft-float library functions vs real instructions. Compile to object and disassemble. — old_timer, Jul 31 '22 at 18:45
"or simple mathematical operators like +, -, /, * will suffice?": In what language, with what compiler? — Nate Eldredge, Jul 31 '22 at 20:06
@NateEldredge I am talking about the C language and the GNU ARM Embedded compiler, as the ARM Cortex-M series is generally found in microcontrollers. I am not talking about heterogeneous computing. — Dark Sorrow, Aug 01 '22 at 03:37
Remember, those marketing pictures show what is available. The chip company (ARM is not a chip company; ST, NXP, TI, etc. are) purchases a particular product and then has many customization options, including not using the FPU. — old_timer, Aug 01 '22 at 13:25
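
Since the floating-point options are chosen at compile time, one way to see what the compiler believes about the target is to test the ACLE macro __ARM_FP. The following is a sketch under the assumption that the toolchain follows the ACLE definition, where bit 2 (0x04) indicates hardware single precision and bit 3 (0x08) hardware double precision. The macro and function names HW_FLOAT, HW_DOUBLE, hw_float_supported and hw_double_supported are made up for this example.

```c
/* __ARM_FP is only defined when the compiler targets hardware floating point.
   Assumed ACLE bit layout: 0x04 = single precision, 0x08 = double precision. */
#if defined(__ARM_FP)
#  define HW_FLOAT  ((__ARM_FP & 0x04) != 0)
#  define HW_DOUBLE ((__ARM_FP & 0x08) != 0)
#else
   /* No hardware FP selected: arithmetic goes through library helpers. */
#  define HW_FLOAT  0
#  define HW_DOUBLE 0
#endif

/* Hypothetical helpers so the result can be inspected at run time. */
int hw_float_supported(void)
{
    return HW_FLOAT;
}

int hw_double_supported(void)
{
    return HW_DOUBLE;
}
```

This only reports what was passed to the compiler (for example via -mcpu, -mfpu and -mfloat-abi); it does not probe the silicon, so the chip documentation and a disassembly remain the real check.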

old_timer · Accepted Answer · 2022-08-01T13:31:57.657

float fun1 ( float a )
{
    return(a+2.0F);
}
double fun2 ( double a )
{
    return(a+3.0);
}

arm-none-eabi-gcc -O2 -c -mcpu=cortex-m4 -mfpu=vfp -mfloat-abi=hard so.c -o so.o
arm-none-eabi-objdump -d so.o

so.o:     file format elf32-littlearm


Disassembly of section .text:

00000000 <fun1>:
   0:   eddf 7a02   vldr    s15, [pc, #8]   ; c <fun1+0xc>
   4:   ee30 0a27   vadd.f32    s0, s0, s15
   8:   4770        bx  lr
   a:   bf00        nop
   c:   40000000    .word   0x40000000

00000010 <fun2>:
  10:   ed9f 7b03   vldr    d7, [pc, #12]   ; 20 <fun2+0x10>
  14:   ee30 0b07   vadd.f64    d0, d0, d7
  18:   4770        bx  lr
  1a:   bf00        nop
  1c:   f3af 8000   nop.w
  20:   00000000    .word   0x00000000
  24:   40080000    .word   0x40080000

arm-none-eabi-gcc -O2 -c -mcpu=cortex-m4 so.c -o so.o

arm-none-eabi-objdump -d so.o

so.o:     file format elf32-littlearm


Disassembly of section .text:

00000000 <fun1>:
   0:   b508        push    {r3, lr}
   2:   f04f 4180   mov.w   r1, #1073741824 ; 0x40000000
   6:   f7ff fffe   bl  0 <__aeabi_fadd>
   a:   bd08        pop {r3, pc}

0000000c <fun2>:
   c:   b508        push    {r3, lr}
   e:   2200        movs    r2, #0
  10:   4b01        ldr r3, [pc, #4]    ; (18 <fun2+0xc>)
  12:   f7ff fffe   bl  0 <__aeabi_dadd>
  16:   bd08        pop {r3, pc}
  18:   40080000    .word   0x40080000


arm-none-eabi-gcc -O2 -c -mcpu=cortex-m4 -mfpu=fpv4-sp-d16 -mfloat-abi=hard so.c -o so.o
arm-none-eabi-objdump -d so.o

so.o:     file format elf32-littlearm


Disassembly of section .text:

00000000 <fun1>:
   0:   eef0 7a00   vmov.f32    s15, #0 ; 0x40000000  2.0
   4:   ee30 0a27   vadd.f32    s0, s0, s15
   8:   4770        bx  lr
   a:   bf00        nop

0000000c <fun2>:
   c:   b508        push    {r3, lr}
   e:   ec51 0b10   vmov    r0, r1, d0
  12:   4b03        ldr r3, [pc, #12]   ; (20 <fun2+0x14>)
  14:   2200        movs    r2, #0
  16:   f7ff fffe   bl  0 <__aeabi_dadd>
  1a:   ec41 0b10   vmov    d0, r0, r1
  1e:   bd08        pop {r3, pc}
  20:   40080000    .word   0x40080000

Looking at the Cortex-M4 TRM, it says single precision, so you would get into trouble with the above code. GNU is not going to know about every chip and variation out there, and never will, so you, the programmer, have to tell the tools and/or write code that falls within the hardware rules. — old_timer, Aug 01 '22 at 13:27
ARM says FPv4-SP extensions; I don't know what the -d16 means. As with all of this, you should be able to figure this out. — old_timer, Aug 01 '22 at 13:32
vmov d0, r0, r1: that does not look good. If it were me, I would simply not do any double (remember to handle your constants properly). — old_timer, Aug 01 '22 at 13:34
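
As a small illustration of the point about constants: in C an unsuffixed constant such as 0.1 has type double, so mixing it with a float makes the whole expression double. The sketch below uses hypothetical function names and the constant 0.1 (not taken from the answer) to show the difference. On a single-precision-only FPU the second function would be expected to need double-precision helper calls, but that is an assumption to confirm by disassembling.

```c
#include <stddef.h>

/* Float constant (F suffix): the addition is done in float. */
float offset_single(float a)
{
    return a + 0.1F;
}

/* Unsuffixed constant is a double: a is converted to double,
   the addition is done in double, and the result is converted back. */
float offset_promoted(float a)
{
    return a + 0.1;
}

/* The expression types make the promotion visible without a disassembler. */
size_t size_single_expr(float a)
{
    return sizeof(a + 0.1F);   /* size of a float expression */
}

size_t size_promoted_expr(float a)
{
    return sizeof(a + 0.1);    /* size of a double expression */
}
```

Compiling both functions to an object file and disassembling, as done in the answer, shows whether the promoted version pulls in double-precision routines on a given target.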
