
Consider this example, in which various rounding operations (round-up, round-down, round-toward-zero and round-to-nearest-with-ties-to-even) can all be expressed with a single roundsd instruction:

use_floor(double):
        roundsd xmm0, xmm0, 9
        ret
use_ceil(double):
        roundsd xmm0, xmm0, 10
        ret
use_trunc(double):
        roundsd xmm0, xmm0, 11
        ret
use_nearby(double):
        roundsd xmm0, xmm0, 12
        ret

Round-to-nearest-with-ties-away-from-zero, however, requires additional instructions:

use_round(double):
        movapd  xmm1, xmm0
        andpd   xmm0, XMMWORD PTR .LC1[rip]
        orpd    xmm0, XMMWORD PTR .LC0[rip]
        addsd   xmm0, xmm1
        roundsd xmm0, xmm0, 3
        ret

Why does this rounding mode require more instructions on x86 (unlike on Arm), and how do these bit operations on a floating-point value end up implementing the desired semantics?

Peter Cordes
soc
  • Why? Because the ISA designers did not include such a rounding mode. It works by isolating the sign bit then applying that to `0.499...` to create the appropriate signed offset which is then added so the round toward zero produces the correct result. It's basically `trunc(x < 0 ? x - 0.499 : x + 0.499)` – Jester Jan 06 '22 at 13:27
  • x86 chose to support those four rounding modes and no others, https://www.felixcloutier.com/x86/roundpd#fig-4-24. ARM chose to also support the fifth one. – Nate Eldredge Jan 07 '22 at 01:57

0 Answers