I was reading Robert Nystrom's Crafting Interpreters section on NaN boxing/tagging, and was curious about his oblique mention of not wanting to collide with a special NaN value which he calls "Intel’s 'QNaN Floating-Point Indefinite'" (a hazard when storing data in specially crafted NaN payloads).

According to this reference sheet from Intel, the "Real Indefinite" is the NaN representation returned by various invalid arithmetic operations (0.0/0.0, -infinity + infinity, etc). The sheet defines the Real Indefinite as a QNaN with sign bit '1' and the payload zeroed (the payload being all fraction bits lower than the one used to distinguish quiet NaNs from signaling NaNs).
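To make that layout concrete, here is a minimal sketch that splits a 64-bit double bit pattern into the fields described above and checks for the Real Indefinite. The helper names are my own, not Intel's or Nystrom's, and it assumes the IEEE 754 binary64 layout (1 sign bit, 11 exponent bits, 52 fraction bits).

```cpp
#include <cstdint>

// Field accessors for an IEEE 754 binary64 bit pattern.
// All names here are illustrative assumptions, not from any library.
constexpr std::uint64_t signOf(std::uint64_t b)     { return b >> 63; }
constexpr std::uint64_t exponentOf(std::uint64_t b) { return (b >> 52) & 0x7FF; }
constexpr std::uint64_t quietBitOf(std::uint64_t b) { return (b >> 51) & 1; }
constexpr std::uint64_t payloadOf(std::uint64_t b)  { return b & 0x0007FFFFFFFFFFFFULL; }

// Real Indefinite: sign = 1, exponent all ones, quiet bit = 1, payload = 0.
constexpr bool isRealIndefinite(std::uint64_t b) {
    return signOf(b) == 1 && exponentOf(b) == 0x7FF &&
           quietBitOf(b) == 1 && payloadOf(b) == 0;
}

static_assert(isRealIndefinite(0xFFF8000000000000ULL),
              "sign + exponent + quiet bit, empty payload");
```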

To avoid arithmetic producing a NaN that the program disastrously misinterprets as one of the special tagged NaNs, Nystrom sets the highest payload bit to 1 when storing non-double types, which seems fine, but...

Wouldn't one achieve the same result by marking tagged NaNs with a zero sign bit (rather than using the most significant payload bit for this purpose as Nystrom does), with the added simplicity of keeping the usable bits contiguous?

Indefinite real:
_______________________
|1|1.....1|1|0.......0|
 ^ ^       ^ ^
 | exp     | payload
 sign      quiet NaN  


Tagged (sign flag):
_______________________
|0|1.....1|1|x.......x|


Tagged (payload flag):
_______________________
|x|1.....1|11|x......x|
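The two tagged layouts can be sketched as predicates on the raw 64 bits of a double. This is only an illustration of the diagrams; the constant and function names are mine, and the masks are not claimed to match the ones Nystrom actually uses.

```cpp
#include <cstdint>

// Illustrative masks for a binary64 value (assumed names).
constexpr std::uint64_t kSignBit    = 0x8000000000000000ULL; // bit 63
constexpr std::uint64_t kQuietNaN   = 0x7FF8000000000000ULL; // exponent all ones + quiet bit
constexpr std::uint64_t kPayloadTop = 0x0004000000000000ULL; // bit 50, highest payload bit

// "Tagged (sign flag)": a quiet NaN whose sign bit is 0; all payload bits are usable.
constexpr bool isTaggedBySign(std::uint64_t bits) {
    return (bits & (kSignBit | kQuietNaN)) == kQuietNaN;
}

// "Tagged (payload flag)": a quiet NaN whose highest payload bit is 1; sign bit is free.
constexpr bool isTaggedByPayload(std::uint64_t bits) {
    return (bits & (kQuietNaN | kPayloadTop)) == (kQuietNaN | kPayloadTop);
}

// Both schemes reject the Real Indefinite pattern itself.
constexpr std::uint64_t kRealIndefinite = 0xFFF8000000000000ULL;
static_assert(!isTaggedBySign(kRealIndefinite), "sign bit is 1");
static_assert(!isTaggedByPayload(kRealIndefinite), "payload is zero");
```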
Jack Harwood

1 Answer

Answering my own question immediately in case someone else gets the same idea...

No. There are other operations, not listed in that table, which can produce a NaN with a zeroed sign bit. It's better to use bits in the payload to distinguish tagged NaNs from NaNs produced by ordinary arithmetic (assuming the architecture and environment leave the payload vacant).

See this related answer quoting IEEE 754 2019 6.3:

When either an input or result is a NaN, this standard does not interpret the sign of a NaN. However, operations on bit strings—copy, negate, abs, copySign—specify the sign bit of a NaN result, sometimes based upon the sign bit of a NaN operand. The logical predicates totalOrder and isSignMinus are also affected by the sign bit of a NaN operand. For all other operations, this standard does not specify the sign bit of a NaN result, even when there is only one input NaN, or when the NaN is produced from an invalid operation…

This C++ example tests whether taking the absolute value changes the sign bit of a NaN (with gcc and clang on Godbolt, it currently does), which could cause a NaN to be misinterpreted under the sign-bit scheme:

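#include <cmath>
#include <cstdint>
#include <cstring>
#include <iostream>

int main() {
    // volatile keeps the compiler from folding the division at compile time,
    // so the NaN comes from the division performed at run time.
    volatile float n = 0.0f, d = 0.0f;
    float f = n / d;
    std::uint32_t bytes;

    // Copy the float's bit pattern into an integer so it can be inspected.
    std::memcpy(&bytes, &f, sizeof bytes);
    std::cout << "(0.0f/0.0f):\n";
    std::cout << " Sign bit: " << (bytes >> 31) << "\n";
    std::cout << " Hex:      " << std::hex << bytes << "\n";

    f = std::fabs(f);  // clears the sign bit, NaN or not

    std::memcpy(&bytes, &f, sizeof bytes);
    std::cout << "fabs(0.0f/0.0f):\n";
    std::cout << " Sign bit: " << (bytes >> 31) << "\n";
    std::cout << " Hex:      " << std::hex << bytes << "\n";
}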

(0.0f/0.0f):
 Sign bit: 1
 Hex:      ffc00000
fabs(0.0f/0.0f):
 Sign bit: 0
 Hex:      7fc00000
Jack Harwood
    Yup, `fabs` and `copysign` are typically just bit-manipulation with no special handling of NaN, especially on x86-64 and other modern ISAs where SIMD instructions use the same registers as scalar FP, so there are instructions like `andpd` / `orpd` / `xorpd`. https://godbolt.org/z/nWaf8svEn shows `fabs` compiling to one `andps` instruction. – Peter Cordes Apr 13 '23 at 01:49