I have two doubles, x and y, where y is known to be a particular NaN value, and I'd like to see if they are bitwise identical. That is, I want to determine if x is exactly the same NaN value as y.

NaNs cannot be usefully compared with the == operator, since a NaN never compares equal to any value (not even itself!).

Is there something better than the following "bitwise equal" approach (and is this approach legal?):

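#include <stdbool.h>
#include <string.h>

/* Returns true when x and y have identical object representations. */
bool bitwise_equal(double x, double y) {
    unsigned char xbytes[sizeof(x)];
    unsigned char ybytes[sizeof(y)];
    memcpy(xbytes, &x, sizeof(x));
    memcpy(ybytes, &y, sizeof(y));
    return memcmp(xbytes, ybytes, sizeof(x)) == 0;
}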
BeeOnRope
  • You do not need to copy the bytes first, just use `return 0 == memcmp(&x, &y, sizeof x);`. – Eric Postpischil Apr 18 '21 at 23:08
  • In C, many things are not “legal” versus “illegal”; they are defined by the C standard or not defined (and sometimes in-between, like implementation-defined or defined to be one of a set of possibilities or to satisfy certain properties). `memcmp(&x, &y, sizeof x);` is defined to compare the bytes. If `double` is an IEEE-754 binary type with no padding, it will return true if and only if they are the same number (counting −0 and +0 as different) or same NaN. (If it is a decimal type, there can be alternate representations of the same number.) – Eric Postpischil Apr 18 '21 at 23:12
  • Note that a NaN should propagate through various operations. In IEEE-754-conforming implementations, if exactly one operand of `c = a + b` is a NaN, `c` should be that NaN. If both operands are NaNs, `c` should be one of them. So you can expect that a `NaN` result of a chain of calculations will be one of the input NaNs or a NaN generated by an exception condition. However, a negation may change the sign of a NaN. So then it will be the same in payload (significand bits) but not fully bitwise identical. If you want to handle that case, you need to exclude the sign bit from the comparison. – Eric Postpischil Apr 18 '21 at 23:18
  • There are two types of NaN in 754 (see wiki : https://en.wikipedia.org/wiki/IEEE_754) A single point floating number is 4 bytes. So casting as uint and comparing is more efficient than doing a mem compare. – jdweng Apr 18 '21 at 23:30
  • @jdweng: If by “casting as a uint” you mean `* (uint32_t *) &x`, then the behavior of that is not defined by the C standard. That there are two types of NaN, signaling and quiet, is irrelevant unless OP wants to accept a signaling NaN that has been converted to a quiet NaN as the same as the original. The question asks about “doubles”, not “a single point floating number”, and a `double` is most commonly eight bytes, not four. – Eric Postpischil Apr 19 '21 at 00:14
  • @EricPostpischil - right, by "legal" I basically mean "is not UB". That is, I at least want an approach that can't be compiled away by a sufficiently smart compiler, although I realize I'll have to accept some platform-dependent behavior. – BeeOnRope Apr 19 '21 at 03:23
  • @BeeOnRope To Eric's point, please clarify in the question whether the QNaN vs SNaN conversion is relevant to your use case. For example, if I want to make sure that `r = op (a,b)` behaves properly with regard to NaN pass-through, I would typically check `bitwise_identical (r, snan_to_qnan(a)) || bitwise_identical (r, snan_to_qnan(b))`. Note that there may not be support for NaN payloads, as the use of a canonical NaN is compliant with the standard. – njuffa Apr 19 '21 at 04:22
  • @njuffa - my requirement is that the `NaN` with bit pattern `0x7ff00000000007a2` be recognized later on (no math is performed on it, only memcpy from a `uint64_t` containing the above pattern). Later, I want to check if this `double` field value has the above pattern. Specifically, this is R's "NA" (not available) value for doubles. – BeeOnRope Apr 19 '21 at 21:34
  • @BeeOnRope That is important information that it is *not* participating in any arithmetic operation, because on x86 (and some other architectures) this is an SNaN encoding, and the masked response of an arithmetic op would be to turn it into the corresponding QNaN, resulting in `0x7ff80000000007a2`. – njuffa Apr 19 '21 at 21:58
  • @njuffa - right, it turns out to be an important distinction because checking the R code, they [actually don't](https://github.com/wch/r-source/blob/79298c499218846d14500255efd622b5021c10ec/src/main/arithmetic.c#L125) care about the signalling part or not: they are _only_ checking the low order bits for the 1954 (`0x7a2`) magic value. So probably I never wanted a bitwise comparison in the first place, XY problem and all that. In any case, I'll keep my question as-is since it's too late to twist it around my specific requirement. – BeeOnRope Apr 19 '21 at 22:15
  • If you're only after x86_64, I think `vpcmpeq` could be useful, as you only need one 3-cycle-latency `vmovq` instruction to get the result back, as opposed to needing two to move the doubles to GPRs. I had to resort to inline assembly to get what I think is ideal code, but [here](https://godbolt.org/z/GKMEPYv1e) are two versions I think are a bit faster. Not sure how you could get the compiler to emit this without making your code x86_64-specific, though. – Noah May 01 '21 at 00:47
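
The requirement that emerges in the comments above (recognizing the pattern `0x7ff00000000007a2`, or only its 1954 payload) could be sketched roughly as follows. This is only an illustration: the helper names `double_bits`, `is_exact_na_pattern`, and `has_na_payload` are hypothetical, `double` is assumed to be a 64-bit IEEE-754 type, and "low order bits" is assumed here to mean the low 32 bits of the representation.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumption: double is a 64-bit IEEE-754 binary64 type. */
static_assert(sizeof(double) == sizeof(uint64_t), "expects a 64-bit double");

/* Hypothetical helper: copy the object representation into an integer. */
static uint64_t double_bits(double d) {
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}

/* Hypothetical helper: exact match against the pattern quoted in the comments. */
static bool is_exact_na_pattern(double d) {
    return double_bits(d) == UINT64_C(0x7ff00000000007a2);
}

/* Hypothetical helper: any NaN whose low 32 bits hold 1954 (0x7a2).
   The sign bit and the quiet/signaling bit are deliberately ignored. */
static bool has_na_payload(double d) {
    return isnan(d) && (uint32_t)(double_bits(d) & 0xffffffffu) == 1954u;
}
```

The payload check still matches after a signaling NaN has been quieted to `0x7ff80000000007a2`, whereas the exact-pattern check does not.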

1 Answer

Compare two doubles to see if they are the same NaN

if they are bitwise identical

I want to determine if x is exactly the same NaN value as y.

Directly comparing bit patterns with memcmp() is a reasonable approach.

C does not specify much detail about NaN payload sameness. Note: an implementation may define "same" to cover multiple bit patterns, for example two NaNs that differ only in the sign bit.

#include <math.h>
#include <stdbool.h>
#include <string.h>

bool NaN_bitwise_equal(double x, double y) {
    return isnan(x) && isnan(y) && memcmp(&x, &y, sizeof x) == 0;
}

"Same NaN value" is something of a contradiction, as NaNs do not compare equal numerically, but their payloads can be compared.
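
If NaNs that differ only in the sign bit should count as the same, one possible variant masks that bit out before comparing. This is a sketch, not part of the original answer: the name `NaN_equal_ignoring_sign` is hypothetical, and it assumes `double` is a 64-bit IEEE-754 type whose sign is the most significant bit.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumption: double is a 64-bit IEEE-754 binary64 type. */
static_assert(sizeof(double) == sizeof(uint64_t), "expects a 64-bit double");

/* Hypothetical variant: true when both arguments are NaNs that agree
   in every bit except possibly the sign bit. */
bool NaN_equal_ignoring_sign(double x, double y) {
    const uint64_t sign_mask = UINT64_C(1) << 63;
    uint64_t xb, yb;
    memcpy(&xb, &x, sizeof xb);  /* copy out the object representations */
    memcpy(&yb, &y, sizeof yb);
    /* XOR leaves a 1 wherever the patterns differ; drop the sign bit. */
    return isnan(x) && isnan(y) && ((xb ^ yb) & ~sign_mask) == 0;
}
```

The quiet/signaling bit is still part of this comparison, so a signaling NaN and its quieted counterpart are reported as different.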

user3386109
chux - Reinstate Monica
  • Passing the NaN by value might change its representation. Taking the arguments by pointer might be more viable for OP – M.M May 03 '21 at 04:33
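
A pointer-taking variant along the lines of the comment above might look like the following. It is only a sketch: the name `NaN_bitwise_equal_ptr` is hypothetical, and whether passing by value actually alters a NaN depends on the platform and calling convention.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical variant: compares the stored objects in place, so the
   bytes being compared are never passed through a by-value argument. */
bool NaN_bitwise_equal_ptr(const double *x, const double *y) {
    assert(x != NULL && y != NULL);  /* callers must pass valid pointers */
    return isnan(*x) && isnan(*y) && memcmp(x, y, sizeof *x) == 0;
}
```

A call would then look like `NaN_bitwise_equal_ptr(&field, &na_value)`, where both names stand for `double` objects already held in memory.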