C Double Literal NaN with Payload

Question

I am using NaNs with payloads (so that the mantissa contains important information but is still treated as a NaN). For example one such value may be represented using IEEE-754-1985 in hex as FFF8000055550001 which has a sign bit of 1, the exponent of 7FF for a NaN/infinity, the quiet NaN bit set (at least on most architectures), and a payload of 0x55550001.

However, this has a few problems. First, such a value cannot easily be written as a literal in C/C++, since none of the common methods can be used to initialize a constant:

hex-literal notation for doubles would be the best option (see here), but it only seems to support non-NaN values (I can get infs by using p1024 and p-1024, but the mantissa is ignored in this case)
memcpy is nice as it avoids aliasing, but it requires a function call
reinterpret_cast is C++ only (which is okay), but it requires the operand to be a variable or pointer and not a literal
union type punning, but I don't think this can be used to initialize a static constant

Is there any method to set up a static constant for a NaN with a payload? It may be assumed that the system is IEEE-754-1985 compliant and that longs and doubles have the same endianness.

I guess you can use [union](https://stackoverflow.com/q/2148989/995714) for this purpose — phuclv, Dec 21 '17 at 10:25
"memcpy is nice as it avoids aliasing but requires a function call" --> No. A compiler can optimize the function call out. — chux - Reinstate Monica, May 19 '20 at 22:02

score 2 · Answer 1 · answered Dec 20 '17 at 18:18

memcpy does not need to be implemented with a function call. A good compiler will inline it and otherwise optimize it.
Constant union objects with static storage duration can be initialized.

Eric Postpischil


This is right, and the extra variables would likely be optimized away. However, for my specific usage the other solutions are slightly better. – coderforlife Dec 20 '17 at 19:29

dbush · Answer 2 · 2017-12-21T13:18:40.697

2

You can do this with a compound literal of union type: initialize its byte-array member, then read back its double member:

double d = ((union {unsigned char c[8]; double d; }){ .c={1,0,0,0,0,0,0xf0,0x7f} }).d;
printf("d=%f\n", d);

int i;
printf("d=0x");
for (i=0; i<sizeof(double); i++) {
    unsigned char c = ((unsigned char *)&d)[sizeof(double)-1-i];
    printf("%02x", c);
}
printf("\n");

Here, we have an anonymous union compound literal containing an array of unsigned char of size 8 and a double. We initialize the array member of the literal and read the double member to initialize the variable.

Output:

d=nan
d=0x7ff0000000000001

We can clean this up a bit with a macro, and also take care of the endianness:

#include <assert.h>  /* static_assert (C11) */
#include <endian.h>  /* __BYTE_ORDER, __BIG_ENDIAN, __LITTLE_ENDIAN (glibc) */

static_assert(sizeof(double)==8, "unexpected double size");

/* Arguments are given most significant byte first. */
#if __BYTE_ORDER == __BIG_ENDIAN
#  define DOUBLE_LIT(c1,c2,c3,c4,c5,c6,c7,c8) ((union {unsigned char c[8]; double d; }){ .c={c1,c2,c3,c4,c5,c6,c7,c8} }).d
#elif __BYTE_ORDER == __LITTLE_ENDIAN
#  define DOUBLE_LIT(c1,c2,c3,c4,c5,c6,c7,c8) ((union {unsigned char c[8]; double d; }){ .c={c8,c7,c6,c5,c4,c3,c2,c1} }).d
#else
#  error unknown endianness
#endif

Then we can use it like this:

double d = DOUBLE_LIT(0x7f,0xf0,0,0,0,0,0,1);

Note that the endianness check is system dependent. The above is how it is typically implemented on Linux.

edited Dec 21 '17 at 13:18

answered Dec 20 '17 at 18:19


I am getting errors now wherever this is used stating "`taking address of temporary array`". I tried with and without the `&`. Is this because the code is being compiled as C++ instead of C? Seems like that is what is happening according to https://stackoverflow.com/questions/32941846/c-error-taking-address-of-temporary-array. – coderforlife Dec 20 '17 at 19:12
@coderforlife Most likely. C and C++ are two different languages, and this is one of those places where they differ. If you're writing C code, then compile as C. – dbush Dec 20 '17 at 19:14
Sadly, this is actually running through something else that forces the compile-time options to treat it as C++ (I wish it didn't...). I will see if I can do something about it, but otherwise this does seem to work in my testing code. – coderforlife Dec 20 '17 at 19:19
also be careful with the alignment of the array in compound literal – phuclv Dec 21 '17 at 10:21
Those cast and dereference pairs are strict aliasing violations, and undefined behaviour. – user694733 Dec 21 '17 at 10:32
@user694733 Good point. Modified to use a union for proper alignment and to prevent aliasing issues. – dbush Dec 21 '17 at 13:14
The endianness of an integer is often the same as the endianness of the FP types, although it is not required to be so. The `__BYTE_ORDER == __BIG_ENDIAN` check is not an integer big/little issue as much as a `FP_ENDIAN` one. – chux - Reinstate Monica May 19 '20 at 22:05

Grzegorz Szpetkowski · Answer 3 · 2017-12-21T09:54:35.680

One possibility is GCC's (non-portable) __builtin_nan extension, which can be used to produce a compile-time NaN constant with a payload.

Referring to its documentation:

This function, if given a string literal all of which would have been consumed by strtol, is evaluated early enough that it is considered a compile-time constant.

Example:

#include <stdio.h>

#define MAKE_QNAN_WITH_PAYLOAD(sign, payload) \
    sign __builtin_nan(#payload)

double d = MAKE_QNAN_WITH_PAYLOAD(-, 0x55550001);

int main(void)
{
    // assume little-endian byte ordering
    for (int i = sizeof(double)-1; i >= 0; i--)
    {
        printf("%.2x", ((unsigned char *)&d)[i]);
    }
    putchar('\n');

    return 0;
}

Result:

fff8000055550001

This is a great solution as it has built-in handling of all the strange things like where the quiet bit goes (or if it is actually a signal bit), endianness, and what-not. Additionally, reading the doc led me to the C99 function nan() which works similarly (but not compile-time) which I am using as a fallback for non-GCC compilers. — coderforlife, Dec 20 '17 at 19:27
