46

The C++ standard does not discuss the underlying layout of `float` and `double` types, only the range of values they should represent. (The same is true for signed integer types: is the representation two's complement or something else?)

My question is: what techniques are used to serialize/deserialize POD types such as `double` and `float` in a portable manner? At the moment it seems the only way to do this is to represent the value literally (as in `"123.456"`), since the IEEE 754 layout for `double` is not standard on all architectures.

9 Answers

29

Brian "Beej Jorgensen" Hall gives, in his Guide to Network Programming, some code to pack a float (resp. double) into a uint32_t (resp. uint64_t) so that it can be safely transmitted over the network between two machines that may not agree on their representation. It has some limitations, mainly that it does not support NaN and infinity.

Here is his packing function:

#include <stdint.h>

#define pack754_32(f) (pack754((f), 32, 8))
#define pack754_64(f) (pack754((f), 64, 11))

uint64_t pack754(long double f, unsigned bits, unsigned expbits)
{
    long double fnorm;
    int shift;
    long long sign, exp, significand;
    unsigned significandbits = bits - expbits - 1; // -1 for sign bit

    if (f == 0.0) return 0; // get this special case out of the way

    // check sign and begin normalization
    if (f < 0) { sign = 1; fnorm = -f; }
    else { sign = 0; fnorm = f; }

    // get the normalized form of f and track the exponent
    shift = 0;
    while(fnorm >= 2.0) { fnorm /= 2.0; shift++; }
    while(fnorm < 1.0) { fnorm *= 2.0; shift--; }
    fnorm = fnorm - 1.0;

    // calculate the binary form (non-float) of the significand data
    significand = fnorm * ((1LL<<significandbits) + 0.5f);

    // get the biased exponent
    exp = shift + ((1<<(expbits-1)) - 1); // shift + bias

    // return the final answer
    return (sign<<(bits-1)) | (exp<<(bits-expbits-1)) | significand;
}
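For completeness, the guide pairs this with an unpacking function; here is a sketch essentially as it appears there, with the same limitations (no NaN or infinity handling):

```cpp
#include <cstdint>

#define unpack754_32(i) (unpack754((i), 32, 8))
#define unpack754_64(i) (unpack754((i), 64, 11))

long double unpack754(uint64_t i, unsigned bits, unsigned expbits)
{
    long double result;
    long long shift;
    unsigned bias;
    unsigned significandbits = bits - expbits - 1; // -1 for sign bit

    if (i == 0) return 0.0;

    // pull out the significand and restore the implicit leading 1
    result = (long double)(i & ((1LL << significandbits) - 1)); // mask
    result /= (1LL << significandbits); // convert back to a fraction
    result += 1.0;                      // add the one back on

    // undo the exponent bias, then scale by powers of two
    bias = (1 << (expbits - 1)) - 1;
    shift = (long long)((i >> significandbits) & ((1LL << expbits) - 1)) - bias;
    while (shift > 0) { result *= 2.0; shift--; }
    while (shift < 0) { result /= 2.0; shift++; }

    // apply the sign bit
    result *= ((i >> (bits - 1)) & 1) ? -1.0 : 1.0;

    return result;
}
```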
Sylvain Defresne
    it shouldn't be hard to include NaN, infinity and denormalized numbers if you need them. Moreover this code is public domain, which makes it a great answer. – Alexandre C. Jan 19 '11 at 09:53
    Would an `frexp`-based approach be consistently faster than repeated floating point division / multiplication? `frexp` gives you `exp` and `fnorm` in a single call. Keep in mind IEEE 754 double has 11 bits worth of exponent so you could be dividing / multipying by 2 several hundred times. – jw013 Jan 04 '13 at 22:47
    @jw013 What would an `frexp`-based approach look like in this situation? I'm struggling with floating-point serialization now, and while the `frexp` approach seems interesting, I can't figure out how to convert the mantissa (which is between 0.5 and 1) to the series of bits representing the significand in an IEEE float or double. Is there an efficient and portable way to do that? – Ricky Stewart Jan 11 '14 at 07:17
  • Can someone explain to me how `significand = fnorm * ((1LL< – abhiarora Jul 23 '19 at 18:31
  • While this is an excellent answer it should be noted that this solution might be dangerous if `infinity` can occur as a value for `f`: In that case the first while-loop would never terminate and block the program. – n0dus Feb 03 '23 at 14:55
7

What's wrong with a human-readable format?

It has a couple of advantages over binary:

  • It's readable
  • It's portable
  • It makes support really easy
    (as you can ask the user to look at it in their favorite editor, even Word)
  • It's easy to fix
    (or adjust files manually in error situations)

Disadvantage:

  • It's not compact
    If this is a real problem you can always zip it.
  • It may be slightly slower to extract/generate
    Note a binary format probably needs to be normalized as well (see htonl())

To output a double at full round-trip precision:

double v = 2.20;
std::cout << std::setprecision(std::numeric_limits<double>::max_digits10) << v;

`max_digits10` (17 for IEEE 754 double, available since C++11) is the number of decimal digits needed so that converting the text back to a double recovers the exact value; the `digits` member counts mantissa *bits*, and `digits10` is one decimal digit short of a guaranteed round trip.
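A quick sketch of a round-trip check using C++11's `max_digits10` (text produced this way parses back to the bit-identical double on implementations with correctly rounded conversions):

```cpp
#include <limits>
#include <sstream>
#include <string>

// Serialize a double as text with max_digits10 significant digits
// (17 for IEEE 754 binary64).
std::string to_text(double v)
{
    std::ostringstream os;
    os.precision(std::numeric_limits<double>::max_digits10);
    os << v;
    return os.str();
}

// Parse the text back into a double.
double from_text(const std::string& s)
{
    std::istringstream is(s);
    double v = 0.0;
    is >> v;
    return v;
}
```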

Étienne
Martin York
    Additional disadvantage: It is not precise. The importance of this can vary greatly between applications. – Magnus Hoff Jan 19 '11 at 08:45
  • @Magnus Hoff: In what way is it not precise? If you output the value with all precision and the destination uses the same precision then you will lose nothing. If the destination does not use the same precision then all bets are off, but there is no loss between this and a binary format. – Martin York Jan 19 '11 at 08:49
    +1 even if there may be other disadvantages: it is more expensive to generate/parse --will only affect performance in applications that mostly read/write data, but still. Size does affect there too, and zip-ping will make the performance worse even... Still, a good solution in *almost all* real world cases 99.9% of the time. – David Rodríguez - dribeas Jan 19 '11 at 08:51
    @Martin: Literal representation is very slow to decode, I'm working on a system that processes very very large time-series and compact, precise and high-speed decodable representations are a must - portability is important as well. –  Jan 19 '11 at 08:59
    @Martin: Hm. I don't think I've ever witnessed a formatting function that can be configured to write out all the precision for a floating point number. If it exists, then there is of course no loss. So my concern is kind of related to the "It's not compact"-disadvantage: You end up with a trade-off between a reasonably sized representation and a precise one. (Again, the importance of either of these vary between applications) – Magnus Hoff Jan 19 '11 at 09:01
    @Martin: Ah, `numeric_limits`. Unfortunately, there is still precision loss with the use of this for some corner cases. See http://codepad.org/WbACWihl – Magnus Hoff Jan 19 '11 at 09:26
  • @Magnus: you can output the exact double value using the `%a` printf conversion, although the output is not exactly human readable. – Maxim Egorushkin Jan 19 '11 at 10:25
    @Zenikoder: it is a POSIX requirement. http://www.opengroup.org/onlinepubs/000095399/functions/printf.html – Maxim Egorushkin Jan 19 '11 at 12:54
  • @Maxim: So what you're saying is that it won't work on Windows or under the current C++ standard. –  Jan 19 '11 at 20:21
  • @Zenikoder: I must confess, Linux and g++ spoiled me. :) – Maxim Egorushkin Jan 19 '11 at 23:31
    For 4 bytes, you can store exactly __4__ characters in human readable form (`17.3` -- that's it.). Compare with the precision and range of values you can get with those same 4 bytes in IEEE 754 float format. – bobobobo Sep 25 '12 at 16:47
  • Other disadvantage: it might not match what you want to use and do not have control over. – spectras Nov 28 '21 at 23:24
  • @spectras That is true for all formats (its not specific to human-readable text based formats). – Martin York Nov 29 '21 at 19:56
5

Take a look at the (old) gtypes.h file implementation in glib 2 - it includes the following:

#if G_BYTE_ORDER == G_LITTLE_ENDIAN
union _GFloatIEEE754
{
  gfloat v_float;
  struct {
    guint mantissa : 23;
    guint biased_exponent : 8;
    guint sign : 1;
  } mpn;
};
union _GDoubleIEEE754
{
  gdouble v_double;
  struct {
    guint mantissa_low : 32;
    guint mantissa_high : 20;
    guint biased_exponent : 11;
    guint sign : 1;
  } mpn;
};
#elif G_BYTE_ORDER == G_BIG_ENDIAN
union _GFloatIEEE754
{
  gfloat v_float;
  struct {
    guint sign : 1;
    guint biased_exponent : 8;
    guint mantissa : 23;
  } mpn;
};
union _GDoubleIEEE754
{
  gdouble v_double;
  struct {
    guint sign : 1;
    guint biased_exponent : 11;
    guint mantissa_high : 20;
    guint mantissa_low : 32;
  } mpn;
};
#else /* !G_LITTLE_ENDIAN && !G_BIG_ENDIAN */
#error unknown ENDIAN type
#endif /* !G_LITTLE_ENDIAN && !G_BIG_ENDIAN */

glib link
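Note that the layout of bit-fields is implementation-defined, so the union trick above ties you to a particular compiler ABI. A sketch of the same decomposition using `memcpy` and shifts instead (assuming the host's double is IEEE 754 binary64):

```cpp
#include <cstdint>
#include <cstring>

// Decompose a double into sign / biased exponent / mantissa without
// relying on bit-field layout.
struct DoubleParts {
    unsigned sign;            // 1 bit
    unsigned biased_exponent; // 11 bits
    uint64_t mantissa;        // 52 bits
};

DoubleParts decompose(double d)
{
    uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits); // well-defined, unlike union punning in C++
    DoubleParts p;
    p.sign            = static_cast<unsigned>(bits >> 63);
    p.biased_exponent = static_cast<unsigned>((bits >> 52) & 0x7FF);
    p.mantissa        = bits & ((uint64_t(1) << 52) - 1);
    return p;
}
```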

user1016736
4

Just write the binary IEEE 754 representation to disk, and document this as your storage format (along with its endianness). Then it's up to the implementation to convert this to its internal representation if necessary.
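A sketch of that approach, picking little-endian as the documented byte order (assumes the host double is IEEE 754 binary64; the explicit byte loop makes host endianness irrelevant):

```cpp
#include <cstdint>
#include <cstring>

// Store a double as 8 bytes, least significant byte first,
// regardless of host endianness.
void store_double_le(double d, unsigned char out[8])
{
    uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    for (int i = 0; i < 8; ++i)
        out[i] = static_cast<unsigned char>(bits >> (8 * i));
}

// Rebuild the double from the documented byte order.
double load_double_le(const unsigned char in[8])
{
    uint64_t bits = 0;
    for (int i = 0; i < 8; ++i)
        bits |= static_cast<uint64_t>(in[i]) << (8 * i);
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}
```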

TonyK
2

Create an appropriate serializer/de-serializer interface for writing/reading this.

The interface can then have several implementations and you can test your options.

As said before, obvious options would be:

  • IEEE 754, which reads/writes the binary chunk directly if the architecture uses that format internally, or parses/builds it if it does not
  • Text: always needs to parse.
  • Whatever else you can think of.

Just remember: once you have this layer, you can always start with IEEE 754 if you only support platforms that use this format internally. That way you incur the additional effort only when you need to support a different platform. Don't do work you don't have to.
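A minimal sketch of such an interface (names are illustrative, not from the question; the raw implementation omits endianness handling for brevity):

```cpp
#include <cstring>
#include <vector>

// Hypothetical serializer interface with swappable implementations.
struct DoubleSerializer {
    virtual ~DoubleSerializer() = default;
    virtual void write(double d, std::vector<unsigned char>& out) const = 0;
    virtual double read(const unsigned char* in) const = 0;
};

// Fast path for platforms that already use IEEE 754 binary64 internally:
// just copy the bytes.
struct RawIEEE754Serializer : DoubleSerializer {
    void write(double d, std::vector<unsigned char>& out) const override {
        unsigned char buf[sizeof d];
        std::memcpy(buf, &d, sizeof d);
        out.insert(out.end(), buf, buf + sizeof d);
    }
    double read(const unsigned char* in) const override {
        double d;
        std::memcpy(&d, in, sizeof d);
        return d;
    }
};
```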

Tobias Langner
1

You should convert them to a format you will always be able to use in order to recreate your floats/doubles.

This could be a string representation or, if you need something that takes less space, your number represented in IEEE 754 (or any other format you choose), which you then parse as you would a string.

peoro
  • Are there any libraries that take a double and convert it into a specific binary format? At the moment all we're doing is writing the in-memory layout to disk, which is OK, but in a heterogeneous environment it won't work out quite as well. –  Jan 19 '11 at 08:54
  • I guess there are some, but I don't know of any, sorry. – peoro Jan 19 '11 at 08:59
0

SQLite4 uses a new format to store doubles and floats:

  • It works reliably and consistently even on platforms that lack support for IEEE 754 binary64 floating point numbers.
  • Currency computations can normally be done exactly and without rounding.
  • Any signed or unsigned 64-bit integer can be represented exactly.
  • The floating point range and accuracy exceed that of IEEE 754 binary64 floating point numbers.
  • Positive and negative infinity and NaN (Not-a-Number) have well-defined representations.

Sources:

https://sqlite.org/src4/doc/trunk/www/design.wiki

https://sqlite.org/src4/doc/trunk/www/decimal.wiki
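The underlying idea, independent of SQLite4's actual on-disk encoding, is a decimal pair m × 10^e with integer significand and exponent; a toy sketch showing why currency sums come out exact:

```cpp
#include <cstdint>

// Toy decimal representation in the spirit of SQLite4's format: the value
// is m * 10^e. This is an illustration, NOT SQLite4's actual encoding.
struct Decimal {
    int64_t m; // significand
    int     e; // base-10 exponent
};

// Add two decimals exactly by aligning exponents downward
// (assumes no int64_t overflow, which the toy example ignores).
Decimal add(Decimal a, Decimal b)
{
    while (a.e > b.e) { a.m *= 10; a.e--; }
    while (b.e > a.e) { b.m *= 10; b.e--; }
    return { a.m + b.m, a.e };
}
```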

Bernardo Ramos
0

I think the answer "depends" on your particular application and its performance profile.

Let's say you have a low-latency market data environment; then using strings is frankly daft. If the information you are conveying is prices, then doubles (and binary representations of them) really are tricky to work with. Whereas, if you don't really care about performance and what you want is visibility (storage, transmission), then strings are an ideal candidate.

I would actually opt for an integral mantissa/exponent representation of floats/doubles - i.e. at the earliest opportunity, convert the float/double to a pair of integers and then transmit that. You then only have to worry about the portability of integers, and various routines (such as the hton() routines) will handle the conversions for you. Also store everything in your most prevalent platform's endianness (for example, if you're only using Linux, then what's the point of storing stuff in big endian?).
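A sketch of that mantissa/exponent split using the standard `frexp`/`ldexp` functions (exact for any finite IEEE 754 binary64 value, and it sidesteps the repeated divide/multiply loops mentioned in the comments above):

```cpp
#include <cmath>
#include <cstdint>

// Split a double into an integer mantissa/exponent pair that can be
// transmitted as two integers and rebuilt exactly.
void split_double(double d, int64_t& mant, int& exp)
{
    int e;
    double frac = std::frexp(d, &e);       // d == frac * 2^e, 0.5 <= |frac| < 1
    mant = static_cast<int64_t>(std::ldexp(frac, 53)); // scale to an integer;
                                                       // exact for binary64
    exp  = e - 53;
}

// Rebuild the original value: mant * 2^exp.
double join_double(int64_t mant, int exp)
{
    return std::ldexp(static_cast<double>(mant), exp);
}
```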

Nim
    market data is a bad example: retrieving the market data is usually more expensive than parsing a bunch of strings. It depends on your technology, but usually such things are stored in a database. – Alexandre C. Jan 19 '11 at 09:51
  • @Alex, eh? I think you may have misunderstood me, when I'm talking about low-latency environments, I'm not talking about historical data - which could be in DBs, but trading environments where every microsecond counts - in those, do you really want to add extra delay in string conversion routines? `atoi()`, `scanf()`, `sprintf()`, whatever are comparatively slow... – Nim Jan 19 '11 at 10:54
  • I think you should buy faster hardware then (ie. faster memory). String processing is quite fast CPU wise, much faster than fetching the string from memory... – Alexandre C. Jan 19 '11 at 14:02
  • @Alex, haha... you can throw more hardware at the problem, but it won't go away, you just delay the inevitable... so, if you don't process a string, you then don't have to fetch it, I'd say that's a huge saving then... ;) – Nim Jan 19 '11 at 14:55
  • Converting a string to a double is hundreds of times slower than doing arithmetic with doubles on many systems. If you're sitting on the edge of what is and isn't computationally feasible, use of string representations might easily push you over. – Stephen Canon Jan 21 '11 at 16:20
0

Found this old thread. One solution which solves a fair deal of cases is missing: using fixed point, passing integers with a known scaling factor, using built-in casts at either end. Thus, you don't have to bother with the underlying floating-point representation at all.

There are of course drawbacks. This solution assumes you can have a fixed scaling factor and still get both the range and resolution needed for the particular application. Furthermore, you convert from your floating point to fixed point at the serialization end and convert back at deserialization, introducing two rounding errors. However, over the years I have found fixed point is enough for my needs in almost all cases and it is reasonably fast too.

A typical case for fixed point would be communication protocols for embedded systems or other devices.
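A minimal fixed-point sketch (the scale factor here is illustrative; both ends must agree on it, and it determines the range and resolution available):

```cpp
#include <cmath>
#include <cstdint>

// Fixed point with an agreed scale factor: only an int32_t crosses the wire.
constexpr double SCALE = 1000.0; // illustrative: thousandths of a unit

// Convert to fixed point, rounding to the nearest step.
int32_t to_fixed(double v)
{
    return static_cast<int32_t>(std::lround(v * SCALE));
}

// Convert back at the receiving end.
double from_fixed(int32_t f)
{
    return f / SCALE;
}
```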

J Lind