
[edit 2021-09-26]

Sorry, I have to admit that I asked this badly; an explanation follows. I don't think I should post it as an 'answer', so here it is as an edit:

I'm still curious how a 'double' value of 0.1 converts to a long double!

But the focus of the question was that a spreadsheet program calculating with 'doubles' stores values in such a way that a program calculating with better precision reads them in incorrectly. I have now - only now, me blind :-( - understood that it does NOT store a 'double' binary value, but a string!

And here gnumeric makes one of the very few mistakes that program makes: it uses fixed string lengths and stores '0.1' as
'0.10000000000000001', rounded up from
'0.10000000000000000555xx'. LO Calc and Excel store - better, I think - the shortest string that survives a roundtrip 'bin -> dec -> bin' unharmed, namely '0.1'. That also works as an interchange format for programs with better precision.
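
For illustration, a minimal standard-C sketch of one way to produce such a shortest round-trip string (not necessarily how LO Calc or Excel actually do it): print with increasing precision until the string parses back to exactly the same double.

#include <stdio.h>
#include <stdlib.h>

/* Put into 'buf' the shortest "%.*g" string that parses back to exactly 'd'.
   Just an illustration of the idea - not the algorithm Excel/LO Calc use. */
static void shortest_roundtrip(double d, char *buf, size_t len)
{
    int prec;
    for (prec = 1; prec <= 17; prec++) {      /* 17 digits always suffice */
        snprintf(buf, len, "%.*g", prec, d);
        if (strtod(buf, NULL) == d)           /* survives dec -> bin? */
            return;
    }
}

int main(void)
{
    char buf[32];
    shortest_roundtrip(0.1, buf, sizeof buf);
    printf("%s\n", buf);                      /* prints "0.1" */
    return 0;
}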

So this question is settled; the problem is not 'solved', but I can work around it.

Still curious: would, and if so with which steps, a double:
0 01111111011 (1).1001100110011001100110011001100110011001100110011010
be converted to an (80-bit) long double:
0 011111111111011 1.10011001100110011001100110011001100110011001100110**10** **00000000000**
or whether, and if so with which (other) steps, it could be made into:
0 011111111111011 1.10011001100110011001100110011001100110011001100110**01** **10011001101**
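
For what it's worth, a minimal sketch of the two paths, assuming an x86 80-bit long double: the plain cast yields the first, zero-padded pattern, while parsing the shortest round-trip decimal string of the double (hard-coded here as '0.1'; in general a shortest-string search or Ryu would have to produce it) yields the second, correctly rounded pattern.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    double d = 0.1;

    /* first pattern: plain cast - the 52 mantissa bits are copied and the
       extra low-order bits are filled with zeros */
    long double ld_bits = (long double)d;

    /* second pattern: go through the shortest round-trip decimal string
       ("0.1" here) and let strtold round it to the full 64-bit mantissa */
    long double ld_dec = strtold("0.1", NULL);

    printf("%.24Lf\n%.24Lf\n", ld_bits, ld_dec);
    return 0;
}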

[/edit]

original question:

Bear with me, this question must be old, but I didn't yet find an answer ... am I blind?

The question in short:

Is there any CPU or FPU switch, command, macro, library, trick or optimized standard code snippet that does the following: convert a double to a long double value (having better precision!) while keeping the corresponding 'decimal value' rather than the 'exact but deviating' 'bit value'?

[edit 2021-09-23]

I found something which might do the job; can anyone suggest how to 'install' it and which of its functions to 'call' to use it in other programs (Debian Linux system)?

Ulf (ulfjack) Adams announced a solution for such problems (for printouts?) in his 'ryu' project 'https://github.com/ulfjack/ryu'. He commented:

'## Ryu
Ryu generates the shortest decimal representation of a floating point number that maintains round-trip safety. That is, a correct parser can recover the exact original number. For example, consider the binary 32-bit floating point number 00111110100110011001100110011010. The stored value is exactly 0.300000011920928955078125. However, this floating point number is also the closest number to the decimal number 0.3, so that is what Ryu outputs.'

(IMHO it should read 'the closest IEEE float number to')

He also announced the algorithm as 'being fast', but maybe 'fast' compared to other algorithms that compute the 'shortest' representation is not the same as 'fast' compared to computing a fixed-length string?
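
I haven't built or installed it myself; the following is only a sketch of how I assume it would be called, based on the project's README and tests: compile/link ryu's C sources and include 'ryu/ryu.h', which declares d2s(). If d2s() does not return a newly allocated string on your build, drop the free().

/* Sketch only - assumes the ryu C sources are compiled/linked in and that
   ryu/ryu.h declares d2s() as in the project's README and tests. */
#include <stdio.h>
#include <stdlib.h>
#include "ryu/ryu.h"

int main(void)
{
    char *s = d2s(0.1);   /* shortest round-trip decimal string, e.g. "1E-1" */
    printf("%s\n", s);
    free(s);              /* assuming d2s() hands back a heap-allocated buffer */
    return 0;
}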

[/edit]

Let's say I have a spreadsheet that has stored values in double format, among them values which deviate from their decimal counterparts because they are not exactly representable in binary.
E.g. '0.1': whether I keyed it in as '0.1' or entered the formula '=1/10', the stored 'value' as a 'double' will be the same:
0 01111111011 (1).1001100110011001100110011001100110011001100110011010, which is approximately 0.10000000000000000555112~ in decimal.

Now I have tuned my spreadsheet program a little; it can now work with 'long doubles'. (I really did that! It's gnumeric; don't try such a thing with MS Excel or LibreOffice Calc!) That is the 80-bit format on my system as on most Intel hardware: 1 sign bit, 15 exponent bits, 64 mantissa bits, with the leading '1' from normalization stored in the bits (not 'implicit' and 'left of' them as in 'doubles').

In a new sheet I can happily key in either '0.1' or '=1/10' and get (estimated, I couldn't test):
0 011111111111011 1.100110011001100110011001100110011001100110011001100110011001101, being 0.100000000000000000001355253~ in decimal, fine :-)

If I open my 'old' file, the 'formula' will be reinterpreted and show the more precise value, but the 'value', the '0.1', is not re-interpreted. Instead - IMHO - the bits from the double value are put into the long structure, building a mantissa like 1.1001100110011001100110011001100110011001100110011010**00000000000**,
fully preserving the rounding error from the decimal -> binary (double) conversion, and producing as decimal representation again:
0.10000000000000000555112~

[edit 2021-09-23]

Not yet fully investigated ... it looks as if in some cases the store and read work with strings, sometimes 'longer strings' that get the ...00555112~ back, and in other situations a rounded string '0.10000000000000001' is stored and the 'long' version produces 0.100000000000000010003120 when loading, which is even worse.
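
A quick way to see that difference, using nothing but strtold from the C standard library:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* what a fixed-length (rounded) stored string gives back when read
       by the long-double build ... */
    long double a = strtold("0.10000000000000001", NULL);
    /* ... versus what the shortest round-trip string gives back */
    long double b = strtold("0.1", NULL);

    printf("%.24Lf\n", a);   /* 0.100000000000000010003... */
    printf("%.24Lf\n", b);   /* 0.100000000000000000001355... */
    return 0;
}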

[/edit]

As said in the subject, it's an ambiguous situation: one can either exactly preserve the value given by the double bits, or interpret it as a 'rounded placeholder' and try to get its 'originally intended decimal value' back, but not both together. I am playing with 'keep the decimal value'; I can do that, e.g. by specific rounding (sketched below), but it's complex and costly in terms of computation effort.
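
A minimal sketch of that 'specific rounding' idea (widen first, then round the result to about 16 significant decimal digits); powl()/log10l() are themselves inexact, so this only illustrates the idea and doesn't guarantee the nearest-to-decimal result in every case:

#include <math.h>    /* link with -lm */
#include <stdio.h>

/* Round a long double that originated from a double to 16 significant
   decimal digits by scaling, rounding and scaling back.  Only a sketch:
   powl()/log10l() are inexact, so corner cases are not guaranteed. */
static long double round_to_16_digits(long double x)
{
    if (x == 0.0L) return x;
    int e = (int)floorl(log10l(fabsl(x)));    /* decimal exponent of x */
    long double scale = powl(10.0L, 15 - e);  /* keep 16 significant digits */
    return roundl(x * scale) / scale;
}

int main(void)
{
    long double ld = (long double)0.1;            /* zero-padded bit copy */
    printf("%.24Lf\n", round_to_16_digits(ld));   /* ~0.100000000000000000001355 */
    return 0;
}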

Over the last weeks I have come to see the IEEE, CPU and library developers as highly skilled people who have wisely foreseen and implemented solutions for similar problems, so:

Is there any 'standard' method, CPU, FPU or compiler switch, or optimized code snippet doing such?

Converting a double to a long double value (having better precision!) and keeping the corresponding decimal value instead of the deviating 'bit value'?

If 'no', has anyone delved deeper into this issue and has any good tips for me?

best regards,

b.

  • A `double` does not have a “corresponding decimal value”. There is no information in a `double` that says the user originally typed in “0.1” and not “0.1000000000000000055511151231257827021181583404541015625”. If you want to add some assumption, such as that the user never typed more than ten significant digits, then converting the `double` that resulted from it to the `long double` that would result from the same numeral, then the solution is easy: Convert the `double` to decimal with ten significant digits (e.g., in C, `sprintf` with `%.10g`), then convert to `long double` (`strtold`). – Eric Postpischil Sep 21 '21 at 12:14
  • However, that assumption will be wrong. Users sometimes enter longer numerals. – Eric Postpischil Sep 21 '21 at 12:14
  • Thanks @Eric, 'sprintf and strtold' - am I right that this is 'string math' and quite costly regarding performance? 'Rounding' would be faster? The question is whether there is anything even better. 'sometimes enter ...' - yes, of course, but I can be sure that they didn't type '0.10000000000000000555112' for a double, or that if they did ... the sheet / conversion didn't accept it, calculated everything below 0.1~125xxx to '0.10~0000000' and substituted that with the 'nearest' 0.1~555111... and with that conclusion I can cut off the overshoot; the question is which is the best way ... – user1018684 Sep 21 '21 at 13:35
  • I have to add another point ... I think I remember that the 'decimal value' of a float, double etc. is (given that there are multiple, probably infinitely many, longer strings doing the same) 'the **shortest** decimal string producing that same binary when converted back to binary representation'??? In that sense a binary value has a corresponding decimal value (one!, at most two for rare cases of exact midpoints, for which IEEE defaults to the binary even ('0' as last digit), thus it's only one) 'corresponding decimal', and everything like '0.10~0055xx' or similar would be wrong. ??? – user1018684 Sep 22 '21 at 17:16
  • Finding the decimal numeral nearest a binary floating-point number and vice versa is a complicated problem. It is “simple” in that it can be done with elementary school mathematics, just carrying out digits to as many decimal places as needed. However, since numbers in the `double` format can exceed 10^308, that can require hundreds of digits. So good modern binary-to-decimal and decimal-to-binary routines use advanced algorithms published in academic papers. Researchers have worked out ways to work with numbers like, for example, 1.23456789e308 without computing everything from scratch… – Eric Postpischil Sep 23 '21 at 13:00
  • … Those algorithms are built into good libraries, such as the `printf` and `scanf` in the standard C library. Those libraries work with character forms of decimal because they are intended for input and output to character formats. I am not aware of implementations that directly take or produce a decimal floating-point format. Perhaps there are some if you look around. Failing that, the way to implement an efficient decimal-to-binary conversion would be to take those published algorithms and implement them yourself, producing decimal output instead of decimal character output and so on. – Eric Postpischil Sep 23 '21 at 13:03
  • For the purpose you describe, reading old spreadsheet data and converting it to a new format, converting through character formats should suffice. That is because it is a one-time or occasional process, so it does not need high performance. Unless you have many thousands, perhaps millions, of spreadsheets to convert, then the resources invested in implementing new routines will exceed the cost of converting by using the existing `sprintf` and `strold` routines. – Eric Postpischil Sep 23 '21 at 13:05
  • Hello @Eric, thanks for your input. You are right, it's 'not easy' - 'modern routines': maybe I found something, see the edit in the question - 'printf and scanf': I'll look, I'm just not aware where it's computed in gnumeric :-( - 'decimal vs. decimal character': I'm not searching for 'decimal output', but for a bin value preserving the 'decimal value' as the 'nearest representation' - I have two scenarios to think over: 'double' programs could write 'shortest' and thus less confusing output, and/or 'long' programs could improve in interpreting the input, being aware that it originates from 'doubles'. – user1018684 Sep 23 '21 at 14:54
  • It doesn't answer your question directly, but when you truly need decimal arithmetic, without the anomalies induced by decimal-to-binary conversions and back, there are decimal floating-point formats [`decimal32`](https://en.wikipedia.org/wiki/Decimal32_floating-point_format), [`decimal64`](https://en.wikipedia.org/wiki/Decimal64_floating-point_format), [`decimal128`](https://en.wikipedia.org/wiki/Decimal128_floating-point_format) that use base 10 internally. gcc, at least, is starting to have support for these. – Steve Summit Sep 24 '21 at 12:47
  • @Steve Summit: :-) again you made my day! (positive!) ROFL! and can't help out. 'is starting to have support ...' !!! - IEEE 854 is about 34 years old, and simply ignored in 'the flow of binary evolution'. I tried asking for / proposing it, answers: 'no, NO, NONO, NO!NO!NO!NO! never!, performance!!!, support from compilers and libraries?!?, COMPATIBILITY!!! NO!!! NO!!! NO!!! ... or maybe in a very distant future'. my 'project' is not! about measuring distance to the moon in micrometers, but giving all of us '=0.1+0.2' -> '0.3' back, for all decimals, reliable math, calculated with floats. – user1018684 Sep 26 '21 at 08:07

1 Answer


I don't think I can give you a definitive answer to your question(s), but there's more to say than will fit in comments, so here goes.

still curious: would, and if yes with which steps will a double: 0 01111111011 (1).1001100110011001100110011001100110011001100110011010 be converted to a (80-bit) long double: 0 011111111111011 1.10011001100110011001100110011001100110011001100110**10** **00000000000**

If I'm understanding your question correctly, I believe the answer is: "always". As far as I know, when converting a floating-point type to another floating-point type with more precision, the extra precision is always filled in with 0. I tested that hypothesis with this program:

#include <stdio.h>

/* An x86 80-bit long double has 10 significant bytes; sizeof(long double)
   may report 12 or 16 because of padding, so dump only these 10. */
#define LDSIZE 10
typedef unsigned char uchar;

int main()
{
    int i;
    float f = 0.1;
    double d1 = f;
    double d2 = 0.1;
    long double ld1 = f;
    long double ld2 = d1;
    long double ld3 = d2;
    long double ld4 = 0.1L;

    printf("  f = %.30f\n", f);
    printf(" d1 = %.60f\n", d1);
    printf(" d2 = %.60f\n", d2);
    printf("ld1 = %.72Lf\n", ld1);
    printf("ld2 = %.72Lf\n", ld2);
    printf("ld3 = %.72Lf\n", ld3);
    printf("ld4 = %.72Lf\n", ld4);

    printf("\n");

    printf("  f = ");
    for(i = sizeof(float)-1; i >= 0; i--) printf("%02x", ((uchar *)&f)[i]);
    printf("\n");

    printf(" d1 = ");
    for(i = sizeof(double)-1; i >= 0; i--) printf("%02x", ((uchar *)&d1)[i]);
    printf("\n");

    printf(" d2 = ");
    for(i = sizeof(double)-1; i >= 0; i--) printf("%02x", ((uchar *)&d2)[i]);
    printf("\n");

    printf("ld1 = ");
    for(i = LDSIZE-1; i >= 0; i--) printf("%02x", ((uchar *)&ld1)[i]);
    printf("\n");

    printf("ld2 = ");
    for(i = LDSIZE-1; i >= 0; i--) printf("%02x", ((uchar *)&ld2)[i]);
    printf("\n");

    printf("ld3 = ");
    for(i = LDSIZE-1; i >= 0; i--) printf("%02x", ((uchar *)&ld3)[i]);
    printf("\n");

    printf("ld4 = ");
    for(i = LDSIZE-1; i >= 0; i--) printf("%02x", ((uchar *)&ld4)[i]);
    printf("\n");
}

The output was:

  f = 0.100000001490116119384765625000
 d1 = 0.100000001490116119384765625000000000000000000000000000000000
 d2 = 0.100000000000000005551115123125782702118158340454101562500000
ld1 = 0.100000001490116119384765625000000000000000000000000000000000000000000000
ld2 = 0.100000001490116119384765625000000000000000000000000000000000000000000000
ld3 = 0.100000000000000005551115123125782702118158340454101562500000000000000000
ld4 = 0.100000000000000000001355252715606880542509316001087427139282226562500000

  f = 3dcccccd
 d1 = 3fb99999a0000000
 d2 = 3fb999999999999a
ld1 = 3ffbcccccd0000000000
ld2 = 3ffbcccccd0000000000
ld3 = 3ffbccccccccccccd000
ld4 = 3ffbcccccccccccccccd

It's pretty clear that, in every case, the "new" low-order bits after an upcast are always 0. Only when a variable (f, d2, or ld4) gets a "fresh" initialization from 0.1 does it receive full precision (and for long double, this has to be "0.1L").

or if, and if with which (other) steps it could be made to: 0 011111111111011 1.10011001100110011001100110011001100110011001100110**01** **10011001101**

As far as I know, the answer is "never". There's simply no information which would enable the conversion to fill in anything other than 0.
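
If the second pattern is what's wanted, the practical workaround is to go back through a decimal string. Here's a small follow-up sketch to the program above: find the shortest string that round-trips back to d2, then parse it with strtold; that should reproduce ld4's bytes (3ffbcccccccccccccccd) rather than ld3's zero-padded ones.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    double d2 = 0.1;
    char buf[32];
    int prec, i;

    /* find the shortest decimal string that round-trips back to d2 ... */
    for (prec = 1; prec <= 17; prec++) {
        snprintf(buf, sizeof buf, "%.*g", prec, d2);
        if (strtod(buf, NULL) == d2) break;
    }

    /* ... and parse that string with long double precision */
    long double ld5 = strtold(buf, NULL);

    printf("ld5 = %.72Lf\n", ld5);
    printf("ld5 = ");
    for (i = 9; i >= 0; i--) printf("%02x", ((unsigned char *)&ld5)[i]);
    printf("\n");
    return 0;
}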

LO Calc and Excel store - I think better - the shortest string that survives a roundtrip 'bin -> dec -> bin' unharmed, namely '0.1'

Investigating what Excel and other spreadsheets do sounds like a very promising line of inquiry. I've been wondering about Excel, myself. (But what's "LO Calc"? Oh, you must mean LibreOffice Calc.)

If I'm understanding correctly, "the shortest string that survives a roundtrip 'bin -> dec -> bin' unharmed" is what you said Ulf Adams's "Ryu" tries to do. That sounds like something I'll want to investigate, as well.


On the foundational question of whether to do floating-point in decimal instead of binary, you wrote in a comment that you had received a chorus of

'no, NO, NONO, NO!NO!NO!NO! never!, performance!!!, support from compilers and libraries?!?, COMPATIBILITY!!! NO!!! NO!!! NO!!! ... or maybe in a very distant future'.

I think here we may have to admit that, whether we like it or not, the rest of the world might be right, and we might be wrong. The rest of the world has been working very hard on binary floating-point for quite some years now, while decimal has been taking a back seat, and I doubt this has been a mistake; there must be some good reasons for it.

The "History" section of the Wikipedia article on IEEE 754-1985 is an interesting read, I find. I can remember working with the VAX floating-point formats in the 1980s, and I was vaguely aware that there was something Intel was doing that was different, but I never realized there were significant technical and philosophical debates over things like "gradual underflow" that lasted for years, and I don't think I knew that what started with the Intel 8087 coprocessor ended up being IEEE-754. But my point is that there was huge attention being paid to these formats, so I really don't think the choice of binary over decimal was an accident. I assume that efficiency, ease of implementation, and compatibility with binary integer arithmetic were the main concerns, although there may have been others as well.

[Footnote: The preceding paragraph may have implied that the difference between VAX floating and Intel/IEEE-754 was decimal vs. binary, but no, of course VAX floating was binary also.]

my 'project' is not! about measuring distance to the moon in micrometers,

But that's part of the answer, of course. Nobody can measure the distance to the moon in micrometers. It's a meaningless concept. There's always inaccuracy in real-world measurements — often multiple kinds of inaccuracy mooshed together. In the case of long distances, we have the orthogonal issues of (a) we may not have a good enough ruler to make the measurements and (b) the distance may change as a function of time and (c) it's usually not even obvious how to define the distance measured: is it from center-of-mass to center-of-mass, or from nearest point (highest mountain) to nearest point, or what?

And since there's inaccuracy in just about every real-world measurement, it Just Doesn't Matter if there's inaccuracy in finite-precision floating-point arithmetic, or incongruities when converting back and forth between decimal and binary.

(My favorite example of an unmeasurable distance is the distance from New York to Los Angeles. You can't even measure it in miles, let alone inches or micrometers. Besides the questions above, there's also (d) are you trying to measure a straight line through the earth, or great circle distance following the curvature of the earth, or the distance you'd travel if you walked a straight line, counting every mountain and valley you had to climb and descend, or shortest distance traveling actual roads, or what?)

but giving all of us 0.1 + 0.2 -> 0.3 back, for all decimals, reliable math

At the end of the day, though, I think there are only two classes of programmers who care about getting 0.1 + 0.2 == 0.3:

  • people writing accounting software using dollars and cents (or any other decimal currency)
  • beginning programmers who are learning about floating point for the first time

People writing accounting software have all learned to work in pennies (or, stated another way, to roll their own fixed point arithmetic, not floating point).
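
(In C the 'pennies' approach is as simple as it sounds; a deliberately trivial sketch:)

#include <stdio.h>

int main(void)
{
    /* fixed point in integer cents: $0.10 + $0.20, exactly */
    long cents = 10 + 20;
    printf("$%ld.%02ld\n", cents / 100, cents % 100);   /* prints $0.30 */
    return 0;
}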

Beginning programmers are just taught that "floating point is always imprecise" (even though that's arguably a misleading oversimplification).

Everybody else doesn't care about perfect precision (certainly not perfect decimal precision), and of course everybody does care about efficiency, and the hardware manufacturers all seem to be focusing on binary in pursuit of that, so decimal gets left out in the cold.

And it really is important to be clear-eyed about the precision issue: although precision loss in floating-point arithmetic is a hugely important problem, that people can devote their entire careers to (devising cool and non-obvious workarounds like the "Kahan summation algorithm"), for most real-world programs, the discrepancies introduced by binary ↔ decimal conversions just don't end up being the problem, I think.
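
(Since Kahan summation came up, here is a minimal sketch of it, just to show the kind of workaround meant: a compensation variable carries the low-order bits that each addition would otherwise lose. The inputs in main are chosen so the effect shows up with plain double arithmetic.)

#include <stdio.h>

/* Kahan (compensated) summation: c carries the rounding error lost by
   each addition and feeds it back into the next one. */
static double kahan_sum(const double *x, int n)
{
    double sum = 0.0, c = 0.0;
    int i;
    for (i = 0; i < n; i++) {
        double y = x[i] - c;
        double t = sum + y;
        c = (t - sum) - y;      /* the part of y that didn't make it into t */
        sum = t;
    }
    return sum;
}

int main(void)
{
    double x[3] = { 1e16, 1.0, 1.0 };
    double naive = x[0] + x[1] + x[2];                      /* both 1.0s lost */
    printf("naive - 1e16 = %g\n", naive - 1e16);            /* 0 with plain double arithmetic */
    printf("kahan - 1e16 = %g\n", kahan_sum(x, 3) - 1e16);  /* 2 */
    return 0;
}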

Although, I suppose, I made my own oversimplification above: there's probably a third class of programmers who might care about 0.1 + 0.2 ≟ 0.3, and that is:

  • programmers writing general-purpose spreadsheet or calculator programs

And it sounds like maybe this includes you...

Steve Summit
  • Hi @Steve, nice to read from you, I like your style and sharp analysis, you are one of the "good guys". > 'every measurement in the real world' - yes, we are used to that, but not to inaccuracies in mathematics! Reading on the internet, I see many problems with "irritated users" who are not familiar with FP inaccuracy, and every day new ones are born who grow into this problem. In my opinion it is not "efficient" to let so many of them run into problems and then send them on a painful "learning curve". IMHO 0.3000000000004 is not a problem in itself, but 0.1 + 0.2 != 0.3 is, ... – user1018684 Sep 29 '21 at 17:02
  • and (IMHO) it's quite easy to get rid of that; 0.3000000000004 needs to be taken out of the game ... > 'programmers writing general purpose spreadsheets' - IMHO these have invested lots of effort to get these things right, unfortunately mostly with questionable concepts and by making things worse outside their current focus. But it is very, very hard to talk to them from a 'newbie wishful thinking' POV; they are a.: 'young, naive, and arrogant' or b.: 'old and frustrated' or c.: on the path from a. to b. That's somehow 'systemic stability', or 'tradition'. – user1018684 Sep 29 '21 at 17:05
  • hello @Steve Summit, > 'As far as I know, the answer is "never".' - would you mind trying: `long double ld5 = d2 * pow( 10, 16 );` `ld5 = ( ld5 + 0.5 );` `ld5 = (long int)( ld5 );` `ld5 = ld5 / pow( 10, 16 );` – user1018684 Sep 30 '21 at 16:02
  • Hello @Steve Summit, not 'objecting' but just widening the scope: the Nobel-prize-winning gravitational wave experiment with the 'LIGO detectors' is said to measure a 4 km distance to 10^-18 m precision; that's a relative precision of about 2.5E-22 and exceeds the precision of doubles and even 80-bit long doubles (it will fit into a quad and doesn't exceed the span from 1 to the biggest float!). It exceeds by far the precision of 1/1000 hair vs. earth-moon distance. I doubt they calculated with Excel, Calc or gnumeric ;-) ... but keep in mind: the award is honor and about 1,000,000 bucks, thus there definitely is some value in precision. – user1018684 Oct 01 '21 at 13:09
  • And one more, then it's good for today. @Steve said: 'the rest of the world might be right, and we might be wrong' - I don't say the decision for floats was wrong; what I say is: 1. what modern spreadsheets calculate is far from IEEE capabilities :-( 2. the actual understanding of IEEE capabilities is far from what's really possible :-) 3. with a little smarter design, IEEEs would have caused fewer headaches and their potential would have opened up much sooner. – user1018684 Oct 01 '21 at 13:28