
Firstly, I realise that most base-10 fractions cannot be expressed precisely in base 2, so my question isn't really about the usual deficiencies of floating-point arithmetic.

I am trying to write a function that attempts to correct a double tainted by cumulative rounding error, for display purposes only: it checks whether the last 6 meaningful digits are within some tolerance of a supposed exact value and, if so, changes the double to the next representable above that exact value (unless the value is an integer or a power of two).

One component of my function surprises me, though: the output of exp10. As far as I'm aware, as long as the spacing between adjacent doubles is less than 2, integer values stored as doubles should be exact; and though 10^14 is pushing it, it should be an exact integer (since 10^14 ≈ 2^46.507 < 2^53). However, this is not what my testing shows.

An excerpt of my debugging efforts (nothing stands out as obvious) and its output follows:

#define _GNU_SOURCE   /* exp10 is a GNU extension */
#include <math.h>
#include <stdio.h>

int main(void)
{
    double test = 0.000699;
    double tmp = fabs(test);
    double exp = 10.0 - floor(log10(tmp));
    double powTen = exp10(10.0 - floor(log10(tmp)));
    double powTen2 = exp10(exp);
    double powTen3 = exp10((int)exp);
    double powTen4 = exp10(exp);
    double powTen5 = pow(10, exp);

    printf("exp: %.16lf\n", exp);
    printf("powTen: %.16lf\n", powTen);
    printf("powTen2: %.16lf\n", powTen2);
    printf("powTen3: %.16lf\n", powTen3);
    printf("powTen4: %.16lf\n", powTen4);

    //these two are exact
    printf("10^14: %.16lf\n", exp10(14));
    printf("powTen5: %.16lf\n", powTen5);
    printf("exp == 14.0: %d\n", exp == 14.0);
    return 0;
}

output:

exp: 14.0000000000000000
powTen: 100000000000000.1250000000000000
powTen2: 100000000000000.1250000000000000
powTen3: 100000000000000.1250000000000000
powTen4: 100000000000000.1250000000000000
10^14: 100000000000000.0000000000000000
powTen5: 100000000000000.0000000000000000
exp == 14.0: 1

pow gets the answer exact, as does exp10 with a hardcoded int. In all other cases I am getting an extra 1/8 (the spacing between 10^14 and the next representable double is 1/64). The documentation says that exp10 should be equivalent to pow. Can anyone see something I'm missing?

Edit: with -O3, -O2, or -O1 optimisation I get the expected outputs, unless the data cannot be known until runtime; at that point exp10 is still misbehaving.

HexedAgain
    [What Every Computer Scientist Should Know About Floating-Point Arithmetic](http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html) – Bryan Chen May 14 '14 at 23:03
    Thanks, I have already been following that article, but this behaviour of exp10 is not correct - unless my usage of it is incorrect. I'm not asking why 0.6 looks like 0.5999999999... + junk, or why 0.3 - 0.2 - 0.1 != 0.0, and so on and so forth... I am asking why what *can* be represented exactly as an integer is not being represented as such with exp10 but *is* with pow – HexedAgain May 14 '14 at 23:04
    `exp10(14)` is probably being evaluated by the compiler, which may have different rounding settings. Can't explain the others. – Ben Voigt May 14 '14 at 23:18
    BTW, do print the result of `exp == 14.0` please – Ben Voigt May 14 '14 at 23:19
  • good idea Ben ... but unfortunately I am getting the result 1 (I would have been happy if it was 0!) :[ It is an interesting point about the compiler - I shall play around with optimisation flags – HexedAgain May 14 '14 at 23:21
  • and behold - with -O3 I am getting the exact answers! ... grrr :] – HexedAgain May 14 '14 at 23:23
    Since these are all compile-time constants, with optimization they're probably all computed during compilation. – Ben Voigt May 15 '14 at 00:20
  • This would seem to be correct: looking at the assembly, when everything can be computed at compile time the optimised code is identical; when "test" takes its value from command-line arguments it differs, and the result is still incorrect in precisely the same way, only with exp10 – HexedAgain May 15 '14 at 03:24
    @BryanChen: What every computer scientist should know about floating-point arithmetic, but, for whatever reason, you don't? – tmyklebu May 15 '14 at 04:06
  • Even though integers can be exactly represented in a double, the result of the computation isn't guaranteed to be an integer! Some operations are going to be more accurate than others. – Mark Ransom May 15 '14 at 15:33
  • @BryanChen Where in the document you link does it say that `exp10(14.0)` should be anything other than `100000000000000.0`? – Pascal Cuoq May 15 '14 at 17:24

1 Answer


It is probable that your exp10 implementation is misbehaving. Note that the results it's returning are sometimes off by several ulps: the excess of 0.125 is 8 ulps relative to your 10^14, where one ulp is 1/64.

This is a rather heinous error; you've got a case where the correct answer is representable as a double yet exp10 isn't doing so.

I'd echo Ben Voigt's comment that the compiler may sometimes evaluate these calls itself instead of passing them off to the math library. It probably does a better job, since it likely uses an arbitrary-precision math library for constant folding. You might experiment with the -fno-builtin option to see whether that changes anything.

Unfortunately, I don't think crlibm has implemented exp10. Otherwise I'd recommend you just use that and stop worrying.

EDIT: The copy of eglibc source I have seems to implement exp10 thus:

double
__ieee754_exp10 (double arg)
{
  /* This is a very stupid and inprecise implementation.  It'll get
     replaced sometime (soon?).  */
  return __ieee754_exp (M_LN10 * arg);
}

Don't expect this to work well.

tmyklebu
  • Ditto for [glibc](https://sourceware.org/git/?p=glibc.git;a=blob;f=math/e_exp10.c;h=5197d491d54c99150944c70222dfd03c8c37a73b;hb=HEAD). This is probably the cause of the problem. – T.C. May 15 '14 at 04:46
  • A 1 ulp error is "heinous"? How would you have implemented this function to deliver perfect results? – Mark Ransom May 15 '14 at 15:36
  • @MarkRansom: generally speaking, a well-behaved math library tries to deliver results with error < 1ulp for simple functions like `exp`, `log`, etc, even though that isn't required by any standard. This guarantees that cases that have an exact answer produce that answer, which makes it much easier for programmers to use them. (I wouldn't go so far as to call this "heinous", however, just "sloppy"). – Stephen Canon May 15 '14 at 16:10
  • crlibm does not have `exp10()`, but it has `pow()`, which lets you compute a correctly rounded `pow(10, y)` for any double `y` you wish. It is certainly more expensive (just recognizing the exact cases for `pow()` is a mess), but the OP didn't say that performance was an issue. If performance were an issue, the OP could re-use the various tools that helped build CRlibm (Sollya is one) to build their own function, but this may not yet be as easy as it is intended to become. – Pascal Cuoq May 15 '14 at 16:50
    @MarkRansom 0.5 ULP is “perfect”. 1 ULP leaves plenty of space for intermediate results to each be slightly wrong, as long as you can compute them with slightly extended precision. If I had to implement `exp10` to 1 ULP, I would call a bad quality `exp10l()` (wrong by a few ULPs) and round the result to `double`. If I had to implement `exp10`, it would be much harder. I would have to start by looking up where someone who understands Sollya lives, then kidnap their dog. It would be messy. – Pascal Cuoq May 15 '14 at 16:55
    @MarkRansom: I wouldn't go for perfect results, but a 1 ulp error really is heinous. How I'd implement `exp10` to get sub-ulp errors: Find a good approximating polynomial on `[1,2]` to quad precision, evaluate it to extended precision using `double double` techniques, handle the exponent (as a `double double`), and multiply the two together (again as a `double double`). Maybe I'd do a similar trick with the high few bits of the significand as with the exponent. Done right, you can get a guarantee of a maximum of, say, 0.51 ulp. – tmyklebu May 15 '14 at 19:36
    You people and your wild overkill. double double? quad? bah. `exp10` has spectacularly well behaved series expansions, and you can easily deliver sub-ulp accuracy via "Gal's accurate tables" and a Chebyshev, Carathéodory-Fejér, or minimax polynomial on the residual evaluated in native double precision. Try it, it's a fun way to spend an afternoon. – Stephen Canon May 15 '14 at 21:38
  • (If you’re going to use head-tail arithmetic, you might as well go all the way and deliver correct rounding; I expect you only need ~150 bits for the worst case of exp10). – Stephen Canon May 15 '14 at 21:43
    @StephenCanon: Consider me nerdsniped. – tmyklebu May 16 '14 at 16:29
  • Updated it. The cases near infinity are one gotcha. It also seems that you have to be careful how you find the polynomials; here I integrated a minimax polynomial for the derivative instead of finding a minimax polynomial for 10^x. – tmyklebu May 17 '14 at 00:50
  • @tmyklebu: another common approach is to find a minimax fit of `(10**x - 1)/x`. – Stephen Canon May 17 '14 at 23:49
    @MarkRansom I wasn't going to take up the gauntlet, but somehow I managed to trick myself into implementing a nearly-0.5ULP-accurate single-precision sine function using only single-precision computations: http://stackoverflow.com/a/23726810/139746 – Pascal Cuoq May 18 '14 at 21:25
    FYI, the worst case of exp10 in double precision (binary64) in round-to-nearest is: exp10(1.A83B1CF779890P-26), whose exact result is 1.000000F434FAAP0:01[60]0101... (where what is after the colon is the sequence of bits following the truncated 53-bit significand, [60] meaning that the preceding bit 1 is repeated 60 times). So, one needs an accuracy of about 114 bits. Full results on: https://www.vinc17.net/research/slides/tamadi2010-10.pdf – vinc17 Jun 28 '14 at 00:33