
The following C function is from the fastapprox project:

#include <stdint.h>  /* needed for uint32_t */

static inline float
fasterlog2 (float x)
{
  union { float f; uint32_t i; } vx = { x };
  float y = vx.i;
  y *= 1.1920928955078125e-7f;
  return y - 126.94269504f;
}

Could some experts here explain why the exponent bias used in the above code is 126.94269504 instead of 127? Is it a more accurate bias value?

Astaroth
  • Compiler warning from `float y = vx.i;`: "possible loss of data". – Weather Vane Jun 26 '15 at 20:37
  • Why not ask someone who wrote it? It's not really a coding question, it's more of an algorithm question, isn't it? – Dan Jun 26 '15 at 20:39
  • @Dan - Yes, but C developers are the ones who can read the C code, and they usually care about such low-level knowledge. According to my googling, there are already a lot of C projects using functions implemented in fastapprox, so the answers to this question may also be useful for C developers. Also, why is asking the author the only way to get the question answered? Here, I might get quicker responses from different experts. – Astaroth Jun 26 '15 at 21:17

2 Answers


In the project you linked, the authors include a Mathematica notebook explaining their algorithms, including the "mysterious" -126.94269 value.
If you need a viewer, you can get one from the Mathematica website for free.

Edit: Since I'm feeling generous, here's the relevant section in screenshot form.

Simply put, they explain that the value is "simpler, faster, and less accurate".
They're not using -126.94269 in place of -127; they're using it in place of the result of the following calculation, where mx is the mantissa of x mapped into [0.5, 1.0) (values rounded for brevity):

-124.2255 - 1.498 * mx - (1.72588 / (0.35201 + mx))
Mr. Llama
  • I love it that they use the numbers 1.1920928955078125e-7 and 126.94269504, with 17 and 11 digits' worth of precision, respectively, and then stick an `f` at the end to make them floats! – Steve Summit Jun 26 '15 at 21:20
  • @SteveSummit that's 17 decimal digits; perhaps they are exactly representable in IEEE754 – M.M Jun 26 '15 at 21:38
  • @Mr_Llama, I see, thank you very much. It turns out that I was hurriedly reading their website. I didn't notice there was such a Mathematica Notebook. – Astaroth Jun 26 '15 at 21:46
  • When considering all 2^32 possible `float` encodings, and discarding the ones whose `log2` results are `NAN` or `INF`, the `126.94269504` value is clearly better than `127.0`, with an average absolute error of `0.0129` compared to `0.0303`. This value is not optimal for reducing average error, however - using `126.940` instead further reduces the error to `0.0128`. My guess is that the author is using a different metric than "average error" for optimizing the value. – MooseBoys Jun 26 '15 at 22:02
  • Ah, sure enough, the author added a fudge factor so that results for inputs near powers of two are more accurate, at the expense of overall accuracy. – MooseBoys Jun 26 '15 at 22:12
  • @MooseBoys While I haven't tested myself if this is what they did, I'll point out that a common metric is "worst error" rather than average. That way you can specify your function as +/- X accuracy. – kylefinn Mar 31 '20 at 00:09

Well, no, 126.94269504 is not a "more accurate" bias value. This code is doing something very, very strange; I'm pretty surprised it works at all. It takes the bits of a float as if they were an int (which in my experience usually gives you a totally garbage value, but maybe not), then takes that "garbage" int value and converts it back to a float, then does some math on it. This is, as they say, a fast and approximate way of doing something, in this case, taking the base-2 log. It shouldn't work at all, but the difference between 127 and 126.94269504 is evidently just one of several goofy fudge factors which are intended to salvage some meaning from what ought to be meaningless code. (Sort of a "two almost wrongs make an almost-right" kind of thing.)

If you want to extract exactly the mantissa and exponent of a float (though this will be neither as fast nor as approximate), the usual way to do it is with the frexpf function.

Steve Summit
  • -1; it's not garbage if you rely on a particular encoding (namely [IEEE754](https://en.wikipedia.org/wiki/Single-precision_floating-point_format)), and the "some math" it's doing is very deliberate. – MooseBoys Jun 26 '15 at 20:56
  • The exponent part of a floating-point number is logarithmic, while the mantissa part is linear. The code simply seems to apply a linear best fit approximation, where the y-axis offset of the linear equation is combined with the exponent bias. – njuffa Jun 26 '15 at 20:58
  • It is simultaneously an insane and a brilliant technique. As I said, I'm surprised it works at all, but I tried it, and I can see that it does. I can even understand how it works, sort of, but the end result still leaves me with a feeling of, oh, having called `free` twice on the same pointer and having it approximately sort my data for me, or something! – Steve Summit Jun 26 '15 at 21:02