Why doesn't the double precision format simply double the bits in each field, rather than only doubling the fraction bits? Also, what is the hidden bit and why is it used?
- There are no hidden bits in the spec. And you really want two sign bits? – Deduplicator Nov 18 '14 at 01:07
- @Deduplicator He means the implicit 1 bit that is assumed for normalized values, i.e., when the unbiased exponent is in the range [-126, 127]. That is, 2^(e-bias) * 1.mmmmmmmmm, where that 1 is the hidden bit. – Iwillnotexist Idonotexist Nov 18 '14 at 01:08
- For one, there's no need to double the sign bit. For another, the exponent in binary32 is for all intents and purposes between -126 and +127, allowing magnitudes up to about 10^38. Doubling the width of the exponent field from 8 to 16 bits would raise the maximum exponent from 127 to 32767, allowing magnitudes near 10^9864; a value that large is not meaningful, and those top exponent bits would never be used in practice. This is why fewer new bits are given to the exponent and more to the mantissa, where they are more sorely needed. – Iwillnotexist Idonotexist Nov 18 '14 at 01:14
- It is a waste of space to include a bit that doesn't actually have to be there. As for the other questions, exponent vs. mantissa is a tradeoff; perhaps they had to compromise on single, but on double they went with a balance they preferred. – old_timer Nov 18 '14 at 01:15
- As for why 11 bits were chosen, an interview with Kahan (he of Intel 8087 fame) [here](http://www.cs.berkeley.edu/~wkahan/ieee754status/754story.html) reads in part: _Quickly, the choice narrowed down to two proposals. The existing DEC VAX formats, inherited from the PDP-11, had the advantage of a huge installed base. But DEC's original double precision 'D' format had the same eight exponent bits as had its single precision 'F' format. This exponent range had turned out too narrow for some double precision computations._ – Iwillnotexist Idonotexist Nov 18 '14 at 01:53
- _DEC reacted by introducing its 'G' double precision format with an 11 bit exponent that had served well enough in CDC's 6600/7600 machines for over a decade; K-C-S had chosen that exponent range too for its double precision._ – Iwillnotexist Idonotexist Nov 18 '14 at 01:54
- Can the close votes please be withdrawn? The questions are clear and there exists a documented reason for the selection of the exponent ranges. A nicer form of the interview with Kahan: http://www.dr-chuck.com/dr-chuck/papers/columns/r3114.pdf – Iwillnotexist Idonotexist Nov 18 '14 at 02:31
1 Answer
Your assessment isn't quite true:
- IEEE754 16-bit float: 1 sign bit, 5 exponent bits, 10(+1) significand bits, exp ∈ [-14, 15]
- IEEE754 32-bit float: 1 sign bit, 8 exponent bits, 23(+1) significand bits, exp ∈ [-126, 127]
- IEEE754 64-bit float: 1 sign bit, 11 exponent bits, 52(+1) significand bits, exp ∈ [-1022, 1023]
- IEEE754 80-bit float: 1 sign bit, 15 exponent bits, 64(+0) significand bits, exp ∈ [-16382, 16383]
So nobody is quite getting doubled. More precision is presumably more useful than a wider range. Recall that the range of representable values increases ... exponentially in the size of the exponent.

Kerrek SB
- 1) Is there really an IEEE754 80-bit float? There are [binary and decimal](http://en.wikipedia.org/wiki/IEEE_floating_point#Basic_and_interchange_formats) formats, although I know of no implementations using the decimal ones. – chux - Reinstate Monica Nov 18 '14 at 02:23
- @chux IEEE754 does not specify an 80-bit float format. It specifies only the basic formats binary32, binary64, binary128, decimal64 and decimal128. However, it does explicitly allow IEEE754 implementations to extend the standard with extended and extendable types. – Iwillnotexist Idonotexist Nov 18 '14 at 02:28