
I'm stuck on a homework assignment; I need to convert a binary float to a decimal fraction. I feel like I understand the process, but I'm not getting the right answer. Here's my thought process.

I have the binary float: 0 000 101

  • The bias for a 3-bit exponent field is 3: 2^(3-1)-1 = 3
  • The mantissa becomes 1.101 (base 2)
  • The value of the exponent bits, 0, minus the number of exponent bits, 3, is -3, so the decimal of the mantissa gets moved left 3 places
    0.001101
  • In base-10, that is 2^-3 + 2^-4 + 2^-6, which equals 0.203125 or 13/64.

However, 13/64 is not the correct answer; the auto-grader doesn't accept it. If my answer is wrong, I don't understand why, and I'm hoping someone can point me in the right direction.

By pure luck I guessed 5/32 as the answer and got it correct; I have no idea why that's the case.

Peter Cordes
Grav
  • what does it say the right answer is? – old_timer Sep 09 '17 at 01:54
  • another way to look at this 0.001101 right or wrong is the same as 1101/2^6 or 13/64. so your exponent must be wrong? knowing the right answer should sort that out yes? – old_timer Sep 09 '17 at 01:55
  • By pure luck I guessed 5/32 as the answer and got it correct; I have no idea why that's the case – Grav Sep 09 '17 at 02:59
  • Perhaps the format doesn't have a hidden leading 1 bit? Then the mantissa is 1.01 and the value is 1.01 x 2^-3 which is 5/32. – prl Sep 09 '17 at 03:41
  • Did your homework assignment tell you that this was an IEEE754 style of format? IDK if there are others that use a biased exponent but have a different meaning for denormals. (Your question already makes it clear that you are using a biased exponent the way IEEE754 does, so it can't be a completely different format.) Fun fact: C doesn't require that FP types use IEEE formats at all. But you tagged this `[assembly]`, so I guess it's for a teaching ISA with 7-bit IEEE-style floats? That's odd, why not 8-bit IEEE (https://en.wikipedia.org/wiki/Minifloat)? – Peter Cordes Sep 09 '17 at 03:53
  • We have two different assignments, one for 7 bit floats and one for 8 bits. I figured I'd start with the 7 bits. The homework doesn't mention anything about different formats; I'm brand new to this stuff, but I know we'll be writing ARM64 instructions later in the semester, I don't know if that makes more sense. – Grav Sep 09 '17 at 04:06
  • @Grav: Yeah, 8 and 7 bit floats make sense to teach you about the format. Sounds like a good approach. But hopefully your class learned something about FP formats that included denormals! Otherwise this is a real gotcha assignment. :P But yes, ARM64 uses IEEE754 floating point, exactly like you're learning here. (Fun fact, ARM NEON (SIMD vectorized floating point) doesn't support denormals: it flushes them to zero. https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html). Apparently ARM64 does support an IEEE fp16 format: https://gcc.gnu.org/onlinedocs/gcc/Half-Precision.html – Peter Cordes Sep 09 '17 at 04:57
  • @prl: Good idea, that would also explain it. If we knew the value of any other bit patterns, (with a non-zero exponent), we could tell whether this was an explicit-1 "pseudo-denormal" or a hidden-bit denormal. (@Grav: see https://en.wikipedia.org/wiki/Extended_precision#x86_extended_precision_format for a detailed bit-diagram and table for a floating point format that has an explicit leading bit. Specifically, the 80-bit format used internally by legacy x87 floating point instructions, and as `long double` in some C implementations on x86. For example, gcc) – Peter Cordes Sep 09 '17 at 06:36
  • There are other ways to represent the same value which would be more canonical than a zero exponent field with the leading bit set in the mantissa. At least in the x87 case, for pseudo-denormals, Wikipedia says: "The 80387 and later properly interpret this value but will not generate it." So again, it's still a corner case for the format. – Peter Cordes Sep 09 '17 at 06:39

1 Answer


In IEEE-754 floating-point formats, an exponent field of 0 means the value is a denormal (also called subnormal), where the implied leading bit of the mantissa is 0 instead of 1.

Wikipedia has a good detailed article on the single-precision float (binary32) format, with lots of examples. For binary32 float, the formulas are (from the wiki article):

(−1)^signbit × 2^(−126)        × 0.significandbits   ; denormal, expbits=0
(−1)^signbit × 2^(expbits−127) × 1.significandbits   ; normal
 Inf  or  NaN (depending on mantissa aka significand); expbits = all 1s

(Note that 0.0 is encoded the same way as a denormal, with an all-zero exponent field and an all-zero significand, but it is not actually classified as a denormal.)
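To make the three cases concrete, here is a minimal Python sketch of those binary32 formulas. The function names `decode_binary32` and `native_binary32` are made up for this illustration; the second one just reinterprets the same bits with the standard `struct` module so the two results can be compared.

```python
import struct

def decode_binary32(bits: int) -> float:
    """Decode a 32-bit pattern by hand, following the three cases above."""
    sign = -1.0 if (bits >> 31) & 1 else 1.0
    expbits = (bits >> 23) & 0xFF      # 8-bit exponent field
    frac = bits & 0x7FFFFF             # 23 significand bits
    if expbits == 0xFF:                # all ones: Inf or NaN
        return sign * float("inf") if frac == 0 else float("nan")
    if expbits == 0:                   # denormal (or zero): leading bit is 0
        return sign * 2.0 ** -126 * (frac / 2 ** 23)
    # normal: implied leading 1
    return sign * 2.0 ** (expbits - 127) * (1 + frac / 2 ** 23)

def native_binary32(bits: int) -> float:
    """Reinterpret the same 32 bits as a float using struct, for comparison."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

# 1.0, the smallest denormal, a mid-range denormal, and roughly -pi
for pattern in (0x3F800000, 0x00000001, 0x00400000, 0xC0490FDB):
    print(hex(pattern), decode_binary32(pattern), native_binary32(pattern))
```

The hand-decoded and native values should agree for every pattern in the loop, including the two denormals.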

Anyway, with a zero exponent field, notice that the exponent is no longer expbits - bias (which would be -127 for binary32); it's one higher, i.e. 1 - bias (-126 for binary32).


Back to your case: the exponent field is 0, so this is a denormal with exponent 1 - bias = 1 - 3 = -2, and your mantissa is 0.101 binary, 0.625 decimal (I plugged 0b101 / 8 into calc).

2^-2 * 0.101(binary) = 2^-2 * 0.625(decimal) = 0.15625 = 5/32
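If you want to check other bit patterns from the assignment, here is a small Python sketch of the same rules for a 1-3-3 layout (sign, exponent, mantissa). It assumes the format follows IEEE-754 conventions for denormals, which the question doesn't actually state; `decode_minifloat` is a made-up helper name, and it uses `fractions.Fraction` so the result comes out as an exact fraction.

```python
from fractions import Fraction

def decode_minifloat(bits: str, exp_bits: int = 3, frac_bits: int = 3) -> Fraction:
    """Decode a sign/exponent/mantissa bit string, assuming IEEE-754-style rules."""
    s = bits.replace(" ", "")
    if len(s) != 1 + exp_bits + frac_bits:
        raise ValueError("unexpected number of bits")
    sign = -1 if s[0] == "1" else 1
    expbits = int(s[1:1 + exp_bits], 2)
    frac = Fraction(int(s[1 + exp_bits:], 2), 2 ** frac_bits)
    bias = 2 ** (exp_bits - 1) - 1           # 3 for a 3-bit exponent field
    if expbits == 2 ** exp_bits - 1:         # all ones: Inf or NaN, not handled here
        raise ValueError("Inf/NaN encoding")
    if expbits == 0:                         # denormal: 0.mantissa, exponent 1 - bias
        return sign * Fraction(2) ** (1 - bias) * frac
    return sign * Fraction(2) ** (expbits - bias) * (1 + frac)   # normal: 1.mantissa

print(decode_minifloat("0 000 101"))   # 5/32
print(decode_minifloat("0 001 101"))   # 13/32, the smallest exponent with a hidden 1
```

Treating `0 000 101` as a normal number (hidden 1 and exponent 0 - 3) is what gives 13/64; the denormal branch is the only difference.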


There's a Wikipedia article on minifloats (https://en.wikipedia.org/wiki/Minifloat), which mentions (with examples) an 8-bit IEEE format, as well as some other less-than-32-bit formats used in real life on computer-graphics hardware (e.g. 24-bit or 16-bit). Fun fact: x86 can load/store vectors of 16-bit half-precision floats, converting to/from single in registers on the fly with the F16C ISA extension.

See also this online converter with check-boxes for bits: https://www.h-schmidt.net/FloatConverter/IEEE754.html

Peter Cordes