I have the sign bit, exponent and mantissa (as shown in the code below). I'm trying to take these values and turn them into a float. The goal is to get 59.98 (it'll read as 59.9799995).

uint32_t FullBinaryValue = (Converted[0] << 24) | (Converted[1] << 16) |
                            (Converted[2] << 8) | (Converted[3]);

unsigned int sign_bit = (FullBinaryValue & 0x80000000);
unsigned int exponent = (FullBinaryValue & 0x7F800000) >> 23;
unsigned int mantissa = (FullBinaryValue & 0x7FFFFF);

What I originally tried was placing them bit by bit where they should be, like so:

float number = (sign_bit << 32) | (exponent << 24) | (mantissa);

But this gives me 2.22192742e+009.

I was then going to use the formula: 1.mantissa × 2^(exponent-127) but you can't put a decimal place in a binary number.

Then I tried grabbing each individual value for (exponent, characteristic, post mantissa) and I got

Characteristic: 0x3B (Decimal: 59)
Mantissa: 0x6FEB85 (Decimal: 7334789)
Exponent: 0x5 (Decimal: 5) This is after subtracting 127 from it

I was then going to take these numbers and just retrofit it into a printf. But I don't know how to convert the Mantissa hexadecimal into how it's supposed to be (powered to a negative exponent).
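
For the arithmetic route described above, here is a minimal sketch, assuming the usual IEEE 754 single-precision layout (1 sign bit, 8 exponent bits biased by 127, 23 fraction bits). The helper name `fields_to_float` is hypothetical and not part of the original code. It treats the mantissa as the fraction mantissa / 2^23 after an implicit leading 1, then scales by 2^(exponent-127):

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper, not part of the original post. Handles normalized
// values only (biased exponent 1..254); zero, subnormals, infinities and
// NaNs would need separate cases.
float fields_to_float(std::uint32_t sign_bit, std::uint32_t exponent, std::uint32_t mantissa)
{
    // The 23 stored bits are the fraction after the implicit leading 1,
    // so the significand is 1 + mantissa / 2^23.
    const double significand = 1.0 + std::ldexp(static_cast<double>(mantissa), -23);

    // Remove the bias of 127 and scale by that power of two.
    const double magnitude = std::ldexp(significand, static_cast<int>(exponent) - 127);

    // Any nonzero sign_bit (for example the masked 0x80000000) means negative.
    return static_cast<float>(sign_bit != 0 ? -magnitude : magnitude);
}
```

With the values from the question, `fields_to_float(0, 132, 0x6FEB85)` works out to roughly 1.874375 × 2^5, which is the float nearest 59.98.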

Any ideas on how to convert these three variables (sign bit, exponent, and mantissa) into a floating number?

EDIT FOR PAUL R: Here is the code in Minimal, Complete and Verifiable format. I added the uint8_t Converted[4] there just because it is the value I end up with and it makes it runnable.

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    uint8_t Converted[4];
    Converted[0] = 0x42;
    Converted[1] = 0x6f;
    Converted[2] = 0xEB;
    Converted[3] = 0x85;

    uint32_t FullBinaryValue = (Converted[0] << 24) | (Converted[1] << 16) |
                                (Converted[2] << 8) | (Converted[3]);

    unsigned int sign_bit = (FullBinaryValue & 0x80000000);
    unsigned int exponent = (FullBinaryValue & 0x7F800000) >> 23;
    unsigned int mantissa = (FullBinaryValue & 0x7FFFFF);

    float number = (sign_bit) | (exponent << 23) | (mantissa);

    return 0;
}
Rayaarito
  • Your shifts are off by one - try: `uint32_t number = (sign_bit << 31) | (exponent << 23) | (mantissa);` and then use a union or whatever to convert this to a `float`. – Paul R Aug 04 '17 at 17:49
  • If you are going to assume that `float` is implemented according to IEEE 754, it may be a good idea to verify your supposition with [`std::numeric_limits<T>::is_iec559`](http://en.cppreference.com/w/cpp/types/numeric_limits/is_iec559). – François Andrieux Aug 04 '17 at 17:51
  • @PaulR thanks for the quick reply. When I do that, I get `1.11463104e+009` when it should be `59.9799995` – Rayaarito Aug 04 '17 at 17:51
  • I just realised your sign bit was not right-justified, like the exponent, so it would just be: `uint32_t number = (sign_bit) | (exponent << 23) | (mantissa);`. If that still doesn't work then post a [mcve] with the latest version of your code. – Paul R Aug 04 '17 at 17:53
  • FrançoisAndrieux figured it out. Thank you for your effort @PaulR. I appreciate it – Rayaarito Aug 04 '17 at 18:08
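
The IEEE 754 check suggested in the comments could look something like the sketch below. The function name `float_is_ieee754_single` is hypothetical. The extra width and precision tests are an assumption added here to pin down the 32-bit single-precision format that the bit masks in the question rely on.

```cpp
#include <climits>
#include <limits>

// Hypothetical compile-time check, not from the original thread.
// True when float claims IEEE 754 (IEC 60559) conformance, is 32 bits wide,
// and has a 24-bit significand (23 stored bits plus the implicit leading 1).
constexpr bool float_is_ieee754_single()
{
    return std::numeric_limits<float>::is_iec559
        && sizeof(float) * CHAR_BIT == 32
        && std::numeric_limits<float>::digits == 24;
}

// Stop the build early if the assumption does not hold on this platform.
static_assert(float_is_ieee754_single(), "float is not an IEEE 754 single");
```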

1 Answer

The problem is that the expression float number = (sign_bit << 32) | (exponent << 24) | (mantissa); first computes an unsigned int and then converts that value to float. Conversions between fundamental types preserve the value rather than the memory representation. What you are trying to do is reinterpret the memory representation as a different type. You can use reinterpret_cast for that.

Try this instead:

uint32_t FullBinaryValue = (Converted[0] << 24) | (Converted[1] << 16) |
                           (Converted[2] << 8) | (Converted[3]);


float number = reinterpret_cast<float&>(FullBinaryValue);
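
To make the difference between converting the value and reinterpreting the representation concrete, here is a small sketch. Both function names are hypothetical. The reinterpreting version uses `std::memcpy` rather than `reinterpret_cast`, a swapped-in technique chosen here because it stays clear of strict aliasing concerns. It assumes `float` is a 32-bit IEEE 754 single.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: converts the integer VALUE to the nearest float.
// 0x426FEB85 is the integer 1114631045, so this yields about 1.11e9.
float convert_value(std::uint32_t bits)
{
    return static_cast<float>(bits);
}

// Hypothetical helper: copies the integer's BYTES into a float unchanged,
// so the same 32 bits are read back as sign, exponent and mantissa.
float reinterpret_bits(std::uint32_t bits)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits wide");
    float result;
    std::memcpy(&result, &bits, sizeof result);
    return result;
}
```

On such a platform, `convert_value(0x426FEB85u)` gives the large number seen in the question, while `reinterpret_bits(0x426FEB85u)` gives the float nearest 59.98.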
François Andrieux
  • That worked. Nice knowing that I did not have to do all that work. There's always a simpler way, eh? Now it's time to study exactly what a reinterpret_cast does. Thank you very much. Why do you have an `&` at the end of `<float&>`? – Rayaarito Aug 04 '17 at 18:08
  • `reinterpret_cast` is a good old fashioned `(float)` cast dressed up in fancy c++ speak. – pm100 Aug 04 '17 at 18:13
  • @Dave `reinterpret_cast` is used to interpret memory as a specific type, regardless of what the type system may have to say about it. As such, it can only meaningfully produce pointers and references. It doesn't cast values, it treats a memory location as if it held the given type. `reinterpret_cast<float&>` will take a reference to whatever you pass to it and pretend it's a reference to a float (thus the `&`). You could also do `reinterpret_cast<float*>(&FullBinaryValue);`. But you cannot use it to directly convert between value types, like `reinterpret_cast<float>(FullBinaryValue);`. – François Andrieux Aug 04 '17 at 18:13
  • @pm100 It's important for this question to distinguish between `(float)FullBinaryValue` and `*((float*)&FullBinaryValue)`. This solution is closer to the latter. – François Andrieux Aug 04 '17 at 18:15
  • Does this break strict aliasing? – Passer By Aug 04 '17 at 18:19
  • @FrançoisAndrieux So the reinterpret_cast sees the address which points to the binary value and keeps its structure the same but just reads it differently, while a normal cast would take the value itself and say "we're going to keep your value the same but change your format"? Is there an equivalent if I wanted to use it in C? – Rayaarito Aug 04 '17 at 18:20
  • @Dave See my earlier reply to pm100. It's equivalent to casting the pointer to (float*) and reading that pointer's value. – François Andrieux Aug 04 '17 at 18:21
  • @PasserBy This answer unfortunately breaks strict aliasing rules. However, implementations are allowed to relax strict aliasing rules. Additionally, the memory layout of floating-point types is implementation defined, so it's impossible to do this in a fully portable way. You can check with your implementation whether or not this conversion is possible. A less elegant but perhaps more portable alternative may be to `memcpy` from a `uint32_t*` to a `float*` provided the sizes are the same. I believe it would then be implementation defined (depending on that implementation's `float` layout). – François Andrieux Aug 04 '17 at 18:29
  • Strict aliasing rules are strict. – SergeyA Aug 04 '17 at 19:20
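
The `memcpy` alternative mentioned in the comments might be sketched as follows, starting from the four bytes in the question. The function name `big_endian_bytes_to_float` is hypothetical. The sketch assumes the bytes arrive most significant first (as in `Converted[0]` through `Converted[3]`) and that `float` is a 32-bit IEEE 754 single on the target platform.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper, not from the original thread. Assembles four bytes
// (most significant first) into a 32-bit pattern, then copies that pattern
// into a float without going through a pointer or reference cast.
float big_endian_bytes_to_float(std::uint8_t b0, std::uint8_t b1,
                                std::uint8_t b2, std::uint8_t b3)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits wide");

    // Widen each byte before shifting so the shifts happen in unsigned
    // 32-bit arithmetic rather than in (signed) int.
    const std::uint32_t bits = (static_cast<std::uint32_t>(b0) << 24)
                             | (static_cast<std::uint32_t>(b1) << 16)
                             | (static_cast<std::uint32_t>(b2) << 8)
                             |  static_cast<std::uint32_t>(b3);

    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

Calling `big_endian_bytes_to_float(0x42, 0x6F, 0xEB, 0x85)` on such a platform should give the float nearest 59.98. Because `memcpy` is also available in C through `<string.h>`, the same approach carries over there.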