-2

Consider the following code for integral types:

#include <bitset>
#include <iostream>
#include <string>

template <class T>
std::string as_binary_string( T value ) {
    return std::bitset<sizeof( T ) * 8>( value ).to_string();
}

int main() {
    unsigned char a(2);
    char          b(4);

    unsigned short c(2);
    short          d(4);

    unsigned int   e(2);
    int            f(4);

    unsigned long long g(2);
    long long h(4);

    std::cout << "a = " << +a << " " << as_binary_string( a ) << std::endl;
    std::cout << "b = " << +b << " " << as_binary_string( b ) << std::endl;
    std::cout << "c = " << c << " " << as_binary_string( c ) << std::endl;
    std::cout << "d = " << d << " " << as_binary_string( d ) << std::endl;
    std::cout << "e = " << e << " " << as_binary_string( e ) << std::endl;
    std::cout << "f = " << f << " " << as_binary_string( f ) << std::endl;
    std::cout << "g = " << g << " " << as_binary_string( g ) << std::endl;
    std::cout << "h = " << h << " " << as_binary_string( h ) << std::endl;

    std::cout << "\nPress any key and enter to quit.\n";
    char q;
    std::cin >> q;

    return 0;
}

Pretty straightforward; it works well and is quite simple.


EDIT

How would one go about writing a function to extract the binary or bit pattern of arbitrary floating point types at compile time?


When it comes to floats, I have not found anything similar in any existing library that I know of. I searched Google for days looking for one, then resorted to trying to write my own function, without success. I no longer have the attempted code available, since it has been a while since I originally asked this question, so I cannot show you the different attempted implementations along with their compiler build errors. I wanted to generate the bit pattern for floats in a generic way at compile time and to integrate that into my existing class, which already does the same seamlessly for any integral type. As for the floating-point types themselves, I have taken into consideration the different formats as well as architecture endianness. For my general purposes, the standard IEEE versions of the floating-point types are all I should need to be concerned with.

iBug suggested that I write my own function when I originally asked this question, while I was already attempting to do so. I understand binary numbers, memory sizes, and the mathematics, but putting it all together with how floating-point types are stored in memory, with their different parts {sign bit, exponent, mantissa}, is where I had the most trouble.
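A minimal sketch of pulling those three parts out of a `float`, assuming an IEEE-754 binary32 layout (the struct and function names here are my own, purely for illustration):

```cpp
#include <cstdint>
#include <cstring>

// Illustrative helper: split an IEEE-754 binary32 float into its fields.
// Assumes float is 32 bits wide, which holds on mainstream platforms.
struct FloatParts {
    std::uint32_t sign;      // 1 bit
    std::uint32_t exponent;  // 8 bits, biased by 127
    std::uint32_t mantissa;  // 23 bits; normals have an implicit leading 1
};

FloatParts split_float(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));  // well-defined bit inspection
    return { bits >> 31,
             (bits >> 23) & 0xFFu,
             bits & 0x7FFFFFu };
}
```

For example, `1.5f` is `1.1` in binary times `2^0`, so the sign is 0, the stored exponent is 127, and the mantissa field has only its top bit set.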

Since then, with the suggestions of those who have given a great answer and example, I was able to write a function that fits nicely into my already existing class template, and it now works for my intended purposes.

Francis Cugler
  • As far you're asking for the c++ standard library, there are none that I know of. – user0042 Dec 31 '17 at 09:59
  • @user0042 hmm would I have to resort to "C" style coding? – Francis Cugler Dec 31 '17 at 09:59
  • 1
    I don't know what exactly you mean with _"C" style coding_, but I think that showing the bit representation of floating point values is somewhat beyond the c++ standard as you mentioned yourself in your question. – user0042 Dec 31 '17 at 10:03
  • 1
    [bit_cast](https://github.com/jfbastien/bit_cast/) the double to an appropriate integer type. – Oliv Dec 31 '17 at 10:05
  • @user0042 yeah; It was easy to wrap the above for `integral types` into a class template, but also want to do the same for `floats`... – Francis Cugler Dec 31 '17 at 10:05
  • @FrancisCugler I understood that. What I'm trying to tell you is that the bit representation of floating point types is beyond the standard and a compiler specific implementation detail. – user0042 Dec 31 '17 at 10:09
  • 1
    Endianness is about how the bytes are stored in memory, not about the values. `x & 1` always gives you the lowest bit, not affected by the memory layout. – Bo Persson Dec 31 '17 at 11:00
  • @BoPersson I understand that, but I was referring to the order of the bit pattern itself. For example: we have `float n = 1.5f` and the binary rep would be `00111111110000000000000000000000` the ordering of the bits could be different from one architecture to another, from one compiler to another... The above bit pattern should be little endian (Intel), IEEE floating point, MS Visual Studio on Win7. On an AMD running Linux using GCC or Clang the bit ordering may be different. – Francis Cugler Dec 31 '17 at 11:08
  • I would like to give thanks to all of those who've given an answer and or a reply for they are much appreciated. I already understood a good deal about the internal representation of floating point types, but never truly had a need to get their binary bit representations. With all of your feed back, it helps in many ways either it be for a quick resolution to a current problem or even years from now if one has to look back at it as a reference. Happy New Year! – Francis Cugler Dec 31 '17 at 18:40
  • How can I reword this question so that it would not be considered off topic to have it reopened? – Francis Cugler Jan 04 '18 at 06:08
  • Made an edit to this post; I prefer for it to not be delete because I actually like looking back at the accepted answer from iBug! – Francis Cugler Jan 05 '18 at 15:56

5 Answers

9

What about writing one yourself?

#include <bitset>
#include <cstdint>
#include <cstring>
#include <string>

static_assert(sizeof(float) == sizeof(std::uint32_t), "unexpected float size");
static_assert(sizeof(double) == sizeof(std::uint64_t), "unexpected double size");

std::string as_binary_string( float value ) {
    std::uint32_t t;
    std::memcpy(&t, &value, sizeof(value));
    return std::bitset<sizeof(float) * 8>(t).to_string();
}

std::string as_binary_string( double value ) {
    std::uint64_t t;
    std::memcpy(&t, &value, sizeof(value));
    return std::bitset<sizeof(double) * 8>(t).to_string();
}

You may need to change the helper variable t in case the sizes for the floating point numbers are different.
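One way to avoid maintaining a separate overload per size is to select the helper integer type at compile time; here is a sketch using `std::conditional_t`, assuming `T` is 1, 2, 4, or 8 bytes wide:

```cpp
#include <bitset>
#include <climits>
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>

// Sketch: pick an unsigned integer the same size as T at compile time,
// then memcpy through it. The static_assert guards unsupported sizes
// (e.g. a 16-byte long double).
template <typename T>
std::string as_binary_string(T value) {
    using U = std::conditional_t<sizeof(T) == 1, std::uint8_t,
              std::conditional_t<sizeof(T) == 2, std::uint16_t,
              std::conditional_t<sizeof(T) == 4, std::uint32_t,
                                                 std::uint64_t>>>;
    static_assert(sizeof(U) == sizeof(T), "unsupported type size");
    U bits;
    std::memcpy(&bits, &value, sizeof(T));
    return std::bitset<sizeof(T) * CHAR_BIT>(bits).to_string();
}
```

This gives one template that handles integral and floating-point types alike, at the cost of only supporting the usual power-of-two sizes.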

You can alternatively copy them bit by bit. This is slower but works for arbitrary types.

#include <bitset>
#include <climits>
#include <cstdint>
#include <cstring>
#include <string>

template <typename T>
std::string as_binary_string( T value )
{
    const std::size_t nbytes = sizeof(T), nbits = nbytes * CHAR_BIT;
    std::bitset<nbits> b;
    std::uint8_t buf[nbytes];
    std::memcpy(buf, &value, nbytes);

    for(std::size_t i = 0; i < nbytes; ++i)
    {
        std::uint8_t cur = buf[i];
        std::size_t offset = i * CHAR_BIT;

        for(int bit = 0; bit < CHAR_BIT; ++bit)
        {
            b[offset] = cur & 1;
            ++offset;   // Move to next bit in b
            cur >>= 1;  // Move to next bit in the byte
        }
    }

    return b.to_string();
}
iBug
  • I'm sure the OP already thought of something like that. _`static_assert(sizeof(float) == sizeof(uint32_t));`_ is what makes this somewhat poor and non generic. – user0042 Dec 31 '17 at 10:06
  • 1
    hmm never thought about casting to `uint32_t` & `uint64_t`... but would different endians as well as encodings affect this? But yes I agree with user0042, want something more generic. – Francis Cugler Dec 31 '17 at 10:07
  • @FrancisCugler Different endians and encodings *surely* affects that. – iBug Dec 31 '17 at 10:07
  • Yeah floats are much harder because of their parts... `exp`, `mantissa`, `sign bit` and some encodings even have a `bit` that isn't stored. – Francis Cugler Dec 31 '17 at 10:08
  • 1
    I'm not really interested in doing `bit manipulation` or `bit twiddling`. I just want a visual representation and like to template it for any `basic arithmetic type`. However if one would need to do bit manipulation, this class template can send the `binary` representation in the form of a string to the user. – Francis Cugler Dec 31 '17 at 10:11
  • 1
    @FrancisCugler Definitely. Please wait while I update my answer. – iBug Dec 31 '17 at 10:12
  • Okay so I changed `numBytes` to `sizeof( T )` in your 2nd part... and I got this as a result: `float m = 1.2f` & `double n = 2.4` output: `m = 1.2 00111111100110011001100110011010001111111` & `n = 2.4 0100000000000011001100110011001100110011001100110011001100110011` respectively... I think this might work. – Francis Cugler Dec 31 '17 at 10:31
  • @FrancisCugler Good that you spotted my typo. I changed the code a little for a better style. – iBug Dec 31 '17 at 10:33
  • Now the only thing is which is a different topic; question, or discussion, but figuring out how to determine or deal with different endian machine's bit layouts, as well as different floating point encodings... – Francis Cugler Dec 31 '17 at 10:37
  • 2
    @FrancisCugler Unfortunately only the representation of **unsigned integral values** is defined by the standard. The representation of signed integrals is not defined (it varies by implementation), and neither is that of floating point numbers. – iBug Dec 31 '17 at 10:40
  • @iBug yeah I know; that's why most `unions` & `bitfields` are not considered generic or portable... – Francis Cugler Dec 31 '17 at 10:48
3

You said it doesn't need to be standard. So, here is what works in clang on my computer:

#include <iostream>
#include <algorithm>

using namespace std;

int main()
{
  char *result = new char[33]();  // value-initialized, so result[32] is '\0'
  fill(result, result + 32, '0');
  float input;
  cin >>input;
  asm(
"mov %0,%%eax\n"
"mov %1,%%rbx\n"
".intel_syntax\n"
"mov rcx,20h\n"
"loop_begin:\n"
"shr eax\n"
"jnc loop_end\n"
"inc byte ptr [rbx+rcx-1]\n"
"loop_end:\n"
"loop loop_begin\n"
".att_syntax\n"
      :
      : "m" (input), "m" (result)
      : "rax", "rbx", "rcx", "cc", "memory"  // registers modified by the asm
      );
  cout <<result <<endl;
  delete[] result;
  return 0;
}

This code makes a bunch of assumptions about the computer architecture and I am not sure on how many computers it would work.

EDIT:
My computer is a 64-bit Mac-Air. This program basically works by allocating a 33-byte string and filling the first 32 bytes with '0' (the 33rd byte will automatically be '\0').
Then it uses inline assembly to store the float into a 32-bit register and then it repeatedly shifts it to the right by one bit.
If the last bit in the register was 1 before the shift, it gets stored into the carry flag.
The assembly code then checks the carry flag and, if it contains 1, it increases the corresponding byte in the string by 1.
Since it was previously initialized to '0', it will turn to '1'.

So, effectively, when the loop in the assembly is finished, the binary representation of a float is stored into a string.

This code only works for x64 (it uses 64-bit registers "rbx" and "rcx" to store the pointer and the counter for the loop), but I think it's easy to tweak it to work on other processors.
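For comparison, the same shift-and-test loop can be sketched portably in standard C++, with `memcpy` standing in for the register move (the function name here is mine, not from the answer):

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Portable sketch of the assembly loop above: load the float's bits into a
// 32-bit unsigned integer, then shift right one bit at a time, recording
// each low bit from right to left in the output string.
std::string float_bits(float input) {
    std::uint32_t reg;
    std::memcpy(&reg, &input, sizeof(reg));
    std::string result(32, '0');
    for (int i = 31; i >= 0; --i) {  // lowest bit lands rightmost
        if (reg & 1) result[i] = '1';
        reg >>= 1;
    }
    return result;
}
```

No inline assembly, no manual allocation, and it is not tied to x64.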

Tschallacka
FlatAssembler
  • 9
    "You said it doesn't need to be standard". The OP does not say this. The OP says you don't have to use `std` meaning the standard library. – JeremyP Jan 04 '18 at 10:25
2

An IEEE double-precision (64-bit) floating point number looks like the following

sign  exponent   mantissa
1 bit  11 bits    52 bits 

Note that there's a hidden 1 before the mantissa, and the exponent is biased, so a stored value of 1023 represents an exponent of 0 (it is not two's complement). By memcpy()ing to a 64-bit unsigned integer you can then apply AND and OR masks to get at the bit pattern. The byte arrangement could be big endian or little endian. You can easily work out which arrangement you have by passing easy numbers such as 1 or 2.
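A sketch of that masking for a `double`, assuming IEEE-754 binary64 (struct and function names are illustrative, not from the answer):

```cpp
#include <cstdint>
#include <cstring>

// Extract the three binary64 fields with shifts and AND masks.
// The bias is 1023, so a stored exponent of 1024 means 2^1.
struct DoubleParts {
    std::uint64_t sign;      // 1 bit
    std::uint64_t exponent;  // 11 bits
    std::uint64_t mantissa;  // 52 bits
};

DoubleParts split_double(double d) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof(bits));    // well-defined bit inspection
    return { bits >> 63,
             (bits >> 52) & 0x7FFULL,        // mask off the 11 exponent bits
             bits & 0xFFFFFFFFFFFFFULL };    // mask off the 52 mantissa bits
}
```

Passing 2.0 (which is 1.0 × 2^1) gives sign 0, a stored exponent of 1024, and a zero mantissa, which confirms both the field boundaries and the bias.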

Malcolm McLean
  • Yes; I've already read articles, documents & white papers about the IEEE standard... and there are actually a couple of versions. However, the amount of bits for the exponent & mantissa will vary depending on both the size of the floating point type via `float`, `double` or `long double` (provided the compiler/OS supports `long double`), as well as some other factors. – Francis Cugler Dec 31 '17 at 18:46
2

Generally people either use std::hexfloat or cast a pointer to the floating-point value to a pointer to an unsigned integer of the same size and print the indirected value in hex format. Both methods facilitate bit-level analysis of floating-point in a productive fashion.
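Both approaches side by side, with `memcpy` standing in for the pointer cast to stay clear of strict-aliasing issues (the function name is mine):

```cpp
#include <cstdint>
#include <cstring>
#include <iostream>

// Print a double two ways: as a C++ hexfloat literal, and as the raw
// 64-bit pattern in hex. For 1.5 these are 0x1.8p+0 and 3ff8000000000000.
void print_double_bits(double d) {
    std::cout << std::hexfloat << d << '\n';
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof(bits));
    std::cout << std::hex << bits << std::dec << '\n';
}
```

The hexfloat form is often the more productive of the two, since the mantissa digits and the power-of-two exponent are directly readable.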

Zalman Stern
0

You could roll your own by casting the address of the float/double to a char pointer and iterating over the bytes that way:

#include <memory>
#include <iostream>
#include <limits>
#include <iomanip>

template <typename T>
std::string getBits(T t) {
    std::string returnString{""};
    char *base{reinterpret_cast<char *>(std::addressof(t))};
    char *tail{base + sizeof(t) - 1};
    do {
        for (int bits = std::numeric_limits<unsigned char>::digits - 1; bits >= 0; bits--) {
            returnString += ( ((*tail) & (1 << bits)) ? '1' : '0');
        }
    } while (--tail >= base);
    return returnString;
}

int main() {
    float f{10.0};
    double d{100.0};
    double nd{-100.0};
    std::cout << std::setprecision(1);
    std::cout << getBits(f) << std::endl;
    std::cout << getBits(d) << std::endl;
    std::cout << getBits(nd) << std::endl;
}

Output on my machine (note the sign flip in the third output):

01000001001000000000000000000000
0100000001011001000000000000000000000000000000000000000000000000
1100000001011001000000000000000000000000000000000000000000000000
Tyler Lewis