
After solving the issue from my previous question, I went on to expand this version of my code to incorporate the unions of the data fields from my earlier template versions. This is what I have so far:

main.cpp

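#include <bitset>      // std::bitset
#include <cstdint>     // std::uint16_t
#include <cstdlib>     // EXIT_SUCCESS
#include <iostream>
#include <type_traits>
#include "Register.h"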

int main() {
    using namespace vpc;

    std::cout << std::boolalpha;

    std::cout << "std::bitset<64> is trivially copyable "
        << std::is_trivially_copyable<std::bitset<64>>::value << '\n'
        << "QWord is trivially copyable "
        << std::is_trivially_copyable<QWord>::value << '\n'
        << "DWord is trivially copyable "
        << std::is_trivially_copyable<DWord>::value << '\n'
        << "Word is trivially copyable "
        << std::is_trivially_copyable<Word>::value << '\n'
        << "Byte is trivially copyable "
        << std::is_trivially_copyable<Byte>::value << '\n'
        //      << "Bits is trivially copyable "
        //<< std::is_trivially_copyable<Bits>::value << '\n'
        << "My Register is trivially copyable "
        << std::is_trivially_copyable<Register>::value << "\n\n";


    std::cout << "sizeof(std::bitset<Byte>) = "  << sizeof(Byte)  << " bytes\n";
    std::cout << "sizeof(std::bitset<Word>) = "  << sizeof(Word)  << " bytes\n";
    std::cout << "sizeof(std::bitset<DWord>) = " << sizeof(DWord) << " bytes\n";
    std::cout << "sizeof(std::bitset<QWord>) = " << sizeof(QWord) << " bytes\n";
    std::cout << "sizeof(Register) = "      << sizeof(Register) << " bytes\n\n";

    Register r;

    std::cout << "sizeof(Register::byte) = " << sizeof(r.byte)  << " bytes\n";
    std::cout << "sizeof(Register::Byte) = " << sizeof(r.byte) / sizeof(r.byte[0]) << " bytes\n";

    std::cout << "sizeof(Register::word) = " << sizeof(r.word)  << " bytes\n";
    std::cout << "sizeof(Register::Word) = " << sizeof(r.word) / sizeof(r.word[0]) << " bytes\n";

    std::cout << "sizeof(Register::dword) = " << sizeof(r.dword) << " bytes\n";
    std::cout << "sizeof(Register::DWord) = " << sizeof(r.dword) / sizeof(r.dword[0]) << " bytes\n";

    std::cout << "sizeof(Register::value) = " << sizeof(r.value) << " bytes\n";

    std::cout << "sizeof(Register) = " << sizeof(r) << " bytes\n\n";

    r.value = 0xFFFFFFFFFFFFFFFF;
    std::cout << "value = " << r.value.to_ullong() << '\n' << r.value << '\n';
    for (std::uint16_t i = 0; i < 8; i++) {
        std::cout << "byte_" << i << " : " << r.byte[i] << '\n';
    }

    return EXIT_SUCCESS;
}

Register.h

#pragma once

#include <algorithm>
#include <bitset>
#include <string>
#include <cstdint> // fixed-width integer types used by the typedefs below

namespace vpc {
    typedef std::int8_t  i8;
    typedef std::int16_t i16;
    typedef std::int32_t i32;
    typedef std::int64_t i64;

    const std::uint16_t BYTE = 0x08;
    const std::uint16_t WORD = 0x10;
    const std::uint16_t DWORD = 0x20;
    const std::uint16_t QWORD = 0x40;

    typedef std::bitset<BYTE>  Byte;
    typedef std::bitset<WORD>  Word;
    typedef std::bitset<DWORD> DWord;
    typedef std::bitset<QWORD> QWord;

    struct Register {

        union {
            QWord value{ 0 };

            union {
                DWord dword[2];
                struct {
                    DWord dword0;
                    DWord dword1;
                };
            };

            union {
                Word word[4];
                struct {
                    Word word0;
                    Word word1;
                    Word word2;
                    Word word3;
                };
            };

            union {
                Byte byte[8];
                struct {
                    Byte byte0;
                    Byte byte1;
                    Byte byte2;
                    Byte byte3;
                    Byte byte4;
                    Byte byte5;
                    Byte byte6;
                    Byte byte7;
                };
            };
        };

        Register() : value{ 0 } {}
    };

    Register reverseBitOrder(Register& reg, bool copy = false);

} // namespace vpc

Register.cpp

#include "Register.h"

namespace vpc {

    Register reverseBitOrder(Register& reg, bool copy) {
        auto str = reg.value.to_string();
        std::reverse(str.begin(), str.end());

        if (copy) { // return a copy
            Register cpy;
            cpy.value = QWord(str);
            return cpy;
        }
        else {
            reg.value = QWord(str);
            return {};
        }
    }

} // namespace vpc

Output

std::bitset<64> is trivially copyable true
QWord is trivially copyable true
DWord is trivially copyable true
Word is trivially copyable true
Byte is trivially copyable true
My Register is trivially copyable true

sizeof(std::bitset<Byte>) = 4 bytes
sizeof(std::bitset<Word>) = 4 bytes
sizeof(std::bitset<DWord>) = 4 bytes
sizeof(std::bitset<QWord>) = 8 bytes
sizeof(Register) = 32 bytes

sizeof(Register::byte) = 16 bytes
sizeof(Register::Byte) = 4 bytes
sizeof(Register::word) = 16 bytes
sizeof(Register::Word) = 4 bytes
sizeof(Register::dword) = 8 bytes
sizeof(Register::DWord) = 2 bytes
sizeof(Register::value) = 8 bytes
sizeof(Register) = 32 bytes

value = 18446744073709551615
1111111111111111111111111111111111111111111111111111111111111111
byte_0 : 11111111
byte_1 : 11111111
byte_2 : 11001100
byte_3 : 11001100
byte_4 : 11001100
byte_5 : 11001100
byte_6 : 11001100
byte_7 : 11001100

After looking at the printed sizes of the bitset types and comparing them to their actual sizes as members of a struct within a union, I'm trying to figure out what's going on under the hood here.

I'm not sure whether I'm performing the sizeof calculations correctly or whether this has to do with the internal storage of bitset. I'm trying to get a grasp on data alignment for unions that are members of a structure, where the underlying types are std::bitset types. From the header you can see that there are four variations of these: bitset<8> = Byte, bitset<16> = Word, bitset<32> = DWord, and bitset<64> = QWord.

In essence, there should be a divisible mapping of these:

// each [] = 1 byte or 8 bits for simplicity
bitset<64> = [] [] [] [] [] [] [] []
bitset<32> = [] [] [] []
bitset<16> = [] []
bitset<8>  = []

So when I try to use them in a union as such:

union {
    QWord q;

    union {
        DWord d[2];
        struct {
            DWord d_0;
            DWord d_1;
        };
    };

    union {
        Word w[4];
        struct {
            Word w_0;
            Word w_1;
            Word w_2;
            Word w_3;
         };
    };

    union {
        Byte b[8];
        struct {
            Byte b_0;
            Byte b_1;
            Byte b_2;
            Byte b_3;
            Byte b_4;
            Byte b_5;
            Byte b_6;
            Byte b_7; 
        };
    };
};

I would think that, by following the pattern shown above this union, I would be able to pack the data into byte-sized alignments:

// each inner [] = 1 byte or 8 bits
// and each outer [] = index into array

         0   1   2   3   4   5   6   7
value = [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ]

                 0               1
dword[2] = [[] [] [] []],  [[] [] [] []]

             0        1        2        3
word[4] = [[] []], [[] []], [[] []], [[] []] 

             0     1     2     3     4     5     6     7
byte[8]  = [[ ]] [[ ]] [[ ]] [[ ]] [[ ]] [[ ]] [[ ]] [[ ]]

However, this doesn't seem to be happening as I would expect.

My overall goal is to simulate the pattern expressed above, so that the basic size of a Register is 64 bits (8 bytes) wide and, through the use of unions, I can access the individual bytes, words, or dwords of the full qword.

Could you elaborate on what I'm missing here? I'm not sure whether it has to do with how std::bitset is stored in memory, with the structure's alignment, or with unions themselves.

Francis Cugler
  • Just to remind you, type punning using unions is undefined behavior in C++. `Byte byte[4];` is a typo, should be `[8]`? What compiler and compiler options are you using? [I can't compile this](https://godbolt.org/z/6J-aHL). – KamilCuk May 07 '19 at 20:09
  • @KamilCuk yes it is a typo thank you for pointing that out! – Francis Cugler May 07 '19 at 20:16
  • @KamilCuk I understand that but almost everything you do in the C languages when it comes to pointers and other similar operations, type casting etc. can be defined as type punning and UB. – Francis Cugler May 07 '19 at 20:20
  • @KamilCuk would there be another way to represent the model that I've proposed? – Francis Cugler May 07 '19 at 20:20
  • Is this whole question just... why is `sizeof(bitset<8>) != 1`? – Barry May 07 '19 at 20:23
  • @Barry It's hard to put it into words. I was thinking that since bitset<64> has 64 values I could also represent that as 2 - bitset<32> objects as well as 4 - bitset<16> objects as well as 8 - bitset<8> objects and through the use of unions within a structure I could access any of the individual bytes, words or dwords from within the 64 bit register. I wasn't sure if this had to pertain to the internal storage of std::bitset, the padding - byte alignment of a structure or the properties of unions within a struct. – Francis Cugler May 07 '19 at 20:27
  • @FrancisCugler: "*almost everything you do in the C languages when it comes to pointers and other similar operations, type casting etc. can be defined as type punning and UB.*" No, it isn't. You can do plenty of stuff in C++ that doesn't involve type punning. Type punning happens when you don't believe that objects are anything other than bitpatterns in memory. – Nicol Bolas May 07 '19 at 20:34
  • @NicolBolas True; I was being a bit rhetorical... – Francis Cugler May 07 '19 at 20:39
  • @FrancisCugler C is not C++. You can do that in C, it's explicitly allowed by the standard (although it may result in a trap representation [note95](https://port70.net/~nsz/c/c11/n1570.html#note95)). In C++ that's undefined behavior, the standard explicitly says it's undefined behavior [classes.unions 5.3 example](http://eel.is/c++draft/class.union#5.3) – KamilCuk May 07 '19 at 20:52
  • @KamilCuk Thank you for the clarification I don't have a copy of the most recent standard and only pieces of older outdated ones... – Francis Cugler May 07 '19 at 21:03

2 Answers

1

What you want cannot be done in the way that you want to do it. std::bitset makes no guarantees about its size, so there's no expectation that a bitset<8> will have the size of a byte. Nor is there any way for you to access the members of these bitsets if they are not the active union member.

What you want to do is:

  1. Store a uint64_t.
  2. Access various subsets of the bits of that uint64_t through a range-compatible object that allows you to manipulate them.

So just implement that. What you need is not bitset, but a bit-range view type, one that allows you to interpret and manipulate any contiguous sequence of bits in that uint64_t as a range. Basically, you want the interface of bitset, but through a reference to the storage (and a particular range of that storage), not by being the storage. You don't store these ranges; you generate ranges upon request.
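
A minimal sketch of such a view, assuming a single 64-bit unsigned integer as the only storage and bit 0 as the least significant bit. The names `BitView`, `get`, `set`, `test`, and the `Register` accessors are hypothetical, chosen for illustration; they are not part of the standard library or of the code in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical view type: refers to Width bits of a uint64_t, starting at
// bit `offset`. It does not own any storage.
template <std::size_t Width>
class BitView {
    static_assert(Width >= 1 && Width <= 64, "Width must be in [1, 64]");

public:
    BitView(std::uint64_t& storage, std::size_t offset)
        : storage_(&storage), offset_(offset) {
        assert(offset + Width <= 64);
    }

    // Read the viewed bits as an unsigned integer.
    std::uint64_t get() const { return (*storage_ >> offset_) & mask(); }

    // Overwrite only the viewed bits; all other bits stay untouched.
    void set(std::uint64_t v) {
        *storage_ = (*storage_ & ~(mask() << offset_)) | ((v & mask()) << offset_);
    }

    // Test bit i of the view (0 = least significant bit of the view).
    bool test(std::size_t i) const {
        assert(i < Width);
        return ((get() >> i) & 1u) != 0;
    }

private:
    // Width low bits set; the shift count is always in [0, 63].
    static constexpr std::uint64_t mask() {
        return ~std::uint64_t{0} >> (64 - Width);
    }

    std::uint64_t* storage_;
    std::size_t offset_;
};

// Hypothetical register: stores one 64-bit value and hands out views on request.
struct Register {
    std::uint64_t value = 0;

    BitView<8>  byte(std::size_t i)  { assert(i < 8); return {value, i * 8}; }
    BitView<16> word(std::size_t i)  { assert(i < 4); return {value, i * 16}; }
    BitView<32> dword(std::size_t i) { assert(i < 2); return {value, i * 32}; }
};
```

With `r.value = 0x1122334455667788`, `r.byte(0).get()` yields `0x88` and `r.dword(1).get()` yields `0x11223344`. Each view holds only a pointer and an offset and is created on demand, so the register's only data member remains the 64-bit value.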

Nicol Bolas
  • I agree with what you are saying here, but now I have to go back to the drawing board and redesign the internal structure of my code. Originally I wasn't using std::bitset and tried to incorporate it, and now I'm starting to see some of its benefits as well as some of its downsides or weaknesses. Your approach sounds good and should be a nice challenge to implement. – Francis Cugler May 07 '19 at 21:06
0

There's nothing in the language standard that specifies how bitset handles storage internally. One implementation I looked at uses an unsigned long array for storage of 32 or fewer bits (unsigned long long for more than 32). This is probably done for efficiency.

With this storage scheme, your Byte, Word, and DWord types will each take four bytes, even though not all of those bytes are used. Storing arrays of these in your larger unions will cause the size of the union to grow, as there are unused bytes in each bitset.
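
The effect can be measured on any given implementation with a small sketch like the following. It assumes 8-bit bytes, and the helper names `minimalBytes` and `unusedBytes` are hypothetical, introduced only for this illustration. No particular result is implied, since the standard leaves `sizeof(std::bitset<N>)` to the implementation.

```cpp
#include <bitset>
#include <cstddef>

// Bytes strictly needed to hold `bits` bits (assumes 8-bit bytes).
constexpr std::size_t minimalBytes(std::size_t bits) {
    return (bits + 7) / 8;
}

// Bytes a std::bitset<N> occupies beyond that minimum on this implementation.
// The value is implementation-specific.
template <std::size_t N>
constexpr std::size_t unusedBytes() {
    return sizeof(std::bitset<N>) - minimalBytes(N);
}
```

On an implementation where `sizeof(std::bitset<8>)` is 4, as in the question's output, `unusedBytes<8>()` is 3, and an array of eight such bitsets takes 8 × 4 = 32 bytes, which matches the reported `sizeof(Register)`.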

In order to eliminate these unused bytes, you'll have to use something else.
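
One possible "something else", sketched here as an assumption rather than as what this answer prescribes, is to keep the register as a fixed-width integer and copy its bytes out with `std::memcpy`, which avoids reading an inactive union member. The names `ByteArray`, `toBytes`, and `fromBytes` are hypothetical. The order of the bytes in the array follows the machine's endianness.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

static_assert(sizeof(std::uint64_t) == 8, "sketch assumes 8-bit bytes");

// Eight single-byte elements with no per-element padding.
using ByteArray = std::array<std::uint8_t, 8>;

// Copy the object representation of a 64-bit value into eight bytes.
// Byte order in the result is the platform's native order.
inline ByteArray toBytes(std::uint64_t value) {
    ByteArray bytes{};
    std::memcpy(bytes.data(), &value, bytes.size());
    return bytes;
}

// Rebuild the 64-bit value from bytes produced by toBytes.
inline std::uint64_t fromBytes(const ByteArray& bytes) {
    std::uint64_t value = 0;
    std::memcpy(&value, bytes.data(), bytes.size());
    return value;
}
```

Because the copy goes through `std::memcpy`, `fromBytes(toBytes(v))` returns `v` on any platform, while the position of a given byte in the array differs between little-endian and big-endian machines.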

1201ProgramAlarm
  • That's good to know, and thank you for your response. I could have created a basic structure that contains a std::bitset<8> member and called it a byte, then built another struct with 2 bytes and called that a word, and so on... but accessing the members would involve many levels of indirection that I'm trying to avoid. I'd like to be able to access any of them from the same instance of a struct. – Francis Cugler May 07 '19 at 20:23