
I'm stuck on what looks like wrong behavior of #pragma pack(1): when I define a 6-bit field, it gets treated as 8 bits. I read this question hoping to solve my problem, but it didn't help at all.

In Visual Studio 2012 I defined the struct below for saving Base64 characters:

#pragma pack(1)
struct BASE64 {
    CHAR    cChar1 : 6;
    CHAR    cChar2 : 6;
    CHAR    cChar3 : 6;
    CHAR    cChar4 : 6;
};

Now I checked its size with sizeof, but the result isn't what I expected:

printf("%d", sizeof(BASE64));      // should print 3

Result : 4

I expected to get 3 (because 6 * 4 = 24, and 24 bits is 3 bytes).

Even when I tested it with 2-bit fields instead, I got the expected size (1 byte):

#pragma pack(1)
struct BASE64 {
    CHAR    cChar1 : 2;
    CHAR    cChar2 : 2;
    CHAR    cChar3 : 2;
    CHAR    cChar4 : 2;
};

So why does a 6-bit field get treated as 8 bits even with #pragma pack(1)?

  • There is no "right behavior" when it comes to bit-fields, since they are so poorly specified by the standard. And packing on 1 byte alignment would mean 8 bits, yeah? There's no #pragma pack(0.6). In addition, asking about conforming behavior while using VS is nonsense. – Lundin Aug 21 '18 at 07:01
  • I suppose that the compiler doesn't want to bother with all this bit shifting and stuff. If you want bit packing use Ada. – Jean-François Fabre Aug 21 '18 at 07:02
  • Layout of 3 bytes would require placing of `cChar2` (and `cChar3`) on 2 bytes, where they would require more complex operations (2 byte access + shifting and masking) to access than `cChar1` (or `cChar4`) (1 byte access + simpler shifting and masking). I suspect that MS simply didn't want to implement more former type and aligns to next byte boundary instead. – user694733 Aug 21 '18 at 07:11
  • @user694733 I have defined `structs` like this before in `Qt` without any problem, and I expected the same behavior in `visual studio`. – BattleTested_закалённый в бою Aug 21 '18 at 07:16
  • Qt is not a compiler, qtcreator does bundle a version of gcc on some platforms, so what you're observing is a difference between some version of gcc and msvc 2012 – PeterT Aug 21 '18 at 07:44
  • @Lundin: Visual Studio 2017 15.7 is a [fully conforming C++17 compiler](https://blogs.msdn.microsoft.com/vcblog/2018/05/07/announcing-msvc-conforms-to-the-c-standard/). – IInspectable Aug 21 '18 at 07:44
  • @PeterT I know `Qt` is not a compiler; I meant that `gcc` can achieve this but `msvc` cannot. – BattleTested_закалённый в бою Aug 21 '18 at 07:48
  • @IInspectable is it possible in VS 2017? – BattleTested_закалённый в бою Aug 21 '18 at 07:49
  • No, [VS 2017 still returns 4](https://godbolt.org/z/KQyy4k). I think it's unlikely they'll change it, since it's standard conformant behavior as multiple people have argued already. – PeterT Aug 21 '18 at 07:55
  • That's not very likely. Microsoft aren't known to change implementations their customers (unduly) rely on. Especially when their implementation is fully conforming. You are going to have to create a workaround, if you need cross-platform guarantees on features, that are inherently implementation defined. – IInspectable Aug 21 '18 at 07:57
  • @IInspectable Good for them! This question is tagged C and VS2017 is, as far as I know, not yet a conforming implementation of ISO 9899:1990 released 28 years ago. Notably, the C standard has changed 3 times since then (4 times if counting C95 as a major update). – Lundin Aug 21 '18 at 09:03
  • @Lundin: The relevant rules that apply in context of this question are incorporated into C++ from the C standard. In context of this question, a conforming C++ compiler is also conforming to C (compare [bit field (C++)](https://en.cppreference.com/w/cpp/language/bit_field) and [bit field (C)](https://en.cppreference.com/w/c/language/bit_field)). Visual Studio does not fully conform to the C standard. Still, using it as a reference for a language feature that is fully conforming certainly isn't nonsense, as your blanket statement suggests. – IInspectable Aug 21 '18 at 10:10

3 Answers


#pragma pack generally packs on byte boundaries, not bit boundaries. It's to prevent the insertion of padding bytes between fields that you want to keep compressed. From Microsoft's documentation (since you provided the winapi tag, and with my emphasis):

n (optional) : Specifies the value, in bytes, to be used for packing.

How an implementation treats bit fields when you try to get them to cross a byte boundary is implementation-defined. From the C11 standard (section 6.7.2.1 Structure and union specifiers /11, again my emphasis):

An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.

More of the MS documentation calls out this specific behaviour:

Adjacent bit fields are packed into the same 1-, 2-, or 4-byte allocation unit if the integral types are the same size and if the next bit field fits into the current allocation unit without crossing the boundary imposed by the common alignment requirements of the bit fields.

paxdiablo

In some implementations, bit fields cannot span across storage unit boundaries. You can define multiple bit fields within one storage unit only if their total number of bits fits within the data type of that unit.

In your first example, there are not enough available bits in a CHAR to hold both cChar1 and cChar2 when they are 6 bits each, so cChar2 has to go into the next CHAR in memory. The same applies to cChar3 and cChar4. That is why the total size of BASE64 is 4 bytes, not 3:

  (6 bits + 2 bits padding) = 8 bits
+ (6 bits + 2 bits padding) = 8 bits
+ (6 bits + 2 bits padding) = 8 bits
+ 6 bits
- - - - - - - - - - 
= 30 bits
= needs 4 bytes

In your second example, there are enough available bits in a CHAR to hold all of cChar1...cChar4 when they are 2 bits each. That is why the total size of BASE64 is 1 byte, not 4 bytes:

  2 bits
+ 2 bits
+ 2 bits
+ 2 bits
- - - - - - - - - - 
= 8 bits
= needs 1 byte
Remy Lebeau
  • Actually, the standard says that whether or not they can cross boundaries is up to the implementation. You may want to clarify that. – paxdiablo Aug 21 '18 at 07:23
  • @paxdiablo perhaps so, but it is clearly the behavior of the implementation of the OP's compiler. – Remy Lebeau Aug 21 '18 at 07:27

The simple answer is: this is NOT wrong behavior.

Packing tries to place separate chunks of data into as few bytes as possible, but it can't pack two 6-bit chunks into one 8-bit byte. So the compiler puts them in separate bytes, probably because accessing a single byte to retrieve or store your 6-bit data is easier than accessing two consecutive bytes and handling some trailing part of one byte and some leading part of another.

This is implementation-defined, and you can do little about it. Possibly there is an optimizer option to prefer size over speed, and maybe you could use it to achieve what you expected, but I doubt the optimizer would go that far. Anyway, size optimization usually shrinks the code, not the data (as far as I know; I am not an expert and may well be wrong here).

CiaPan