
Playing with the code presented in this question, I observed an increase in the size of a struct when an 8-bit-wide enum bit-field is used instead of a uint8_t member.

Please see these code examples:

Code Option A

typedef enum { A, B, C, MAX = 0xFF } my_enum;

struct my_compact_struct_option_a
{
    my_enum field1 : 8; // limiting enum size to 8 bits 
    uint8_t second_element[6];
    uint16_t third_element;
};

The offset of the second member of this struct, second_element, is 1. This indicates that the enum field1 is limited to the size of a uint8_t. However, the size of the struct is 12 bytes. That's unexpected for me.

Compare this to

Code Option B

typedef uint8_t my_type;

struct my_compact_struct_option_b
{
    my_type field1; 
    uint8_t second_element[6];
    uint16_t third_element;
};

Here, the offset of second_element is also 1; however, the size of this struct is 10 bytes. That's expected.

Why is the struct in the first code example increased to 12 bytes?

You can always try this code for yourself.
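A minimal sketch of such a test, assuming a C11 compiler (for `_Alignof` and `%zu`). The helper name `print_layouts` is made up for illustration; the two structs are the ones from the question. The numbers printed for Option A depend on the compiler and ABI, so none are claimed here.

```c
#include <stddef.h>  /* offsetof, size_t */
#include <stdint.h>  /* uint8_t, uint16_t */
#include <stdio.h>   /* printf */

typedef enum { A, B, C, MAX = 0xFF } my_enum;

struct my_compact_struct_option_a
{
    my_enum field1 : 8; /* enum bit-field limited to 8 bits */
    uint8_t second_element[6];
    uint16_t third_element;
};

typedef uint8_t my_type;

struct my_compact_struct_option_b
{
    my_type field1;
    uint8_t second_element[6];
    uint16_t third_element;
};

/* Hypothetical helper (not part of the original post):
   prints offset of second_element, size, and alignment of both structs. */
void print_layouts(void)
{
    printf("option A: offset=%zu size=%zu align=%zu\n",
           offsetof(struct my_compact_struct_option_a, second_element),
           sizeof(struct my_compact_struct_option_a),
           (size_t)_Alignof(struct my_compact_struct_option_a));
    printf("option B: offset=%zu size=%zu align=%zu\n",
           offsetof(struct my_compact_struct_option_b, second_element),
           sizeof(struct my_compact_struct_option_b),
           (size_t)_Alignof(struct my_compact_struct_option_b));
}
```

Calling `print_layouts()` from `main` shows the alignment next to the size, which is the value that differs between the two options.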

Daniel K.
  • This is not a duplicate, so I am voting to reopen. The marked duplicate is about how a structure’s size is determined as a function of the sizes and alignment requirements of its members. However, the crux of this question is not about how the sizes and alignment requirements affect the structure’s size but why one member has an alignment requirement of four bytes but a size of one byte. – Eric Postpischil Jul 20 '22 at 09:44
  • This is more simply reproduced using `struct foo { int a : 8; char b; };`, after which `offsetof(struct foo, b)` is 1, `sizeof (struct foo)` is 4, and `_Alignof (struct foo)` is 4, using Apple Clang 11 with default options on macOS 10.14.6. Although the compiler is using only one byte for the bit-field, it adopts the alignment of its underlying type for the structure. But it does not require that alignment for the bit-field; inserting a `char x;` before the bit-field leaves the structure size at four bytes and moves the `b` field to offset 2, meaning `a` is at 1. What is the reason for this? – Eric Postpischil Jul 20 '22 at 09:47
  • [On Compiler Explorer.](https://godbolt.org/z/TsPz8E5Yo) – Eric Postpischil Jul 20 '22 at 09:50
  • Note that, for your code and that posted on Compiler Explorer by @Eric, MSVC gives the offset of the second member as `4` bytes. That's at least self-consistent ... looks like most other compilers are doing something *very* strange, here. – Adrian Mole Jul 20 '22 at 12:04

2 Answers


As mentioned in the other answer, the C standard states that an implementation may use any storage unit large enough to hold the bitfield. Since by default an enum is effectively an int, most compilers will use an int sized storage unit for the bitfield. In particular, both gcc and MSVC will create a 4 byte enum and a 12 byte struct.

In the case specified in the comments:

 struct foo { int a : 8; char b; };

gcc and clang give it a size of 4, while MSVC gives it a size of 8.

So what appears to be happening is that a resides in an int-sized storage unit, since int is the declared type of the bit-field. The alignment of the struct is then 4 because that is the strictest alignment among its members, specifically that of the int-sized unit that holds the bit-field.

Where gcc and clang seem to differ from MSVC is that gcc and clang will allow non-bitfields to occupy the same storage unit as bitfields if there is sufficient space to do so, while MSVC keeps bitfields in their own storage units.
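One way to observe this sharing is a small sketch like the following, assuming a C11 compiler. The struct `foo_prefix` and the helper `print_foo_layouts` are names made up for illustration; `foo_prefix` is the variant from the comments with a `char x;` placed before the bit-field. The results are implementation-specific.

```c
#include <stddef.h>  /* offsetof, size_t */
#include <stdio.h>   /* printf */

/* The struct from the comments. */
struct foo { int a : 8; char b; };

/* Hypothetical variant: a char placed before the bit-field. */
struct foo_prefix { char x; int a : 8; char b; };

/* Hypothetical helper: prints where b lands and how large each struct is. */
void print_foo_layouts(void)
{
    printf("foo:        offsetof(b)=%zu size=%zu align=%zu\n",
           offsetof(struct foo, b),
           sizeof(struct foo),
           (size_t)_Alignof(struct foo));
    printf("foo_prefix: offsetof(b)=%zu size=%zu align=%zu\n",
           offsetof(struct foo_prefix, b),
           sizeof(struct foo_prefix),
           (size_t)_Alignof(struct foo_prefix));
}
```

If `b` is reported at an offset smaller than `sizeof(int)`, the non-bit-field member has been placed inside the int-sized unit that also holds `a`.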


If you want to make the enum type smaller, there are implementation specific ways of doing this.

In gcc, you can either use the packed attribute:

typedef enum __attribute__((__packed__)) { A, B, C, MAX = 0xFF } my_enum;

Or you can pass the -fshort-enums flag to shrink the size of all enums. Both will cause my_enum to be 1 byte in size and struct my_compact_struct_option_a will be 10 bytes.

clang lets you specify the underlying type of the enum with the following syntax (unsigned char is used so that MAX = 0xFF is representable):

typedef enum : unsigned char { A, B, C, MAX = 0xFF } my_enum;
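As a rough check of the packed-attribute variant, here is a sketch that assumes gcc or clang (detected via `__GNUC__`); on other compilers it falls back to the plain enum, so the sizes will differ. The helper name `print_packed_enum_layout` is hypothetical.

```c
#include <stddef.h>  /* size_t */
#include <stdint.h>  /* uint8_t, uint16_t */
#include <stdio.h>   /* printf */

#if defined(__GNUC__)
/* gcc/clang extension: use the smallest integer type that holds all enumerators. */
typedef enum __attribute__((__packed__)) { A, B, C, MAX = 0xFF } my_enum;
#else
/* Fallback for other compilers: plain enum, size is implementation-defined. */
typedef enum { A, B, C, MAX = 0xFF } my_enum;
#endif

struct my_compact_struct_option_a
{
    my_enum field1 : 8;
    uint8_t second_element[6];
    uint16_t third_element;
};

/* Hypothetical helper: prints the enum size and the resulting struct layout. */
void print_packed_enum_layout(void)
{
    printf("sizeof(my_enum)=%zu struct size=%zu struct align=%zu\n",
           sizeof(my_enum),
           sizeof(struct my_compact_struct_option_a),
           (size_t)_Alignof(struct my_compact_struct_option_a));
}
```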
dbush
  • This is not the answer. As shown in the code examples in the question and the comments, Clang is using one byte for the container of the bit-field and uses a one-byte alignment requirement for it. Yet it gives the structure a four-byte alignment requirement even though no member in it, including the bit-field container, has a four-byte alignment requirement. – Eric Postpischil Jul 25 '22 at 17:05
  • @EricPostpischil It looks like the bitfield is still in a 4 byte wide storage unit, but is sharing that storage unit with a non-bitfield. Edited to add more detail. – dbush Jul 25 '22 at 21:02
  • That is an interesting hypothesis. Is there a way we can test it? In other words, is there some different behavior that would occur for “the compiler lets other members share the bit-field container” than “the bit-field container is a single byte but the alignment of the bit-field’s underlying type is applied to the structure even though it is not applied to the container”? (Phrased that way, Occam’s Razor favors the first.) Or would we have to dig into the compiler source code for clues? – Eric Postpischil Jul 25 '22 at 22:25
  • Clang says `sizeof (struct { int a : 8, : 0, b : 3; })` is 8, indicating that going to another storage unit (due to `: 0`) uses four bytes, favoring this hypothesis that the storage unit is in fact four bytes even though the bit-field takes up only one byte. – Eric Postpischil Jul 25 '22 at 22:33
  • `sizeof (struct { int a : 8; char : 0; int b : 3; })` is 4, consistent with `: 0` advancing to a `char` storage unit even though it is inside the `int` storage unit of `a`. – Eric Postpischil Jul 25 '22 at 22:42
  • `sizeof (struct foo { int a : 8; short : 0; char b : 8; short c : 8; })` is 4. `offsetof` will not calculate the offset of bit-fields, but we can figure them out this way: Put `'a'` in `a`, `'b'` in `b`, and `'c'` in `c` and then print the bytes of the memory of such a structure (with “?” for unprintable characters). The output is “a?bc”. So `b` is at offset 2, where we would expect it to be after `short : 0`. And `c` is at 3. Its two-byte storage unit must have been started at offset 2, so the `b` member is inside the `c` storage unit which is inside the `a` storage unit. Wild. – Eric Postpischil Jul 25 '22 at 22:45
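The byte-dump experiment from the last comment can be sketched roughly as follows. This is an illustration under assumptions, not the commenter's actual code: the helper name `dump_foo_bytes` and its buffer interface are made up, and the result depends on the compiler and ABI (the comment reports “a?bc” for Clang).

```c
#include <ctype.h>   /* isprint */
#include <stddef.h>  /* size_t */
#include <string.h>  /* memset, memcpy */

/* The struct from the comment; char and short bit-fields are an
   implementation-defined extension accepted by gcc and clang. */
struct foo { int a : 8; short : 0; char b : 8; short c : 8; };

/* Hypothetical helper: stores 'a', 'b', 'c' in the bit-fields of a zeroed
   struct foo, then writes one character per byte of the struct into out,
   using '?' for unprintable bytes. Returns the number of characters written
   (excluding the terminating NUL). */
size_t dump_foo_bytes(char *out, size_t out_size)
{
    struct foo f;
    unsigned char bytes[sizeof f];
    size_t i, n;

    if (out_size == 0)
        return 0;

    memset(&f, 0, sizeof f);      /* make padding bytes deterministic */
    f.a = 'a';
    f.b = 'b';
    f.c = 'c';
    memcpy(bytes, &f, sizeof f);  /* inspect the object representation */

    n = sizeof f < out_size - 1 ? sizeof f : out_size - 1;
    for (i = 0; i < n; i++)
        out[i] = isprint(bytes[i]) ? (char)bytes[i] : '?';
    out[n] = '\0';
    return n;
}
```

Passing a buffer of at least `sizeof(struct foo) + 1` bytes and printing it with `puts` shows at which byte each bit-field landed.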

The C Standard explicitly leaves the alignment of a bit-field's storage unit unspecified, so each implementation is free to choose it. From this Draft C17 Standard (bold emphasis mine):

6.7.2.1 Structure and union specifiers


11      An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.

Thus, while that paragraph dictates that (say) a further added my_enum field2 : 8; should be 'packed' into the same storage unit as field1 (assuming there remains sufficient space),[2] it allows freedom for compilers to decide what they consider the best alignment requirement (and size) for that storage unit.

It would appear that most compilers (from those available on Compiler Explorer) choose to impose the alignment requirement of the 'base' type on bit-fields.[1] Further, as an enum type in C is compatible with an implementation-defined integer type (in practice usually int or unsigned int), that would mean that your field1 (and, consequently, your my_compact_struct_option_a) has the alignment requirement of an int, which is likely to be 4 bytes on many/most systems.


[1] This makes sense, when you think about how access to the bit-field would be realized: If it is to be 'used' as an int, then it needs to be accessed as an int.

[2] For example, adding further bit-field members to your structure, immediately after field1, doesn't increase the overall size of the structure until the total size of those added bit-fields exceeds 24 bits.
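The packing described in the second footnote can be checked with a sketch along these lines, assuming a C11 compiler on which the enum is 4 bytes wide. The struct names `one_field`, `four_fields`, `five_fields` and the helper `print_growth` are made up for illustration; the members mirror those of my_compact_struct_option_a.

```c
#include <stdint.h>  /* uint8_t, uint16_t */
#include <stdio.h>   /* printf */

typedef enum { A, B, C, MAX = 0xFF } my_enum;

/* Same layout as the struct in the question. */
struct one_field
{
    my_enum field1 : 8;
    uint8_t second_element[6];
    uint16_t third_element;
};

/* 24 further bits of bit-fields: 32 bits in total. */
struct four_fields
{
    my_enum field1 : 8;
    my_enum field2 : 8;
    my_enum field3 : 8;
    my_enum field4 : 8;
    uint8_t second_element[6];
    uint16_t third_element;
};

/* 32 further bits of bit-fields: 40 bits in total. */
struct five_fields
{
    my_enum field1 : 8;
    my_enum field2 : 8;
    my_enum field3 : 8;
    my_enum field4 : 8;
    my_enum field5 : 8;
    uint8_t second_element[6];
    uint16_t third_element;
};

/* Hypothetical helper: prints the three sizes for comparison. */
void print_growth(void)
{
    printf("one_field=%zu four_fields=%zu five_fields=%zu\n",
           sizeof(struct one_field),
           sizeof(struct four_fields),
           sizeof(struct five_fields));
}
```

On a compiler that behaves as described, the first two sizes should be equal and only the third should be larger.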

Adrian Mole
  • The problem here is not the alignment of the storage unit holding the bit-field. As my comments note, if we insert `char x;` before the bit-field, the bit-field moves to offset 1, proving its alignment requirement is merely one byte. Yet the structure appears to have an alignment requirement of four bytes. The compiler adopted the alignment requirement of the underlying type (`int` or `my_enum`) for the entire structure but not for the bit-field’s storage unit. – Eric Postpischil Jul 20 '22 at 11:09
  • You might add that the structure alignment may possibly be reduced by qualifying the `field1` member as `alignas(char)`. – chqrlie Jul 20 '22 at 11:28
  • @chqrlie I get an IDE warning, *attribute "_Alignas" does not apply to bit-fields* (but no compiler warning or error). – Adrian Mole Jul 20 '22 at 11:54