
I have the following layout of memory (pseudo code):

struct {
    union {
        fieldA : 45;
        struct {
            fieldB1  :  12;
            fieldB2  :  33;
        }
    }
    fieldC : 19;
}

That is, the memory of fieldA can sometimes be used for other purposes (fieldB1 and fieldB2). I want this struct to be as packed as possible, i.e. 64 bits in size.

It seems that no matter what I do (packed attributes, for example), the union is always padded with 3 bits to reach 48 bits (6 bytes) before fieldC (which is of course also padded).

RBH
  • I'm not sure it's going to be possible, because the union is a separate entity that will be padded at the end. You will likely have the same problem if using a nested structure. – Some programmer dude Jul 19 '23 at 08:44
  • This very likely depends upon which compiler you're using. The C language itself doesn't mandate a particular arrangement of fields here. – Toby Speight Jul 19 '23 at 08:52
  • The language doesn't say if `fieldB1` is to be the first or last 12 bits of the underlying type. And it differs between compilers. – BoP Jul 19 '23 at 10:18

4 Answers


How to construct a C struct/union with a very odd arrangement of bitfields?

64bit in size.

Use a typedef or a struct with a single uint64_t member. Write getters and setters for every field using bitwise operations.
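A minimal sketch of what such getters and setters might look like. The type name `packed_t`, the helper names, and the bit positions (fieldA in the low 45 bits, fieldC in the high 19 bits) are assumptions made for illustration; the question does not fix a bit order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout (an assumption, not given in the question):
 *   bits  0..44  fieldA  (aliased by fieldB1 in bits 0..11
 *                         and fieldB2 in bits 12..44)
 *   bits 45..63  fieldC
 */
typedef uint64_t packed_t;

static_assert(sizeof(packed_t) == 8, "packed_t must be exactly 64 bits");

/* Mask with the low `width` bits set; valid for 1 <= width <= 63. */
static inline uint64_t low_mask(unsigned width)
{
    return (UINT64_C(1) << width) - 1;
}

/* Extract `width` bits starting at bit `shift`. */
static inline uint64_t get_bits(packed_t p, unsigned shift, unsigned width)
{
    return (p >> shift) & low_mask(width);
}

/* Return a copy of `p` with the selected bits replaced by `value`;
 * bits of `value` that do not fit in the field are discarded. */
static inline packed_t set_bits(packed_t p, unsigned shift, unsigned width,
                                uint64_t value)
{
    uint64_t mask = low_mask(width) << shift;
    return (p & ~mask) | ((value << shift) & mask);
}

static inline uint64_t get_fieldA(packed_t p)  { return get_bits(p,  0, 45); }
static inline uint64_t get_fieldB1(packed_t p) { return get_bits(p,  0, 12); }
static inline uint64_t get_fieldB2(packed_t p) { return get_bits(p, 12, 33); }
static inline uint64_t get_fieldC(packed_t p)  { return get_bits(p, 45, 19); }

static inline packed_t set_fieldA(packed_t p, uint64_t v)  { return set_bits(p,  0, 45, v); }
static inline packed_t set_fieldB1(packed_t p, uint64_t v) { return set_bits(p,  0, 12, v); }
static inline packed_t set_fieldB2(packed_t p, uint64_t v) { return set_bits(p, 12, 33, v); }
static inline packed_t set_fieldC(packed_t p, uint64_t v)  { return set_bits(p, 45, 19, v); }
```

Because every field's position appears only in its own accessor pair, a later layout change touches only those lines, and the comment at the top documents the layout for readers.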

KamilCuk
  • That will work. But it is not readable. I would have to add a comment explaining about the layout. I will have to think this through. – RBH Jul 19 '23 at 08:57
  • But it is reliable, especially in case of volatile ( https://lwn.net/Articles/478657/ ). People are using bitmasks and #define and enums with hardware registers since forever. – KamilCuk Jul 19 '23 at 09:02

As a possible workaround, you can duplicate fieldC and make a union of two structures, each of which covers all 64 bits:

struct S {
    union {
        struct {
            uint64_t fieldA : 45;
            uint64_t fieldC : 19;
        } a;
        struct {
            uint64_t fieldB1 : 12;
            uint64_t fieldB2 : 33;
            uint64_t fieldC : 19;
        } b;
    };
};
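A quick way to check whether a given compiler really packs this into 64 bits is a compile-time assertion. The sketch below repeats the struct and adds a hypothetical helper, `fieldC_via_b_after_write_via_a`, that writes fieldC through one view and reads it back through the other. Both `uint64_t` as a bit-field type and the resulting layout are implementation-defined, so treat the assertion as a check to run on your toolchain rather than a guarantee.

```c
#include <assert.h>
#include <stdint.h>

struct S {
    union {
        struct {
            uint64_t fieldA : 45;
            uint64_t fieldC : 19;
        } a;
        struct {
            uint64_t fieldB1 : 12;
            uint64_t fieldB2 : 33;
            uint64_t fieldC : 19;
        } b;
    };
};

/* Fails to compile if the implementation does not pack both views
 * into a single 64-bit storage unit. */
static_assert(sizeof(struct S) == sizeof(uint64_t),
              "struct S is not packed into 64 bits on this implementation");

/* Hypothetical helper: write fieldC through view `a`, read it through
 * view `b`. The two agree only if both views place fieldC at the same
 * bit position, which is expected here because 45 == 12 + 33. */
static uint64_t fieldC_via_b_after_write_via_a(uint64_t value)
{
    struct S s = {0};
    s.a.fieldC = value;
    return s.b.fieldC;
}
```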
Some programmer dude
  • I thought about it. But I thought that if field C would change in the future (I need to change the layout from time to time) then I will need to change it twice. I can typedef field C, but it's quite ugly to read and maintain. – RBH Jul 19 '23 at 08:56
  • 45 is not in the range of unsigned... – Could achieve it [more elegantly](https://godbolt.org/z/sTPTGvbYn) with C11 anonymous structs, though I'm not sure if fully legal – is it? – Aconcagua Jul 19 '23 at 09:02
  • @Aconcagua A bit-field can be declared as a qualified or unqualified `bool`, `signed int` or `unsigned int` or other implementation-defined types, so whether the above code is legal is implementation-defined. C23 will also allow bit-fields to be declared with signed and unsigned *bit-precise* integer types but those are different types to the *exact-width* types of the same width, i.e. the *bit-precise* type `unsigned _BitInt(64)` is the same width as `uint64_t` but a distinct type. `unsigned _BitInt(64)` is a C23 standard-supported bit-field type, `uint64_t` may be implementation-supported. – Ian Abbott Jul 19 '23 at 15:09
  • @IanAbbott Well, so no difference in legality comparing the version in the answer and in my comment then ;) I was worried about crossing struct borders when setting a bit-field member in one struct and the other one in the other struct ;) – Aconcagua Jul 19 '23 at 15:22
  • @Aconcagua There should be nothing to worry about as long as both structs end up containing a single addressable storage unit (at least 64 bits wide) containing all the bit-field members. Of course, the actual packing order of the bit-fields within that single storage unit would still be implementation-defined! – Ian Abbott Jul 19 '23 at 16:35

Generally speaking, do not use bitfields. Certainly not in code that you want to be portable. Their semantics are much more loosely defined than the uninitiated tend to assume, and they can be surprising in multiple ways. Aspects that are important for some potential uses are unspecified or implementation-defined and do vary between implementations.

You can rely, however, on every object other than a bitfield to have a representation comprising a contiguous sequence of one or more (complete) bytes (C23 6.2.6.1/2).* Supposing, then, that your system's bytes are 8 bits wide, you cannot have a struct or union whose representation comprises exactly 45 bits, as 45 is not a multiple of 8. In your case, the inner struct will have a size of at least 6 bytes, so the union containing it must also be at least that large. The outer struct contains an additional member requiring 19 bits, which needs at least 3 more bytes (4 if the implementation allocates a 32-bit storage unit for it), so the overall struct must be at least 6 + 3 = 9 bytes in size, and commonly 10 or more.

My first recommendation would be my lead: don't use bitfields. For example,

struct foo {
    union {
        uint64_t fieldA;
        struct {
            uint16_t fieldB1  :  12;
            uint64_t fieldB2  :  33;
        };
    };
    uint32_t fieldC;
};

Of course, that does not achieve your objective of packing it into 64 bits, but C does not define any way to ensure such packing for a structure or union. Also, that's already leaning on implementation-defined behavior with respect to which type specifiers bitfield members may have.

You could consider a union of structs, such as @someprogrammerdude recommended. But as long as 64 bits is enough, you could also consider an ordinary packed integer, perhaps with supporting macros or functions:

typedef uint64_t my_fields;

#define FIELDA(mf) ((mf) >> 19)
#define SET_FIELDA(mf, v) do { \
    my_fields *temp = &(mf); \
    *temp = (*temp & (~(uint64_t)0 >> 45)) | (((uint64_t) (v)) << 19); \
} while (0)
// ...

Since you expressed some concern that you might need to be flexible toward changes in the data structure, wrapping accesses with macros or functions and abstracting the data type provide a great deal of flexibility for doing so.
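One way to get that flexibility, sketched under assumptions: keep every field's shift and width in a single table and derive the masks from it. The names `FIELD_MASK`, `GET_FIELD`, `SET_FIELD` and the `*_SHIFT`/`*_WIDTH` constants are hypothetical. The layout follows the macros above (fieldA in the high 45 bits, fieldC in the low 19), and the placement of fieldB1 and fieldB2 inside fieldA is a guess.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t my_fields;

/* Hypothetical layout table: change a field here and every accessor follows. */
enum {
    FIELDC_SHIFT  = 0,  FIELDC_WIDTH  = 19,
    FIELDA_SHIFT  = 19, FIELDA_WIDTH  = 45,
    FIELDB1_SHIFT = 19, FIELDB1_WIDTH = 12,  /* assumed: low part of fieldA  */
    FIELDB2_SHIFT = 31, FIELDB2_WIDTH = 33   /* assumed: high part of fieldA */
};

/* Catch layout mistakes at compile time. */
static_assert(FIELDC_WIDTH + FIELDA_WIDTH == 64, "fields must fill 64 bits");
static_assert(FIELDB1_WIDTH + FIELDB2_WIDTH == FIELDA_WIDTH,
              "B1 and B2 must exactly overlay fieldA");

/* Mask covering the named field in place; widths must be in 1..63. */
#define FIELD_MASK(name) \
    ((~(uint64_t)0 >> (64 - name##_WIDTH)) << name##_SHIFT)

#define GET_FIELD(mf, name) (((mf) & FIELD_MASK(name)) >> name##_SHIFT)

#define SET_FIELD(mf, name, v) do { \
    my_fields *temp_ = &(mf); \
    *temp_ = (*temp_ & ~FIELD_MASK(name)) \
           | (((uint64_t)(v) << name##_SHIFT) & FIELD_MASK(name)); \
} while (0)
```

With this arrangement, `SET_FIELD(mf, FIELDC, 5)` and `GET_FIELD(mf, FIELDA)` keep working unchanged when a width or position is edited in the enum, and the static assertions flag a table that no longer adds up.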


* This appears to include C23 bit-precise integer types, so I guess those will include padding bits under many circumstances.

John Bollinger

Within the scope of the C programming language, structs and unions are actually not that well suited for creating precise memory layouts down to the bit level. Even with compiler-specific packing extensions/pragmas, you still might end up with elements getting padded and aligned to meet certain requirements of the implementation. The only reliable way of doing bit-level precise accesses in standard C is through (arrays of) primitive types and bit operations to mask out and extract/insert the desired values. And then there's also the whole endianness issue, which I'm going to lazily ignore here.

In your particular case, you could do it like this:

#include <stdint.h>

/* using a struct for encapsulation, to get some degree of static typing */
struct mybitfieldtype { uint64_t _; };

/* Defines mybitfieldtype_get_<name> and mybitfieldtype_set_<name> for a
   field that is W bits wide and starts at bit S (requires 1 <= W <= 63). */
#define MYBITFIELD_ELEMENT(name, W, S) \
static inline uint64_t \
    mybitfieldtype_get_##name( struct mybitfieldtype const v ) \
        { return (v._ >> (S)) & (((uint64_t)1 << (W)) - 1); } \
\
static inline struct mybitfieldtype \
    mybitfieldtype_set_##name( struct mybitfieldtype const v, uint64_t const x ) \
        { return (struct mybitfieldtype) \
            { (v._ & ~((((uint64_t)1 << (W)) - 1) << (S))) \
            | ((x & (((uint64_t)1 << (W)) - 1)) << (S)) }; }

/* fieldB1 and fieldB2 together overlay the 45 bits of fieldA */
MYBITFIELD_ELEMENT(fieldA,  45,  0)
MYBITFIELD_ELEMENT(fieldB1, 12,  0)
MYBITFIELD_ELEMENT(fieldB2, 33, 12)
MYBITFIELD_ELEMENT(fieldC,  19, 45)
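To show how the generated accessors are used, here is a self-contained sketch that repeats the macro from above (with extra parentheses) and adds a hypothetical helper, `make_b`, which builds a value from the B-view fields and fieldC. The setters return a new value rather than modifying their argument, so each result has to be assigned back.

```c
#include <assert.h>
#include <stdint.h>

struct mybitfieldtype { uint64_t _; };

/* Same accessor generator as above; W is the width, S the start bit. */
#define MYBITFIELD_ELEMENT(name, W, S) \
static inline uint64_t \
    mybitfieldtype_get_##name( struct mybitfieldtype const v ) \
        { return (v._ >> (S)) & (((uint64_t)1 << (W)) - 1); } \
\
static inline struct mybitfieldtype \
    mybitfieldtype_set_##name( struct mybitfieldtype const v, uint64_t const x ) \
        { return (struct mybitfieldtype) \
            { (v._ & ~((((uint64_t)1 << (W)) - 1) << (S))) \
            | ((x & (((uint64_t)1 << (W)) - 1)) << (S)) }; }

MYBITFIELD_ELEMENT(fieldA,  45,  0)
MYBITFIELD_ELEMENT(fieldB1, 12,  0)
MYBITFIELD_ELEMENT(fieldB2, 33, 12)
MYBITFIELD_ELEMENT(fieldC,  19, 45)

/* Hypothetical helper (not part of the answer): build a value through
 * the B view, leaving all other bits zero. */
static inline struct mybitfieldtype
make_b( uint64_t const b1, uint64_t const b2, uint64_t const c )
{
    struct mybitfieldtype v = { 0 };
    v = mybitfieldtype_set_fieldB1(v, b1);
    v = mybitfieldtype_set_fieldB2(v, b2);
    v = mybitfieldtype_set_fieldC(v, c);
    return v;
}
```

Reading `mybitfieldtype_get_fieldA` on a value built this way returns the B fields combined, since fieldA covers the same 45 bits, and overwriting fieldA leaves fieldC untouched.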
datenwolf
  • The disadvantage is that function declarations of the *public* interface are hidden behind macros! I'd bite the bullet and pre-declare the functions explicitly before actually implementing them via macro 'magic'. – Aconcagua Jul 19 '23 at 09:35
  • I personally prefer `(uint64_t)-1 << (W)` for most significant bits set and `~((uint64_t)-1 << (W))` for the least significant bits, slightly fewer operations required ;) – Aconcagua Jul 19 '23 at 09:51
  • This is great, but because navigating code generated by #define is a literal nightmare and debugging happens with a lot of cursing, I suggest writing out the function definitions by hand, and using macros inside the function body: `uint64_t type_get_fieldA(struct type v) { return GET_FIELD(v._, 45, 0); }`. – KamilCuk Jul 19 '23 at 10:54
  • @Aconcagua: Normally I'd agree on the bit masking, but in the past I observed some compilers to soil the bit level optimizations, if they see set MSB bits overflow (even on unsigned). So instead I bite the bullet and write it in a way that will not cause overflows. It *should* not make a difference, but sometimes it does. – datenwolf Jul 20 '23 at 07:07
  • @datenwolf Uh, oh, non-standard-compliant compilers :( Curious now, which one did so? – Aconcagua Jul 20 '23 at 13:04
  • @Aconcagua: Not violating the standard, just generating slow, or outright weird code. Some older, but not so old versions of GCC and Clang. Just search their issue trackers for the tag "missed optimization bit shift overflow". E.g. https://github.com/llvm/llvm-project/issues?q=label%3Amissed-optimization+bit+shift+overflow or this one here https://gcc.gnu.org/bugzilla/show_bug.cgi?id=23810 which fall along the same lines. – datenwolf Jul 20 '23 at 13:51
  • @datenwolf These masks are compile-time constants in the given case, so shouldn't be affected. Interesting still. In the end, though, I personally will remain with the simpler expressions even at runtime unless profiling proves these become a bottleneck ;) – Aconcagua Jul 20 '23 at 14:10