8

The endianness of bitfields is implementation-defined. Is there a way to check at compile time, via some macro or compiler flag, what gcc's bitfield endianness actually is?

In other words, given something like:

struct X {
    uint32_t a : 8;
    uint32_t b : 24;
};

Is there a way for me to know at compile time whether `a` is the first or last byte in `X`?

dbush
Barry
  • 1
    Why not simply convert to network endianness with `htonl` and back with `ntohl`? – immortal Dec 01 '17 at 19:56
  • That aside, Endianness is machine defined, not compiler defined... And could change between machines with the exact same binary code. What exactly are you trying to achieve? – immortal Dec 01 '17 at 19:59
  • 2
    https://gcc.gnu.org/onlinedocs/gcc/Structures-unions-enumerations-and-bit-fields-implementation.html says it's "*Determined by ABI.*" – melpomene Dec 01 '17 at 19:59
  • 2
    @melpomene That's great. How do I know *how* it was determined by the ABI? – Barry Dec 01 '17 at 20:00
  • 1
    Uh. You could compile and run a test program at configure time? Unless you're cross-compiling, of course. – melpomene Dec 01 '17 at 20:01
  • I believe C++20 includes endianness, though I'm not sure about situation with bitfields. – Incomputable Dec 01 '17 at 20:08
  • Why do you assume "first" and "last" are the only options? The natural ordering on a PDP/11 could put it in the third byte. – Martin Bonner supports Monica Dec 01 '17 at 20:27
  • @immortal: Endianness is defined by the C implementation (which includes the compiler, if one is used). C implementations are typically heavily influenced by hardware, but the final word is that of the implementation. E.g., a C implementation running in a little-endian emulated virtual system on big-endian hardware is little-endian. – Eric Postpischil Dec 01 '17 at 21:38
  • @EricPostpischil What on earth is a C implementation running in a little-endian emulated virtual system? Endianness is determined by how a `move.l` instruction reads or writes the value in memory, and by what `((char*)(&x))[0]` returns once the move is done. C has no control over it, and no sane compiler would add special code to deal with this. That's why you have functions like `htonl`. – immortal Dec 01 '17 at 22:47
  • @immortal: Nothing in the C standard requires a C implementation to implement anything with a `move.l` instruction. There are times when you are running on, say, an Intel processor, but you want to emulate the environment of another processor, perhaps because you are developing new software for the other processor but do not have hardware yet. The C implementation must provide the endianness it will ultimately have on the future hardware, but it has to run on the endianness of the current hardware. – Eric Postpischil Dec 02 '17 at 00:22
  • @EricPostpischil Are you talking about a cross-compiler? Because what you're saying makes no sense. A compiler generates and optimizes code for a target system. It will use that system's parameters, including endianness. It COULD generate code to enforce byte-wise operations on any-sized int, but it's ridiculously CPU intensive and I couldn't name a single compiler that would do such a thing. You could also run your code on a VM, but again, the emulated processor will decide the endianness of the system, not the compiler... – immortal Dec 02 '17 at 11:10
  • @immortal: No, not merely a cross-compiler. Yes, compilers can generate such code, and, yes, it may be CPU-intensive, but sometimes there is little or no choice when you need to accomplish a certain goal. Nonetheless, such implementations exist, and that demonstrates that the choice of endianness is ultimately up to the implementation, not the hardware. – Eric Postpischil Dec 02 '17 at 13:17
  • Discussing the endianness of bytes within words is not relevant to the OP's question. They want to check whether bit-fields are allocated starting at the most- or least-significant BIT within the BYTE. That decision is independent of the little/big/pdp/other-endianness of the underlying hardware. – kbro Sep 20 '21 at 13:51
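
The configure-time test program suggested in the comments could look roughly like the sketch below. It probes the bit order that kbro describes: whether the first-declared bit-field lands in the low-order or the high-order bits of its byte. `struct nibbles` and `first_bitfield_is_low_order` are hypothetical names chosen for illustration, and the result is only valid for the compiler and target the probe was built with.

```c
#include <string.h>

/* Two 4-bit fields sharing one byte, like ihl/version in struct iphdr. */
struct nibbles {
    unsigned int first  : 4;
    unsigned int second : 4;
};

/*
 * Hypothetical probe:
 *   returns  1 if the first-declared field occupies the low-order bits,
 *            0 if it occupies the high-order bits,
 *           -1 if the first byte shows neither pattern.
 */
static int first_bitfield_is_low_order(void)
{
    struct nibbles n;
    unsigned char byte0;

    memset(&n, 0, sizeof n);   /* clear both fields and any padding */
    n.first = 0xF;             /* set every bit of the first field only */
    memcpy(&byte0, &n, 1);     /* look at the first byte of the object */

    if (byte0 == 0x0F)
        return 1;
    if (byte0 == 0xF0)
        return 0;
    return -1;
}
```

A configure script would call this from a small `main`, run it, and define a project-specific macro from the result. As noted in the comments, that does not work when cross-compiling, because the probe has to run on the target.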

2 Answers

9

On Linux systems, you can check the `__BYTE_ORDER` macro (from `<endian.h>`) to see whether it is `__LITTLE_ENDIAN` or `__BIG_ENDIAN`. While this is not authoritative, in practice it should work.

A hint that this is the right way to do it is in the definition of struct iphdr in netinet/ip.h, which is for an IP header. The first byte contains two 4-bit fields which are implemented as bitfields, so the order is important:

struct iphdr
  {
#if __BYTE_ORDER == __LITTLE_ENDIAN
    unsigned int ihl:4;
    unsigned int version:4;
#elif __BYTE_ORDER == __BIG_ENDIAN
    unsigned int version:4;
    unsigned int ihl:4;
#else
# error "Please fix <bits/endian.h>"
#endif
    u_int8_t tos;
    u_int16_t tot_len;
    u_int16_t id;
    u_int16_t frag_off;
    u_int8_t ttl;
    u_int8_t protocol;
    u_int16_t check;
    u_int32_t saddr;
    u_int32_t daddr;
    /*The options start here. */
  };
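
Applied to the struct from the question, the same pattern might look like the sketch below. It is an illustration under stated assumptions, not an authoritative recipe: it uses the `__BYTE_ORDER__` macro that GCC and Clang predefine (instead of glibc's `__BYTE_ORDER`), it assumes GCC's behaviour of allocating bit-fields from the low-order bit on little-endian targets and from the high-order bit on big-endian ones, and `struct wire_x` and `wire_x_as_u32` are hypothetical names. The goal here is that `a` always ends up in the low 8 bits of the 32-bit value.

```c
#include <stdint.h>
#include <string.h>

#if !defined(__BYTE_ORDER__)
# error "__BYTE_ORDER__ is not predefined; this sketch assumes GCC or Clang"
#endif

struct wire_x {
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    uint32_t a : 8;     /* allocated first, from the low-order bit upwards */
    uint32_t b : 24;
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    uint32_t b : 24;    /* allocated first, from the high-order bit downwards */
    uint32_t a : 8;
#else
# error "unsupported byte order (for example PDP-endian)"
#endif
};

/* Fail the build if the two fields do not pack into a single 32-bit unit. */
_Static_assert(sizeof(struct wire_x) == sizeof(uint32_t),
               "struct wire_x must be exactly 32 bits");

/* Reinterpret the struct as the 32-bit value it occupies in memory. */
static uint32_t wire_x_as_u32(struct wire_x x)
{
    uint32_t v;
    memcpy(&v, &x, sizeof v);
    return v;
}
```

Under those assumptions, setting `a = 0x12` and `b = 0x345678` gives `0x34567812` from `wire_x_as_u32` on both little- and big-endian targets.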
dbush
  • 3
    It may be authoritative. See this [comment](https://gcc.gnu.org/ml/gcc/2004-09/msg00581.html) in the GCC mailing list: `Bit-fields are always assigned to the first available bit, possibly constrained by other factors, such as alignment. That means that they start at the low order bit for little-endian, and the high order bit for big-endian. This is the "right" way to do things. It is very unusual for a compiler to do this differently.` – user9041001 Dec 01 '17 at 21:22
  • It's only authoritative for GCC (okay, that was the OP's question). But beware, as other compiler-writers are free to do it their own way despite GCC's authors having the opinion `It is very unusual for a compiler to do this differently.` The copy of `netinet/ip.h` on my Linux system says "This file is part of the GNU C Library" so it's tightly bound to GCC. Your header files, on the other hand, ought to check `#ifdef __GNUC__` before assuming it's safe to use `__BYTE_ORDER` to control bit-field ordering. The `#error` is a cop-out too - what if `__BYTE_ORDER == __PDP_ENDIAN`? – kbro Sep 20 '21 at 13:36
1

It might be of some interest that when the bitfields are multiples of 8 bits wide, it appears that the endianness of the architecture does not matter.

See here [godbolt.org]

I chose the arm architecture in this godbolt example because that supports both big and little endian, and it is easy to compare the differences.

Note that whether the architecture is big- or little-endian, in both cases the 8-bit field is at the start of the struct.

I tested all of the compilers on godbolt that could generate readable assembly code for the is_8bit_tag_at_start function, and they all appeared to return true.
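
The linked example is not reproduced on this page, so the following is only a guess at what a function with that name could look like. `struct tagged`, its field names, and the `0xAB` test value are assumptions made for illustration. It stores a known value in the 8-bit field and reads back the first byte of the object through a union.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same shape as struct X in the question: an 8-bit field, then 24 bits. */
struct tagged {
    uint32_t tag     : 8;
    uint32_t payload : 24;
};

/* Returns true if the 8-bit field occupies the first byte of the struct. */
static bool is_8bit_tag_at_start(void)
{
    union {
        struct tagged t;
        uint8_t bytes[sizeof(struct tagged)];
    } u = { .t = { .tag = 0xAB, .payload = 0 } };

    /* Reading the other union member reinterprets the stored bytes (valid C). */
    return u.bytes[0] == 0xAB;
}
```

Because every input is a constant, an optimizing compiler can usually fold the whole function into returning a constant, which makes the result easy to read in the generated assembly.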

markt1964