
This question is related to "Why bit endianness is an issue in bitfields?" and "__LITTLE_ENDIAN_BITFIELD and __BIG_ENDIAN_BITFIELD?", but it asks something different: why does Linux name the macros __LITTLE_ENDIAN_BITFIELD/__BIG_ENDIAN_BITFIELD and define them in byteorder.h?

Why not something like __LEAST_TO_MOST_BITFIELD/__MOST_TO_LEAST_BITFIELD, defined in something like bitfield.h?

Linux

include/uapi/linux/ip.h

struct iphdr {
#if defined(__LITTLE_ENDIAN_BITFIELD)
    __u8    ihl:4,
        version:4;
#elif defined (__BIG_ENDIAN_BITFIELD)
    __u8    version:4,
        ihl:4;
#else
#error  "Please fix <asm/byteorder.h>"
#endif
    __u8    tos;
    __be16  tot_len;
    __be16  id;
    __be16  frag_off;
    __u8    ttl;
    __u8    protocol;
    __sum16 check;
    __struct_group(/* no tag */, addrs, /* no attrs */,
        __be32  saddr;
        __be32  daddr;
    );
    /*The options start here. */
};

include/uapi/linux/byteorder/little_endian.h

#ifndef __LITTLE_ENDIAN
#define __LITTLE_ENDIAN 1234
#endif
#ifndef __LITTLE_ENDIAN_BITFIELD
#define __LITTLE_ENDIAN_BITFIELD
#endif

Bitfields Allocation Order

"Endian" (i.e. byte order) should have nothing to do with bit-field allocation order.

C17 Standard

Similar draft: N2310

6.7.2.1 Structure and union specifiers

[...]

Semantics

[...]

An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.
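Because the standard leaves the allocation order implementation-defined, it can only be observed, not assumed. Below is a minimal sketch of a runtime probe that checks bit-field allocation order and byte order separately. It assumes an 8-bit `unsigned char` and a compiler that accepts `unsigned char` bit-fields (a common extension, which `__u8 ihl:4` above also relies on). The names `nibbles`, `first_bitfield_is_low_bits`, and `is_little_endian_byte_order` are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Two 4-bit fields packed into one byte, like ihl/version in struct iphdr. */
struct nibbles {
    unsigned char first : 4;   /* declared first  */
    unsigned char second : 4;  /* declared second */
};

/* Returns 1 if the first-declared bit-field occupies the low-order bits
 * of its storage unit (least-to-most allocation), 0 otherwise. */
static int first_bitfield_is_low_bits(void)
{
    struct nibbles n;
    unsigned char raw;

    memset(&n, 0, sizeof n);
    n.first = 0xF;             /* set only the first-declared field */
    memcpy(&raw, &n, 1);       /* read back the first byte of the struct */
    return raw == 0x0F;
}

/* Returns 1 if the least significant byte of a multi-byte integer is
 * stored at the lowest address (little-endian byte order). */
static int is_little_endian_byte_order(void)
{
    uint32_t word = 1;
    unsigned char first_byte;

    memcpy(&first_byte, &word, 1);
    return first_byte == 1;
}
```

The two functions answer different questions. The kernel headers quoted above treat the answers as always coinciding, but nothing in the C standard requires that.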

Linux Standard Base 5.0 Specification

Take big-endian S390 as an example. "Byte ordering" and "Bit-fields" are separate sections at the same level, even though the "Bit-fields" section requires that "Bit-fields are allocated from left to right (most to least significant)", which happens to match the byte order.

Bit Transmission Order

"Endian" (i.e. byte order) should have nothing to do with bit transmission order (whether the most or least significant bit is transmitted first), which should be handled by the bus controller and the NIC.

Expectation

There should be something like a __LEAST_TO_MOST_BITFIELD/__MOST_TO_LEAST_BITFIELD macro in something like bitfield.h, instead of the __LITTLE_ENDIAN_BITFIELD/__BIG_ENDIAN_BITFIELD macros in byteorder.h, because byte order and bit-field allocation order are independent.


Other References

RFC791

Although Appendix B states that "The order of transmission of the header and data described in this document is resolved to the octet level.", RFC791 still requires that Version = bytes[0] >> 4 and IHL = bytes[0] & 0x0f, because Appendix B also states:

Whenever an octet represents a numeric quantity the left most bit in the
diagram is the high order or most significant bit.  That is, the bit
labeled 0 is the most significant bit.  For example, the following
diagram represents the value 170 (decimal).


                            0 1 2 3 4 5 6 7
                           +-+-+-+-+-+-+-+-+
                           |1 0 1 0 1 0 1 0|
                           +-+-+-+-+-+-+-+-+

                          Significance of Bits
                               Figure 11.
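The Appendix B rule can be applied without any bit-fields, which sidesteps the allocation-order question entirely. A minimal sketch follows; `ip_version` and `ip_ihl` are hypothetical helper names, not kernel functions.

```c
#include <stdint.h>

/* Version is the high-order nibble of the first header octet (RFC 791). */
static inline unsigned ip_version(const uint8_t *hdr)
{
    return hdr[0] >> 4;
}

/* IHL is the low-order nibble: the header length in 32-bit words. */
static inline unsigned ip_ihl(const uint8_t *hdr)
{
    return hdr[0] & 0x0f;
}
```

For a first octet of 0x45 this yields version 4 and IHL 5 (a 20-byte header), regardless of the compiler's bit-field allocation order or the CPU's byte order.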
YouJiacheng
  • The order of individual members sharing the base integer type inside a bitfield is implementation defined. The order of the two four-bit fields in the IP header is fixed by the IP specification. The Linux structure makes sure that the two four-bit members are in the order defined by the IP specification. – Some programmer dude Dec 07 '22 at 08:15
  • Yes, the IP spec fixes the order of the two 4-bit fields. But you cannot meet this requirement by swapping the bit-field declarations according to machine endianness, since the allocation order is implementation-defined: it MAY or MAY NOT align with machine endianness (which is byte order, not bit order). – YouJiacheng Dec 07 '22 at 08:24
  • That's why there are two different macros, one for "normal" endianness, and one for bitfields. This allows the kernel to use different ones depending on the compiler and its implementation. – Some programmer dude Dec 07 '22 at 09:10
  • I would expect the macro name to be something like `__LEAST_TO_MOST_BITFIELD`/`__MOST_TO_LEAST_BITFIELD` instead of `__LITTLE_ENDIAN_BITFIELD`/`__BIG_ENDIAN_BITFIELD`, and to live in `bitfield.h` rather than `byteorder.h`. "Endian" only refers to byte ordering, not bit-field allocation order. ABIs specify them in different sections as well. – YouJiacheng Dec 07 '22 at 09:30
  • In short, why does Linux currently couple "normal" endianness with the "endianness" of bit-fields (which should not be called "endianness")? – YouJiacheng Dec 07 '22 at 09:52
  • It is difficult to answer "why" questions about design decisions or implementation variants. The Linux kernel code uses different `asm/byteorder.h` files for every architecture/platform. Assuming that everything related to bitfields would be moved to a separate file, e.g. `bitfields.h`, there would be one more platform specific file, and code might have to include one more file. What would be the advantage? – Bodo Dec 07 '22 at 10:17
  • Why are you assuming that the code makes sense at all? IP headers are always big endian. They will always need to be converted to CPU endianness. And this struct here will get completely butchered by the compiler unless padding is disabled, which doesn't seem to be the case either(?). So the only way it can be used would be with manual serialization/deserialization somewhere. After which they might as well use any type for the struct members, since they don't correspond to the IP header memory layout anyway. – Lundin Dec 07 '22 at 10:49
