
According to the C11 standard (mentioned in this answer), an implementation must support bit-fields of type _Bool, signed int and unsigned int. Other types may be supported, but that is up to the implementation.

I tried the following code to see what the types of the bit-fields are in practice:

#include <stdint.h>
#include <assert.h>
#include <stdio.h>

#define ARG_TYPE(arg)     _Generic((arg),               \
                                _Bool          : "_Bool", \
                                char           : "char",      \
                                signed char    : "signed char",    \
                                unsigned char  : "unsigned char", \
                                short          : "short", \
                                unsigned short : "unsigned short", \
                                int            : "int", \
                                unsigned int   : "unsigned int", \
                                long           : "long", \
                                unsigned long  : "unsigned long", \
                                long long      : "long long", \
                                unsigned long long : "unsigned long long")
int main(void)
{
    struct _s
    {
        unsigned int        uval32 : 32;
        unsigned int        uval16 : 16;
        unsigned int        uval8  : 8;
        unsigned int        uval1  : 1; 
        signed int          ival32 : 32;
        signed int          ival16 : 16;
        signed int          ival8  : 8;
        signed int          ival1  : 1;
        _Bool               bool1  : 1;
    } s = {0};

    printf("The type of s.uval32 is %s\n", ARG_TYPE(s.uval32));
    printf("The type of s.uval16 is %s\n", ARG_TYPE(s.uval16));
    printf("The type of s.uval8 is %s\n", ARG_TYPE(s.uval8));
    printf("The type of s.uval1 is %s\n", ARG_TYPE(s.uval1));
    printf("The type of s.ival32 is %s\n", ARG_TYPE(s.ival32));
    printf("The type of s.ival16 is %s\n", ARG_TYPE(s.ival16));
    printf("The type of s.ival8 is %s\n", ARG_TYPE(s.ival8));
    printf("The type of s.ival1 is %s\n", ARG_TYPE(s.ival1));
    printf("The type of s.bool1 is %s\n", ARG_TYPE(s.bool1));

    (void)s;

    return 0;
}

Clang (https://godbolt.org/z/fjVRwI) and ICC (https://godbolt.org/z/yC_U8C) behaved as expected:

The type of s.uval32 is unsigned int
The type of s.uval16 is unsigned int
The type of s.uval8 is unsigned int
The type of s.uval1 is unsigned int
The type of s.ival32 is int
The type of s.ival16 is int
The type of s.ival8 is int
The type of s.ival1 is int
The type of s.bool1 is _Bool

But GCC (https://godbolt.org/z/FS89_b) introduced several issues:

  1. A single-bit bit-field declared with a type other than _Bool didn't match any of the types listed in the _Generic:

error: '_Generic' selector of type 'unsigned char:1' is not compatible with any association

  2. After commenting out the lines that produced errors, I got this:

    The type of s.uval32 is unsigned int
    The type of s.uval16 is unsigned short
    The type of s.uval8 is unsigned char
    The type of s.ival32 is int
    The type of s.ival16 is short
    The type of s.ival8 is signed char
    The type of s.bool1 is _Bool
    

    To me, unsigned short, short, unsigned char and signed char are completely unexpected here.

Did I misunderstand the standard? Is this a GCC bug?

It looks like using _Generic is not portable even for well-defined things...

Alex Lop.

2 Answers


As noted, no compiler has to provide support for oddball bit-field types. If it does, it is free to treat such types as it pleases - this is beyond the scope of the standard. You are essentially talking about the type of the abstract item referred to as "storage unit" by the standard.

Everything about this magic abstract "storage unit" is poorly-specified behavior:

C17 §6.7.2.1/11:

An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.
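
To make the quoted paragraph concrete, here is a small sketch. The struct `packed_flags` and the functions `storage_bits` and `fields_round_trip` are hypothetical names invented for this illustration. About the only things the code can rely on are that the struct is large enough for the 12 declared bits and that values read back as written; the actual size and the position of each field are up to the implementation.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical example: three bit-fields totalling 12 bits. */
struct packed_flags
{
    unsigned int a : 3;
    unsigned int b : 4;
    unsigned int c : 5;
};

/* Number of bits of storage the implementation chose for the struct. */
static size_t storage_bits(void)
{
    return sizeof(struct packed_flags) * CHAR_BIT;
}

/* Stores a value in each field and checks that it reads back intact.
   This holds regardless of how the fields are laid out.
   Returns 1 on success, 0 otherwise. */
static int fields_round_trip(void)
{
    struct packed_flags f = { .a = 5, .b = 9, .c = 17 };

    /* The struct must at least be able to hold the declared bits. */
    assert(storage_bits() >= 12);

    return f.a == 5 && f.b == 9 && f.c == 17;
}
```

Calling `storage_bits()` from `main` and printing the result on a few different compilers and targets is a quick way to see how much of the layout is left to the ABI rather than the language.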

Simply never use bit-fields anywhere and all these problems will go away. There is never a reason to use them anyhow - it is a 100% superfluous feature.

Lundin
  • I don't think that this has anything to do with the storage unit. `_Generic` is about the type, and not about the representation. But I completely agree with your last para :) – Jens Gustedt Sep 26 '19 at 09:15
  • @JensGustedt The thing is, if you have a bit-field of `uint8_t` followed by one of `uint16_t`, there is no telling where the individual bytes end up in storage units. The storage unit at least ought to have some sort of relation with alignment. My take is that clang and icc treat the storage unit as the smallest aligned type (unsigned int) and then shove the various individual bit-field members into that one. – Lundin Sep 26 '19 at 11:49

Yes, clang is correct here and gcc is plain wrong. The type of a bit-field is the one that is declared. Period. There is no ambiguity in the standard about this, and gcc's "feature" of giving bit-fields specific types that include the number of specified bits is non-conforming. There has been a long discussion that starts at

https://gcc.gnu.org/ml/gcc/2016-02/msg00255.html

which basically shows that they are not willing to concede and change to a more user-friendly mode.

If you are really interested in practical aspects of this, you could just use one of the methods that force promotion of the operand, such as unary + or the comma operator. This would lose the distinction between _Bool and int bit-fields, but could still give you the distinction between long and int.
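
A minimal sketch of the unary `+` approach, assuming a platform where `int` is exactly 32 bits wide. The macro `PROMOTED_TYPE`, the struct `bits` and the helper `promoted_types_match` are names made up for this illustration. Because `+` applies the integer promotions, the controlling expression has an ordinary promoted type rather than whatever type the compiler uses for the bit-field itself.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Assumption of this sketch: unsigned int has exactly 32 value bits. */
static_assert(UINT_MAX == 0xFFFFFFFFu, "sketch assumes a 32-bit int");

/* Hypothetical macro: names the type of `arg` after integer promotion.
   Unary + only accepts arithmetic operands, so this does not compile
   for structs or pointers. */
#define PROMOTED_TYPE(arg) _Generic(+(arg),                 \
        int                : "int",                         \
        unsigned int       : "unsigned int",                \
        long               : "long",                        \
        unsigned long      : "unsigned long",               \
        long long          : "long long",                   \
        unsigned long long : "unsigned long long",          \
        default            : "other")

/* Illustrative struct, modelled on the one in the question. */
struct bits
{
    unsigned int uval32 : 32;
    unsigned int uval8  : 8;
    signed int   ival8  : 8;
    _Bool        bool1  : 1;
};

/* Returns 1 if the promoted types are the ones the integer promotion
   rules predict for a 32-bit int, 0 otherwise. */
static int promoted_types_match(void)
{
    struct bits b = {0};

    return strcmp(PROMOTED_TYPE(b.uval32), "unsigned int") == 0
        && strcmp(PROMOTED_TYPE(b.uval8),  "int") == 0
        && strcmp(PROMOTED_TYPE(b.ival8),  "int") == 0
        && strcmp(PROMOTED_TYPE(b.bool1),  "int") == 0;
}
```

Every field narrower than `int`, including the `_Bool` one, comes out as `int`, which is the loss of distinction mentioned above; only the full-width `unsigned int` field keeps its declared type.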

Jens Gustedt
  • I'm aware of the `+` option, but if `arg` is not necessarily of an arithmetic type then it won't help... – Alex Lop. Sep 26 '19 at 09:21
  • The standard says "A bit-field is interpreted as having a signed or unsigned integer type consisting of the specified number of bits.", so I do not see how a bitfield of width 3 (for example) should match `unsigned int` in a generic selector. – M.M Sep 26 '19 at 09:30
  • @AlexLop. in that case you can use the comma operator doing something weird such as `(0, X)` in the controlling expression. – Jens Gustedt Sep 26 '19 at 09:40
  • @M.M yes exactly. It says that it is *interpreted* as having the specified number of bits. Nowhere in the lengthy discussion about bit-fields and how they are stored, it says that the type is something different than the declared type. – Jens Gustedt Sep 26 '19 at 09:47
  • How will the comma operator help? It doesn't force integer promotion, so the result remains the same: https://godbolt.org/z/UsuYaZ – Alex Lop. Sep 26 '19 at 10:32
  • @alex It should, the expressions in a comma expression are evaluated and must thus be promoted. – Jens Gustedt Sep 26 '19 at 16:38
  • @JensGustedt ... but _Generic doesn't perform evaluation, right? It probably uses the type of the rightmost argument because this is what actually defines the type of the comma separated expression. – Alex Lop. Sep 26 '19 at 16:56
  • Generic uses the type of the whole comma expression, which in this case is the promoted type. – Jens Gustedt Sep 26 '19 at 17:13
  • @JensGustedt it doesn't work that way with any of the compilers (GCC, ICC, Clang): https://godbolt.org/z/a43X9S Also, the standard says: "*The controlling expression of a generic selection is not evaluated*" – Alex Lop. Sep 26 '19 at 17:59
  • Interesting. It really doesn't need to be evaluated to have an influence on the type. So the comma operator is even weirder than I thought, resulting in an rvalue but not undergoing promotions (?). Anyway, if that doesn't do, you can use `(1 ? (arg) : (arg))`. There it explicitly says that the result type is determined by the usual arithmetic conversions. – Jens Gustedt Sep 26 '19 at 19:34