Can I trust that the C compiler reduces the value modulo 2^n each time I access a bit field? Or is there any compiler or optimisation level where code like the one below would not print Overflow?

struct {
  uint8_t foo:2;
} G;

G.foo = 3;
G.foo++;

if(G.foo == 0) {
  printf("Overflow\n");
}

Thanks in Advance, Florian

Florian
  • what do you mean? You explicitly asked it to store data in two bits. Also, 3 + 1 == 0 mod 4 – Foo Bah Feb 05 '11 at 17:22
  • I would be more concerned about what happens to that bit if you've got other bitfields declared in that struct. e.g. uint8_t bar:2; uint8_t foo:2; uint8_t poo:2; would it affect bar or poo? – Dave Feb 05 '11 at 17:31
  • Charles, it is only using 2 bits for storage, so it can only store 0, 1, 2, or 3. – Dave Feb 05 '11 at 17:51

4 Answers

Yes, you can trust the C compiler to do the right thing here, as long as the bit field is declared with an unsigned type, which you have with uint8_t. From the C99 standard §6.2.6.1/3:

Values stored in unsigned bit-fields and objects of type unsigned char shall be represented using a pure binary notation.

From §6.7.2.1/9:

A bit-field is interpreted as a signed or unsigned integer type consisting of the specified number of bits. If the value 0 or 1 is stored into a nonzero-width bit-field of type _Bool, the value of the bit-field shall compare equal to the value stored.

And from §6.2.5/9 (emphasis mine):

The range of nonnegative values of a signed integer type is a subrange of the corresponding unsigned integer type, and the representation of the same value in each type is the same. A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.

So yes, you can be sure that any standards-conforming compiler will have G.foo overflow to 0 without any other unwanted side effects.

Adam Rosenfield
  • As pointed out to me by R.. in a comment below, the computation is actually a signed computation. That doesn't change the answer here but it would, say, with a 31-bit unsigned bitfield on an architecture where ints are 32-bit (where there could be an undefined signed overflow). – Pascal Cuoq Feb 05 '11 at 20:07

No, there is no such compiler: the code always prints Overflow. The compiler allocates 2 bits to the field, and incrementing 3 results in 100b, which when placed in two bits results in 0.

Ignacio Vazquez-Abrams

Yes. We can confirm the answer from the generated assembly. Here is an example I wrote on 64-bit Ubuntu 16.04 with gcc.

#include <stdio.h>

#include <stdint.h>

struct {
  uint32_t foo1:8;
  uint32_t foo2:24;
} G;

int main() {
    G.foo1 = 0x12;
    G.foo2 = 0xffffff; // G is 0xffffff12
    printf("G.foo1=0x%02x, G.foo2=0x%06x, G=0x%08x\n", G.foo1, G.foo2, *(uint32_t *)&G);
    G.foo2++; // G.foo2 overflow
    printf("G.foo1=0x%02x, G.foo2=0x%06x, G=0x%08x\n", G.foo1, G.foo2, *(uint32_t *)&G);
    G.foo1 += (0xff-0x12+1); // G.foo1 overflow
    printf("G.foo1=0x%02x, G.foo2=0x%06x, G=0x%08x\n", G.foo1, G.foo2, *(uint32_t *)&G);
    return 0;
}

Compile it with gcc -S <.c file> to get the assembly file (.s). Below is the assembly for G.foo2++;, with my own comments added.

movl    G(%rip), %eax   # load all of G: 0xffffff12
shrl    $8, %eax        # 0xffffff12 --> 0x00ffffff (foo2)
addl    $1, %eax        # 0x00ffffff + 1 = 0x01000000
andl    $16777215, %eax # 16777215 = 0xffffff, so the carry bit is masked off: eax = 0x00000000
sall    $8, %eax        # shift foo2 back into the high 24 bits: still 0x00000000
movl    %eax, %edx      # the high 24 bits of edx hold the new foo2
movl    G(%rip), %eax   # reload G
movzbl  %al, %eax       # keep only foo1: eax = 0x00000012
orl     %edx, %eax      # eax = 0x00000012 | 0x00000000 = 0x00000012
movl    %eax, G(%rip)   # write back to G

We can see that the compiler emits mask and shift instructions to guarantee the wrap-around you describe. Note that the memory layout of G here is:

----------------------------------
|     foo2-24bit     | foo1-8bit |
----------------------------------

The program prints:

G.foo1=0x12, G.foo2=0xffffff, G=0xffffff12
G.foo1=0x12, G.foo2=0x000000, G=0x00000012
G.foo1=0x00, G.foo2=0x000000, G=0x00000000
LittleSec

Short answer: yes, you can trust modulo 2^n to happen.

In your program, G.foo++; is in fact equivalent to G.foo = (unsigned int)G.foo + 1.

Unsigned int arithmetic always produces results modulo 2^(size of unsigned int in bits). The two least significant bits are then stored in G.foo, producing zero.

Pascal Cuoq
  • Your equivalence is wrong. `uint8_t` will be promoted to `int`, not `unsigned int`. `G.foo++;` is equivalent to `G.foo = ((int)G.foo + 1) % 4;`. – R.. GitHub STOP HELPING ICE Feb 05 '11 at 18:14
  • @R.. You are right (I always get these promotions wrong), but then there is something missing from Adam's answer above, which is that the +1 is actually taking place between signed ints. – Pascal Cuoq Feb 05 '11 at 20:02