
While debugging a problem, the following issue came up. (Please ignore minor code errors; the code is just for illustration.)

The following struct is defined:

typedef struct box_t {
  uint32_t x;
  uint16_t y;
} box_t;

Instances of this struct are being passed by value from function to function (obviously simplified):

void fun_a(box_t b)
{
    ... use b ...
}

void fun_b(box_t bb)
{
    // pass bb by value
    fun_a(bb);
}

void fun_c(void)
{
    box_t real_b;
    box_t some_b[10];
    ...
    ... use real_b and some_b[]  ...
    ...
    fun_b(real_b);
    fun_b(some_b[3]);
    ...
    box_t copy_b = some_b[5];
    ...
}

In some cases, two instances of box_t are compared like this:

 memcmp(&bm, &bn, sizeof(box_t));

Within several nested calls, the bytes of the box_t arg were dumped using something like this:

const unsigned char *p = (const unsigned char *) &a_box_t_arg;
for (size_t i = 0; i < sizeof(box_t); i++) {
    printf(" %02X", *p);
    p++;
}
printf("\n");

sizeof(box_t) is 8: the 6 bytes of fields are followed by 2 pad bytes (the dump showed them sitting after the uint16_t). The dump also showed that the fields of the two structs were equal but the pad bytes were not, which caused the memcmp to fail (not surprisingly).
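Where the pad bytes sit can also be computed instead of read off a dump. The following is only a sketch: the helper names box_trailing_padding and box_dump are made up for illustration, and it assumes the box_t definition from above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef struct box_t {
    uint32_t x;
    uint16_t y;
} box_t;

/* Hypothetical helper: number of bytes between the end of y and the end of the struct. */
static size_t box_trailing_padding(void)
{
    return sizeof(box_t) - (offsetof(box_t, y) + sizeof(uint16_t));
}

/* Hypothetical helper: print every byte of *b and mark the bytes
 * that belong to no member with a trailing '*'. */
static void box_dump(const box_t *b)
{
    const unsigned char *p = (const unsigned char *)b;
    size_t tail_start = offsetof(box_t, y) + sizeof(uint16_t);

    for (size_t i = 0; i < sizeof *b; i++) {
        int in_gap  = (i >= sizeof(uint32_t) && i < offsetof(box_t, y));
        int in_tail = (i >= tail_start);
        printf(" %02X%s", p[i], (in_gap || in_tail) ? "*" : "");
    }
    printf("\n");
}
```

Calling box_dump(&some_box) at each level of the call chain would show directly whether the marked bytes change from one copy to the next.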

The interesting part has been to discover where the 'corrupted' pad values came from. After tracking backwards it was discovered that some of the box_t instances were declared as local variables and were initialized like this:

box_t b;
b.x = 1;
b.y = 2;

The above does not (appear to) initialize the pad bytes, which appear to contain 'garbage' (whatever was in the stack space allocated for b). In most other cases the initialization was done using memset(&b, 0, sizeof(box_t)).

The question is whether initializing an instance of box_t by either (1) struct assignment or (2) passing by value will always do the equivalent of a memcpy of sizeof(box_t) bytes. Is it ever the case that only the 6 bytes of the 'real fields' are copied (and the pad bytes are not)?

From the debugging it appears that the memcpy sizeof(box_t) equivalent is always done. Is there anything (e.g., in the standard) that actually specifies this? It would be helpful to know what can be counted on regarding the handling of the pad bytes as debugging goes forward.

Thanks! (Using GCC 4.4.3 on Ubuntu 10.04 LTS 64-bit)

For bonus points:

void f(void)
{
    box_t ba;
    box_t bb;
    box_t bc;

The 3 instances are allocated 16 bytes apart while sizeof() shows 8. Why the extra space?

Art Swri

3 Answers


The value of padding bytes is unspecified (C99/C11 6.2.6.1 §6):

When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values.

See also footnote 42/51 (C99:TC3, C1x draft):

Thus, for example, structure assignment need not copy any padding bits.

The compiler is free to copy or not copy padding as it sees fit. On x86[1], my guess would be that 2 trailing padding bytes will be copied, but 4 bytes won't (which can occur even on 32-bit hardware, as structures may require 8-byte alignment, e.g. to allow atomic reads of double values).

[1] No actual measurements were performed.


To expand on the answer:

The standard doesn't make any guarantees where padding bytes are concerned. However, if you initialize an object with static storage duration, the chance is high that you'll end up with zeroed padding. But if you use that object to initialize another one via assignment, all bets are off again (and I'd expect trailing padding bytes - again, no measurements done - to be particularly good candidates to be omitted from copying).

Using memset() and memcpy() is a way to guarantee the values of padding bytes on reasonable implementations. This applies even after assigning to individual members, as such an assignment can invalidate padding as well. However, in principle the compiler is free to change padding values 'behind your back' at any time (which might be related to caching members in registers - wildly guessing again), which you can probably avoid by using volatile storage.

The only reasonably portable workaround I can think of is to specify the memory layout explicitly by introducing dummy members of appropriate size while verifying with compiler-specific means that no additional padding is introduced (__attribute__ ((packed)), -Wpadded for gcc).
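As a rough sketch of the dummy-member idea (the member name reserved and the helpers box_make and box_same_bytes are invented here, and _Static_assert assumes a C11 compiler, so it stands in for the compiler-specific checks mentioned above):

```c
#include <stdint.h>
#include <string.h>

typedef struct box_t {
    uint32_t x;
    uint16_t y;
    uint16_t reserved;  /* hypothetical dummy member occupying the former padding */
} box_t;

/* Compilation fails if the compiler still inserts padding somewhere. */
_Static_assert(sizeof(box_t) == sizeof(uint32_t) + 2 * sizeof(uint16_t),
               "box_t contains unexpected padding");

/* Hypothetical constructor: sets every member, including the dummy one. */
static box_t box_make(uint32_t x, uint16_t y)
{
    box_t b;
    b.x = x;
    b.y = y;
    b.reserved = 0;
    return b;
}

/* With no padding left, a byte-wise comparison only sees member values. */
static int box_same_bytes(const box_t *a, const box_t *b)
{
    return memcmp(a, b, sizeof *a) == 0;
}
```

The cost of this layout is that every place that fills in a box_t also has to set reserved, which is why the sketch routes construction through a single function.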

Christoph
  • Does this include initialization with {0}? I.e., box_t b = {0}; is not guaranteed to set any pad bytes to zero? (IOW does 'stored' in the sentence from the standard include or exclude initialization?) – Art Swri Jun 08 '12 at 20:11
  • No guarantees for the padding bits. The compiler is free to handle them as it pleases. – Daniel Fischer Jun 08 '12 at 20:21
  • @DanielFischer: If the last field in a struct is a `char` called `foo` followed by three bytes of padding and one performs `mystruct.foo++`, would a compiler for a 32-bit little-endian system be free to e.g. load `foo` as a 32-bit int, increment it, and store it back as a 32-bit int, provided that it filtered off the upper bits whenever `foo` was promoted to a larger type? – supercat Jun 09 '12 at 21:35
  • @supercat Unless I'm completely misreading the standard, yes, absolutely. – Daniel Fischer Jun 10 '12 at 14:53
  • Zero initialization of a static is required to set padding to zero, FWIW. – saagarjha Dec 01 '21 at 11:05

C11 will let you define anonymous structure and union members:

typedef union box_t {
  unsigned char allBytes[theSizeOfIt];
  struct {
    uint32_t x;
    uint16_t y;
  };
} box_t;

That union would behave almost the same as before: you can still access .x etc., but the default initialization and assignment would change. Make sure that your variables are always initialized, either like this:

box_t real_b = { 0 };

or like this

box_t real_a = { .allBytes = {0}, .x = 1, .y = 2 };

With either form, all padding bytes should be correctly initialized to 0. This wouldn't help if your integer types had padding bits, but at least the uintXX_t types that you have chosen will not have them by definition.

gcc and followers already implement this as an extension, even if they are not yet completely C11.

Edit: In P99 there is a macro to do that in a consistent way:

#define P99_DEFINE_UNION(NAME, ...)                     \
 union NAME {                                           \
   uint8_t p00_allbytes[sizeof(union { __VA_ARGS__ })]; \
   __VA_ARGS__                                          \
 }

That is, the size of the array is determined by declaring an "untagged" union just for its size.
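Without P99, one way to avoid hand-coding the array size is to declare a struct with the same members first and take its sizeof. This is only a sketch: the helper struct box_fields and the function box_all_zero are names made up for illustration, and the anonymous struct needs C11 or the GNU extension.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper struct, declared only so that sizeof can be taken. */
struct box_fields {
    uint32_t x;
    uint16_t y;
};

typedef union box_t {
    unsigned char allBytes[sizeof(struct box_fields)];
    struct {            /* anonymous struct: C11, or a GNU extension */
        uint32_t x;
        uint16_t y;
    };
} box_t;

/* Hypothetical check: returns 1 if every byte of the object is zero. */
static int box_all_zero(const box_t *b)
{
    for (size_t i = 0; i < sizeof b->allBytes; i++) {
        if (b->allBytes[i] != 0)
            return 0;
    }
    return 1;
}
```

Because allBytes is the first member, box_t b = { 0 }; initializes the byte array, so box_all_zero(&b) is expected to hold right after initialization. Later stores to b.x or b.y may leave the padding bytes unspecified again.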

Jens Gustedt
  • Is there a platform independent way to give theSizeOfIt its value? Just set it by #define 'hand-coded' or is there a better way? – Art Swri Jun 08 '12 at 20:15
  • @ArtSwri, unfortunately there is no "easy" way. A possibility would be to declare a `struct` with the same contents beforehand and use `sizeof` that struct. Another one is to verify that you have chosen the correct size afterwards. With C11 you could do `_Static_assert(sizeof(box_t) == theSizeOfIt, "size mismatch")`. This would force you to update `theSizeOfIt` whenever you change the contents of the `struct`. – Jens Gustedt Jun 09 '12 at 06:42

As Christoph said, there are no guarantees regarding the padding. Your best bet is to not use memcmp to compare two structs. It works at the wrong abstraction level. memcmp works byte-wise at the representation, while you need to compare the values of the members.

It is better to use a separate compare function that takes two structs and compares each member separately. Something like this:

int box_isequal (box_t bm, box_t bn)
{
    return (bm.x == bn.x) && (bm.y == bn.y);
}
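To see that such a function does not depend on the padding, one can fill two objects with different byte patterns before setting the members. This is a sketch only; box_fill_and_set is a hypothetical helper introduced for the demonstration.

```c
#include <stdint.h>
#include <string.h>

typedef struct box_t {
    uint32_t x;
    uint16_t y;
} box_t;

/* Member-wise comparison: padding bytes are never inspected. */
static int box_isequal(box_t bm, box_t bn)
{
    return (bm.x == bn.x) && (bm.y == bn.y);
}

/* Hypothetical helper: fill every byte of *b (padding included)
 * with `fill`, then store the member values. */
static void box_fill_and_set(box_t *b, unsigned char fill,
                             uint32_t x, uint16_t y)
{
    memset(b, fill, sizeof *b);
    b->x = x;
    b->y = y;
}
```

Two objects prepared with fills 0xAA and 0x55 but the same x and y compare equal under box_isequal, whatever the padding bytes end up holding; a memcmp of the same two objects may or may not agree.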

For your bonus, the three objects are separate objects, they are not part of the same array and pointer arithmetic between them is not allowed. As function local variables, they are usually allocated on the stack, and because they are separate the compiler can align them in any way that is best, e.g. for performance.

Secure