In C, how does uint8_t structure_member[4] differ from uint32_t structure_member with regard to structure padding?

Question

FYI: This is my first question using stackoverflow!

The code is as follows:

uint8_t TestVar1;
uint8_t TestVar2;


typedef struct
{
  uint8_t member1;
  uint32_t member2;
}Test1;

typedef struct
{
  uint8_t member1;
  uint8_t member2[4];
}Test2;

Test1 TestStruct1; 
Test2 TestStruct2; 

TestVar1 = sizeof(TestStruct1); /*size is 8*/
TestVar2 = sizeof(TestStruct2); /*size is 5*/

I thought I understood padding, but I can't explain TestVar2. I can explain TestVar1 being 8 bytes because there are 3 padding bytes after uint8_t member1.

However, are there no padding bytes in struct Test2? (Apparently not.) Could someone provide some insight into what is happening in the Test2 case?

As a side note, I am aiming for 5 bytes, but I don't know why the second case works. Is the array decaying to a pointer or something? Is this safe (standard practice) to do?

Thanks!

There is no standard requirement for a specific padding. But in this case it can be related to the fact that `uint32_t` has to be 4-bytes aligned to work efficiently (or work at all) — Eugene Sh., Nov 02 '17 at 15:39
There's no need to pad the struct with the `uint8_t[4]` as it is made up of a bunch of one-byte types that require no special alignment for efficiency. — Christian Gibbons, Nov 02 '17 at 15:42
Really no guarantee, Test1 might be 8 bytes, Test2 too, but with padding of member1 in Test1 and padding at the end in Test2. Also the byte order in a uint32_t might be little endian or big endian (or even other exotic byte orders). — Joop Eggen, Nov 02 '17 at 15:42
@ChristianGibbons with uint8_t[4] what does the memory look like for the structure? When accessing the 4 byte array, would you just typecast the array to a uint32_t*? or simply memcpy it to a 32 bit integer? In other words, in general, why would I bother using a 32 bit integer as a structure member as opposed to using byte arrays? The application is for a communication protocol. Seems like I can simply "sizeof" the structure and send it out the door. (out the door means transmit the data lol) — SirMoses, Nov 02 '17 at 17:48
I suspected communication might be coming into play with the concern over padding. I believe typecasting the `uint8_t` array to `uint32_t *` would be a violation of C's strict aliasing and therefore be undefined behavior. I believe the best choice would be to go the `__attribute__((packed))` route to remove the padding. And I would also put the `uint32_t` member at the start of the struct so that it will be 4-byte aligned. So something like this: `typedef struct __attribute__((packed)) { uint32_t member2; uint8_t member1; }test;` — Christian Gibbons, Nov 02 '17 at 18:03
@ChristianGibbons Yes I see your point about typecasting. Is there any adverse behavior with memcpy'ing the data? And yes, I could change the ordering but that doesn't satisfy the specific communication protocol that I am supporting. With this in mind I was thinking of just memcpy'ing. — SirMoses, Nov 02 '17 at 18:08
I do not believe there is a specific issue with using `memcpy` to move data into a byte-array. I would be curious as to the difference in efficiency between using that and just assigning to the unaligned `uint32_t` were you to use `__attribute__((packed))` without reordering the members. — Christian Gibbons, Nov 02 '17 at 18:13
@ChristianGibbons I believe some architectures do not support unaligned access in a structure. It is my understanding that some do (with a time penalty) while others do not support it at all. — SirMoses, Nov 02 '17 at 18:28
It seems you are correct on that matter. SPARC, specifically, will give a bus error trying to access misaligned variables. https://stackoverflow.com/a/8568441/8513665 — Christian Gibbons, Nov 02 '17 at 20:58
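
The memcpy approach discussed in the comments above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names WireMsg, wire_set_native, wire_get_native, wire_set_be and wire_get_be are hypothetical, and big-endian is assumed as the wire order purely for the example (use whatever order the protocol actually defines). It needs a C11 compiler for static_assert.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same shape as Test2 in the question; the name WireMsg is hypothetical. */
typedef struct {
    uint8_t member1;
    uint8_t member2[4];
} WireMsg;

/* Not guaranteed by the C standard, so fail the build if a compiler pads it. */
static_assert(sizeof(WireMsg) == 5, "WireMsg is expected to be 5 bytes");

/* Store v in the host's native byte order without any pointer cast. */
void wire_set_native(WireMsg *m, uint32_t v)
{
    memcpy(m->member2, &v, sizeof v);
}

/* Read the value back in the host's native byte order. */
uint32_t wire_get_native(const WireMsg *m)
{
    uint32_t v;
    memcpy(&v, m->member2, sizeof v);
    return v;
}

/* Store v most significant byte first, whatever the host's byte order
   (big-endian is an assumption here, not something the question states). */
void wire_set_be(WireMsg *m, uint32_t v)
{
    m->member2[0] = (uint8_t)(v >> 24);
    m->member2[1] = (uint8_t)(v >> 16);
    m->member2[2] = (uint8_t)(v >> 8);
    m->member2[3] = (uint8_t)v;
}

/* Rebuild the value from bytes stored most significant byte first. */
uint32_t wire_get_be(const WireMsg *m)
{
    return ((uint32_t)m->member2[0] << 24) |
           ((uint32_t)m->member2[1] << 16) |
           ((uint32_t)m->member2[2] << 8)  |
            (uint32_t)m->member2[3];
}
```

The native variants avoid the strict aliasing problem of casting the array to uint32_t *, but the bytes they produce depend on the host's endianness. The shift-based variants give the same bytes on every host, which is usually what a communication protocol needs.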

Christian Gibbons · Accepted Answer · 2017-11-02T16:20:13.960

uint8_t has an alignment requirement of 1, so it never needs padding to be aligned, while uint32_t, being a multi-byte type, will typically want to be aligned on a 4-byte boundary. An array of uint8_t has the same alignment as a single uint8_t, which is why Test2 needs no padding. If you absolutely must have a struct with a uint8_t and a uint32_t that takes only 5 bytes, you can use __attribute__((packed)) (a GCC/Clang extension) to tell the compiler to forgo the padding, though I would carefully consider whether the space saved is worth the misalignment:

typedef struct __attribute__((packed)) {
    uint8_t member1;
    uint32_t member2;
}test;
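
As a quick check of what the packed attribute does, a sketch along these lines can be compiled with gcc or clang. The names PackedTest and packed_get_member2 are hypothetical, and __attribute__((packed)) is a compiler extension rather than standard C, so this is not portable to every compiler. It needs C11 for static_assert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same members as the struct above; the name PackedTest is hypothetical. */
typedef struct __attribute__((packed)) {
    uint8_t  member1;
    uint32_t member2;
} PackedTest;

/* With the padding removed, member2 starts right after member1. */
static_assert(sizeof(PackedTest) == 5, "packed struct should be 5 bytes");
static_assert(offsetof(PackedTest, member2) == 1, "member2 should be at offset 1");

/* Accessing the member through the packed struct type lets the compiler
   generate code that copes with the misaligned address. */
uint32_t packed_get_member2(const PackedTest *p)
{
    return p->member2;
}
```

Reading member2 through the struct is fine, but taking its address and passing it around as a plain uint32_t * gives up that protection, since the pointer may be misaligned.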

Another thing to consider is the ordering of the struct members. To keep the struct size down, put the largest members at the beginning of the struct, as they will have the strictest alignment needs. Consider the following:

typedef struct {
    uint8_t w;
    uint32_t x;
    uint8_t y;
    uint32_t z;
}test2;

typedef struct {
    uint32_t x;
    uint32_t z;
    uint8_t w;
    uint8_t y;
}test3;

In order to keep the uint32_t members aligned, test2 will put 3 bytes of padding after w and 3 after y, while test3 will already have the uint32_t members aligned and will only put 2 bytes of padding at the end of the struct so that its size is a multiple of 4 (the strictest alignment of any of the struct's members), which keeps every element of an array of test3 aligned. Therefore test2 will have a size of 16 bytes while test3 will have a size of 12 bytes.
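
The layout described above can be inspected with offsetof and sizeof. The sketch below assumes a typical ABI where uint32_t is 4-byte aligned. The helper names test2_padding and test3_padding are hypothetical, and the numbers in the comments are what such an ABI produces, not something the C standard promises.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t  w;   /* offset 0, then 3 padding bytes (typical ABI) */
    uint32_t x;   /* offset 4 */
    uint8_t  y;   /* offset 8, then 3 padding bytes */
    uint32_t z;   /* offset 12 */
} test2;

typedef struct {
    uint32_t x;   /* offset 0 */
    uint32_t z;   /* offset 4 */
    uint8_t  w;   /* offset 8 */
    uint8_t  y;   /* offset 9, then 2 bytes of tail padding */
} test3;

/* This always holds: the compiler never places a member at an offset
   that violates the member's own alignment requirement. */
static_assert(offsetof(test2, x) % _Alignof(uint32_t) == 0, "x must be aligned");

/* Padding is the struct size minus the sum of the member sizes. */
size_t test2_padding(void)
{
    return sizeof(test2) - (2 * sizeof(uint8_t) + 2 * sizeof(uint32_t));
}

size_t test3_padding(void)
{
    return sizeof(test3) - (2 * sizeof(uint8_t) + 2 * sizeof(uint32_t));
}
```

On such an ABI the two helpers return 6 and 2, matching the 16-byte and 12-byte sizes.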

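Note: the C standard allows padding between members and at the end of a struct but does not specify how much, so the exact layout comes from the compiler and the platform ABI. The results above hold true for both gcc and clang in my tests.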
