int is 4 bytes on my machine, long 8 bytes, etc.

Hey, I've run into something interesting in C and started wondering how structures lay out their data internally. I thought a struct worked like an array, with its size being simply the sum of its members' sizes, but I was wrong. I found out on Stack Overflow that compilers may insert padding because of the processor architecture's alignment requirements. I found two links about alignment and experimented with calculating my struct's size, but I only understand parts of it. That's why I'm asking this question: I couldn't fully grasp some of the examples given in the answers there. For example:

#include <stdio.h>

struct test {
    char a;
    char b;
    int c;
    long d;
    int e;
};

int main(void){
    printf("test = %d\n", (int)sizeof(struct test));
    return 0;
}

Output:

test = 24

I was expecting the compiler to lay it out like this: char a is 1 byte and char b is 1 byte, so no padding is needed between them. char b is 1 byte and int c is 4 bytes, so 3 bytes of padding are needed. int c is 4 bytes and long d is 8 bytes, so 4 bytes of padding are needed. long d is 8 bytes and int e is 4 bytes, so 4 bytes of padding are needed. At this point the total size is 29. Rounding up to the nearest even number gives 30. Why is it 24 then?

I've also read that char a and char b together already take up 2 bytes, so only 2 more bytes of padding are needed before int c; maybe that's where I'm making a mistake. Also, if I add more variables:

#include <stdio.h>

struct test {
    char a;
    char b;
    int c;
    long d;
    int e;
    char f;
    char g;
    char h;
    char i;
};

int main(void){
    printf("test = %d\n", (int)sizeof(struct test));
    return 0;
}

Output:

test = 24

The total size is still 24 bytes. But if I add one more variable:

#include <stdio.h>

struct test {
    char a;
    char b;
    int c;
    long d;
    int e;
    char f;
    char g;
    char h;
    char i;
    char j;
};

int main(void){
    printf("test = %d\n", (int)sizeof(struct test));
    return 0;
}

Output:

test = 32

The size changes to a total of 32 bytes. Why? What exactly happens? Sorry if the answer is obvious to you, but I truly don't understand. I also don't know whether this differs between compilers, so if I've left out some information, just tell me and I'll add it.

todovvox
  • You can output a pointer value (i.e. an address) using `printf("%p", (void *)addr)`, if I remember correctly. Using that, you can find out at which position in memory each member of the struct is located. – Ulrich Eckhardt Nov 21 '21 at 13:59

1 Answer

It all comes down to alignment. The compiler wants to keep each element aligned to an address that's a multiple of that item's size, because the hardware can access it most efficiently that way. (And on some architectures, the hardware can only access it that way; unaligned accesses are disallowed.)

You've got one element in your structure that's a long int of size 8, so its alignment is going to drive everything else. Here's how your first structure would be laid out:

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
 0 | a | b |  pad  |       c       |
   +---+---+---+---+---+---+---+---+
 8 |               d               |
   +---+---+---+---+---+---+---+---+
16 |       e       |     padding   |
   +---+---+---+---+---+---+---+---+

So, as you can see, the size is 24, including two invisible, unnamed "padding" fields of 2 and 4 bytes, respectively.

Structure padding and alignment can be confusing. (It took me an embarrassingly large number of tries to get this answer right.) Fortunately, you usually don't have to worry about any of this, because it's the compiler's problem, not yours.

You can get the compiler to tell you how it's laying a structure out by using the offsetof macro:

#include <stddef.h>  /* offsetof */
#include <stdio.h>

struct test { char a; char b; int c; long d; int e; };  /* first struct from the question */

int main(void){
    printf("a @ %zu\n", offsetof(struct test, a));
    printf("b @ %zu\n", offsetof(struct test, b));
    printf("c @ %zu\n", offsetof(struct test, c));
    printf("d @ %zu\n", offsetof(struct test, d));
    printf("e @ %zu\n", offsetof(struct test, e));
    printf("size = %zu\n", sizeof(struct test));
    return 0;
}

On my machine (which seems to be behaving the same as yours) this prints:

a @ 0
b @ 1
c @ 4
d @ 8
e @ 16
size = 24

Notice that I have used %zu instead of %d, since sizeof and offsetof give their answers as type size_t, not int.

When you added char fields f, g, h, and i, they could fit into the second padding space, without making the overall structure any bigger. It was only when you added j that it pushed things over into another 8-byte chunk:

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
 0 | a | b |  pad  |       c       |
   +---+---+---+---+---+---+---+---+
 8 |               d               |
   +---+---+---+---+---+---+---+---+
16 |       e       | f | g | h | i |
   +---+---+---+---+---+---+---+---+
24 | j |          padding          |
   +---+---+---+---+---+---+---+---+
Steve Summit
  • Alright, I see. So what happens in the second example, where the output was 32? Why did adding a variable like `char f` still give an output of 24? It stayed at 24 until I added `char j`, and then the struct suddenly hit 32 bytes. What happened? Was it putting those bytes into the padding left over after `int e`, so that a total of 4 bytes fit in? That would explain why it happened after exactly 4 chars. Am I right? – todovvox Nov 21 '21 at 13:40
  • @dippie See the expanded answer. – Steve Summit Nov 21 '21 at 13:46
  • I see, I see. So the best tactic for optimization is basically to order the struct's members from largest to smallest type, am I right? For example, declaring `short a`, then `int b`, then `char c` produces a lot more padding than `int a`, `short b`, `char c`. The first ordering gives 12 bytes in total on a machine where int is 4 bytes, and the second gives 8 bytes. Is that right, or am I missing something? – todovvox Nov 21 '21 at 13:52
  • @dippie If by "optimization" you mean "not wasting space", I think you're right. See the additional answers by Eric Raymond at the end of [question 2.12](http://c-faq.com/struct/padding.html) in the [C FAQ list](http://c-faq.com/). See also [question 2.13](http://c-faq.com/struct/endpad.html). – Steve Summit Nov 21 '21 at 13:58
  • Thank you so much for the help. Now I fully understand the topic. – todovvox Nov 21 '21 at 13:59