I know that memcmp() cannot be used to compare structs that have not been memset() to 0, because of uninitialized padding. However, in my program I have a struct with a few members of different types at the start, followed by several dozen members of the same type up to the end of the struct. My thought was to compare the first few members manually, then use memcmp() on the remaining contiguous block of same-typed members.

My question is: what does the C standard guarantee about structure padding? Can I reliably achieve this on any or all compilers? Does the C standard allow padding to be inserted between struct members of the same type?

I have implemented my proposed solution, and it seems to work exactly as intended with gcc:

#include <stdlib.h>
#include <string.h>
#include <stdio.h>

struct foo
{
    char a;
    void *b;
    int c;
    int d;
    int e;
    int f;
};

static void create_struct(struct foo *p)
{
    p->a = 'a';
    p->b = NULL;
    p->c = 1;
    p->d = 2;
    p->e = 3;
    p->f = 4;
}

static int compare(struct foo *p1, struct foo *p2)
{
    if (p1->a != p2->a)
        return 1;

    if (p1->b != p2->b)
        return 1;

    return
        /* Note the typecasts to char * so we don't get a size in ints. */
        memcmp(
            /* A pointer to the start of the same type members. */
            &(p1->c),
            &(p2->c),
            /* A pointer to the start of the last element to be compared. */
            (char *)&(p2->f)
            /* Plus its size to compare until the end of the last element. */
            +sizeof(p2->f)
            /* Minus the first element, so only c..f are compared. */
            -(char *)&(p2->c)
        ) != 0;
}

int main(int argc, char **argv)
{
    struct foo *p1, *p2;
    int ret;

    /* The loop is to ensure there isn't a fluke with uninitialized padding
     * being the same.
     */
    do
    {
        p1 = malloc(sizeof(struct foo));
        p2 = malloc(sizeof(struct foo));

        create_struct(p1);
        create_struct(p2);

        ret = compare(p1, p2);

        free(p1);
        free(p2);

        if (ret)
            puts("no match");
        else
            puts("match");
    }
    while (!ret);

    return 0;
}
John

2 Answers

The C standard does not guarantee that there is no padding between members of the same type. From a practical standpoint, the absence of such padding is part of the ABI for every current C implementation, and there seems to be no purpose in adding it (e.g. it could not be used for checking against buffer overflows, since a conforming program is permitted to write to the padding). But strictly speaking it's not "portable".
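Since the layout is an ABI property and not a language guarantee, one option is to have the compiler verify it. The following is a minimal sketch, assuming a C11 compiler; the macro FOO_TAIL_SIZE and the function compare_foo are hypothetical names introduced here, not part of the question's code. It refuses to compile on any implementation that pads between c..f, and otherwise compares that block with memcmp():

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof, NULL */
#include <string.h>  /* memcmp */

struct foo
{
    char a;
    void *b;
    int c;
    int d;
    int e;
    int f;
};

/* Refuse to compile if this implementation pads between c, d, e and f. */
static_assert(offsetof(struct foo, d) == offsetof(struct foo, c) + sizeof(int),
              "padding between c and d");
static_assert(offsetof(struct foo, e) == offsetof(struct foo, d) + sizeof(int),
              "padding between d and e");
static_assert(offsetof(struct foo, f) == offsetof(struct foo, e) + sizeof(int),
              "padding between e and f");

/* Bytes spanned by c..f, derived from the layout instead of pointer casts. */
#define FOO_TAIL_SIZE \
    (offsetof(struct foo, f) + sizeof(int) - offsetof(struct foo, c))

/* Hypothetical helper: returns 0 when equal, 1 otherwise (as in the question). */
static int compare_foo(const struct foo *p1, const struct foo *p2)
{
    if (p1->a != p2->a || p1->b != p2->b)
        return 1;

    /* Compare the contiguous c..f block through byte pointers. */
    return memcmp((const char *)p1 + offsetof(struct foo, c),
                  (const char *)p2 + offsetof(struct foo, c),
                  FOO_TAIL_SIZE) != 0;
}
```

With several dozen members, asserting every adjacent pair gets tedious; a single assertion that the span from the first to the last same-typed member equals the member count times sizeof(int) would catch the same problem.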

R.. GitHub STOP HELPING ICE

Sadly, no C standard (that I have ever heard of) lets you control structure padding. Initializing an automatic variable like this

struct something val = { 0 };

will cause all the members in val to be initialized to 0. But the padding in between is left to the implementation.
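Because `{ 0 }` says nothing about the padding bytes, code that wants to memcmp() whole structs usually zeroes them explicitly first. Here is a minimal sketch of that pattern; something_init and something_equal_bytes are hypothetical helper names. Note that this is common practice and not a standard guarantee: C11 6.2.6.1 says padding bytes take unspecified values when a member is stored, even though mainstream compilers leave them untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct something
{
    char onebyte;
    int fourbyte;
};

/* Hypothetical helper: zero every byte (padding included), then set members. */
static void something_init(struct something *p, char onebyte, int fourbyte)
{
    assert(p != NULL);
    memset(p, 0, sizeof *p);
    p->onebyte = onebyte;
    p->fourbyte = fourbyte;
}

/* Hypothetical helper: byte-wise equality, meaningful only for structs
 * that were set up with something_init(). */
static int something_equal_bytes(const struct something *x,
                                 const struct something *y)
{
    return memcmp(x, y, sizeof *x) == 0;
}
```

If strict portability matters more than speed, comparing member by member avoids relying on the padding contents at all.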

There are compiler extensions you can use like GCC's __attribute__((packed)) to eliminate most if not all structure padding, but aside from that you may be at a loss.
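To illustrate the extension, here is a small sketch that assumes GCC or Clang (the attribute is not standard C); the struct names foo_natural and foo_packed and the function padding_removed are made up for this example. The static assertion checks that the packed variant is exactly the sum of its member sizes:

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* size_t */

/* Layout chosen by the ABI: padding is inserted where alignment requires it. */
struct foo_natural
{
    char a;
    void *b;
    int c, d, e, f;
};

/* GCC/Clang extension: members are laid out back to back with no padding. */
struct foo_packed
{
    char a;
    void *b;
    int c, d, e, f;
} __attribute__((packed));

static_assert(sizeof(struct foo_packed)
                  == sizeof(char) + sizeof(void *) + 4 * sizeof(int),
              "packed struct still contains padding");

/* Hypothetical helper: bytes of padding the attribute removed. */
static size_t padding_removed(void)
{
    return sizeof(struct foo_natural) - sizeof(struct foo_packed);
}
```

The trade-off is that members of a packed struct may be misaligned, so access can be slower on some targets, and taking the address of such a member (as memcmp() on `&p->c` would) draws a warning from recent GCC versions.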

I also know that most compilers only add structure padding where a member's alignment requires it, regardless of optimization level, which would explain why this works under GCC: consecutive int members need no padding between them.

That said, if your structure members cause odd alignment issues like this

struct something { char onebyte; int fourbyte; }; 

they will cause the compiler to add padding after the onebyte member to satisfy the alignment requirements of the fourbyte member.
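The amount of padding can be measured with offsetof instead of a debugger. A minimal sketch, assuming C11 for static_assert and _Alignof; padding_after_onebyte is a hypothetical name:

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof, size_t */

struct something
{
    char onebyte;
    int fourbyte;
};

/* Every member must sit at an offset that satisfies its alignment. */
static_assert(offsetof(struct something, fourbyte) % _Alignof(int) == 0,
              "fourbyte is misaligned");

/* Bytes of padding between onebyte and fourbyte on this implementation. */
static size_t padding_after_onebyte(void)
{
    return offsetof(struct something, fourbyte) - sizeof(char);
}
```

On a typical implementation where int is 4 bytes with 4-byte alignment, this would report 3 bytes of padding, but the exact number is up to the implementation.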

randomusername
  • This: `struct something val = { 0 };` initializes the first member to 0 and then default initializes the remaining members (possibly with 0 if that is their default). `struct something val = {};` default initializes all members, which is more generic because the first item may or may not be an integral member. – Jerry Jeremiah Dec 17 '13 at 01:10
  • @JerryJeremiah true, but this gets the idea across better. – randomusername Dec 17 '13 at 01:12
  • When investigating with `gdb` I found that there were 7 padding bytes added after `char a`, making the entire struct 32 bytes on my system (instead of 25, which was the case with `__attribute__((__packed__))`). When using a simple `memcmp()` on the entire struct, they were of course not equal. – John Dec 17 '13 at 01:17
  • @Smith So there you go. `void *` requires an alignment of `8` on most 64 bit systems, so your compiler padded the `char` sized member to `8` bytes to make sure that the pointer got the alignment it needed. And everything else just got packed together the way it's supposed to be. – randomusername Dec 17 '13 at 01:23
  • @JerryJeremiah I'd be hesitant to call the empty initializer more generic as it's a GNU C extension. – tab Dec 17 '13 at 01:47
  • As tab said, `{}` is invalid C; `{0}` is the correct universal zero initializer and using it has nothing to do with whether the first member has integer type (or even whether the object being initialized is a structure). `{0}` simply works for initializing *any type*. – R.. GitHub STOP HELPING ICE Dec 17 '13 at 05:19
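
A short sketch of the point about `{0}` in the last comment: the same initializer is accepted for a scalar, an array, a struct, and a struct whose first member is itself a struct. The type struct nested and the function check_universal_zero_initializer are hypothetical, added only for illustration; some compilers may emit a missing-braces warning for the nested case even though it is valid C.

```c
#include <assert.h>
#include <stddef.h>  /* NULL */

struct something
{
    char onebyte;
    int fourbyte;
};

/* Hypothetical type whose first member is not an integer. */
struct nested
{
    struct something inner;
    void *ptr;
    double values[3];
};

/* Returns 1 if every object initialized with {0} compares equal to zero. */
static int check_universal_zero_initializer(void)
{
    int scalar = { 0 };
    int array[4] = { 0 };
    struct something s = { 0 };
    struct nested n = { 0 };

    int ok = scalar == 0
          && array[3] == 0
          && s.onebyte == 0 && s.fourbyte == 0
          && n.inner.fourbyte == 0 && n.ptr == NULL && n.values[2] == 0.0;

    assert(ok);  /* aborts in debug builds if any member is not zero */
    return ok;
}
```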