I'm using memcmp() to compare two variables of the same struct type (the struct contains a union). The variables live in two arrays, and on each loop iteration I call memcmp(&arr1[i], &arr2[i], sizeof(arrtype)).

When debugging, I see that memcmp returns -1, but when I inspect the two variables, their members have equal values. Both arrays are zeroed with memset at the beginning.

  1. So does anybody know why memcmp returns -1 and not 0?
  2. Is there a better way to do what I need (compare two memory blocks)?

code:

typedef struct type1 {
    int version;
    union {
            option1_t opt1;
            option2_t opt2;
    } union_t;
} type1_t;

typedef struct type0 {
    type1_t member1;
    type2_t member2;
    type3_t member3;
    type4_t member4;
    type5_t member;
} type0_t;


type0_t arr1[SIZE];
type0_t arr2[SIZE];

memset(arr1, 0, SIZE * sizeof(type0_t));
memset(arr2, 0, SIZE * sizeof(type0_t));

/* doing irrelevant stuff... */

/* get values into arr1, arr2 ... */


/* comparing both arrays in for loop*/
value = memcmp(&arr1[i], &arr2[i], sizeof(type0_t));
theWizard
  • Could you please explain your problem a little more or post some related code? And what is arrtype? As far as returning -1 is concerned, it simply means that 'arr1' is less than 'arr2'. – Suryakant Sharma Jul 15 '13 at 09:54
  • That's a fairly surprising union definition. Does that compile? I'd expect the compiler to complain about the duplicate `opt1` name, at least. – sehe Jul 15 '13 at 12:05
  • @sehe my bad, fixed now to option2_t opt2 – theWizard Jul 15 '13 at 12:09

3 Answers

You are likely reading indeterminate values (uninitialized memory, or memory overwritten to contain unspecified data).

E.g. you could be accessing a member of a union that wasn't the member last written. Even if you don't, the last-written member might be smaller than the total extent of the union, leaving 'indeterminate' data beyond that size.

struct X { 
    union {
         char field1;
         long long field2[10];
    };
};

struct X a,b;
a.field1 = 'a';
b.field1 = 'a';

You can't expect a and b to compare equal bytewise, because you never initialized all of their bytes in the first place (field2 occupies many more bytes than field1).

---Depending on the value of uninitialized memory also invokes Undefined Behaviour.--- Not true for C11
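
The union example above can be turned into a small check. This is only a sketch: the function name `differs_despite_equal_field1` is made up for illustration, and the two `memset` calls with different fill bytes are a stand-in for whatever leftover data the objects happen to hold. The C standard leaves the bytes outside `field1` unspecified after the store; typical compilers simply leave them untouched, which is what this sketch relies on.

```c
#include <assert.h>
#include <string.h>

/* Same shape as struct X above (anonymous unions need C11 or later). */
struct X {
    union {
        char field1;
        long long field2[10];
    };
};

/* Hypothetical helper: returns 1 when two objects whose field1 members are
   equal still differ under memcmp. */
int differs_despite_equal_field1(void)
{
    struct X a, b;

    /* Stand-in for leftover data: give the objects different contents. */
    memset(&a, 0x00, sizeof a);
    memset(&b, 0xFF, sizeof b);

    /* Only the first byte of each union is written here. */
    a.field1 = 'a';
    b.field1 = 'a';

    assert(a.field1 == b.field1);

    /* The remaining bytes of field2 still differ, so memcmp reports it. */
    return memcmp(&a, &b, sizeof a) != 0;
}
```

Calling `differs_despite_equal_field1()` is expected to return 1 on common implementations, even though the only member the code ever wrote is equal in both objects.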

sehe
  • accessing a union member that wasn't the one last written is perfectly fine in C (as long as you don't violate other constraints); the part about different sizes of union members is important, though; in particular, writing to the shorter member `field1` may in principle invalidate all bytes of `field2`, and not only the ones they share – Christoph Jul 15 '13 at 10:26
  • sehe, i'm using memset to zero all the struct first – theWizard Jul 15 '13 at 10:38
  • @theWizard That's not really the point. You could of course show a minimal example of what you actually do. Perhaps the solution will present itself to you when you make it concrete. – sehe Jul 15 '13 at 10:49
  • There is no undefined behavior here. (1) Passing the address of a struct to `memcmp` does not constitute accessing members of its unions. (2) Accessing a member of the union other than the last-stored member is not generally undefined; the bytes would be reinterpreted, per C 2011 note 95, referring to clause 6.2.6. (3) Applying `memcmp` to any object may yield an undesired answer, but it does not “invoke undefined behavior”, even if memory is uninitialized. – Eric Postpischil Jul 15 '13 at 11:21
  • @EricPostpischil Ok. I reworded (I'm curious what the C99 specs said there, by the way) – sehe Jul 15 '13 at 12:00
  • @EricPostpischil how do you recommend doing the comparison then? compare each member separately? – theWizard Jul 15 '13 at 12:13
  • @theWizard: Yes, you should compare structures member-by-member, unless you have additional information (beyond what the C standard specifies) that ensures `memcmp` will work. This could include specifications from your C implementation about padding in structures, representations of the types involved, and so on. You have not shown what `option1_t` or `option2_t` are. – Eric Postpischil Jul 15 '13 at 13:10
  • @sehe: C 1999 was essentially the same in this regard. C 2011 added a statement that using the value of uninitialized automatic object that could have been declared `register` has undefined behavior. That change is not relevant here, since taking the address (to pass to `memcmp`) prevents `register` from being allowed. – Eric Postpischil Jul 15 '13 at 13:13
  • @EricPostpischil Great. Thanks for the standards information! – sehe Jul 15 '13 at 13:49
  • For the terminology, what you call "undefined data" is called the "indeterminate value" of an object in C standard jargon. – Jens Gustedt Jul 16 '13 at 06:56
  • Thanks a lot. I hadn't done ZeroMemory() before initializing and comparing. – Gautam Jain May 19 '21 at 08:49

If you are using structures, there can be padding bytes in between your member fields. You might try to memset the whole structure to 0 before you start using it.
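
A rough sketch of the padding issue, with a made-up `struct padded` and helper names that are not from the question. It assumes an ABI where `int` is aligned more strictly than `char`, so the byte right after `tag` is padding. Note that zeroing first helps in practice, but the C standard allows padding bytes to take unspecified values whenever a member is stored, so this is not a guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct: most ABIs insert padding between tag and value. */
struct padded {
    char tag;
    int  value;
};

/* Number of bytes in the struct that belong to no member. */
size_t padding_bytes(void)
{
    return sizeof(struct padded) - sizeof(char) - sizeof(int);
}

/* Zero the whole object, padding included, before assigning members. */
void padded_init(struct padded *p, char tag, int value)
{
    memset(p, 0, sizeof *p);
    p->tag = tag;
    p->value = value;
}

/* Returns 1 when objects with equal members differ under memcmp because
   a single padding byte holds different data. */
int padding_breaks_memcmp(void)
{
    struct padded a, b;

    padded_init(&a, 'x', 42);
    padded_init(&b, 'x', 42);

    if (padding_bytes() == 0)
        return 0; /* no padding on this ABI, nothing to demonstrate */

    /* Simulate stale data in the padding byte that follows tag. */
    ((unsigned char *)&b)[offsetof(struct padded, tag) + sizeof(char)] = 0xFF;

    assert(a.tag == b.tag && a.value == b.value);
    return memcmp(&a, &b, sizeof a) != 0;
}
```

On a typical x86-64 or AArch64 build, `padding_bytes()` is nonzero and `padding_breaks_memcmp()` returns 1: the members match, yet the bytewise comparison does not.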

Devolus
  • That might be; you didn't post any code. But if you access the variables and change them, then it depends on the union how many bytes are changed. The compiler will only change what is relevant to the actual code, so when you later do a memcmp, the bytes from the "other" union structure will still be there unchanged. You can only use memcmp if you are sure that only the same members were changed in the meantime, otherwise it will be undefined. – Devolus Jul 15 '13 at 09:33

Using memcmp to compare two objects may fail for three reasons:

  1. A struct may contain padding bytes whose values are not controlled.
  2. A union may contain bytes that do not correspond to bytes of the last-stored member (e.g., the additional bytes for one of the longer members of the union).
  3. Types may have equal values with different representations (padding bits within the type, different encodings for +0 and –0, et cetera).

Unless you have taken steps to ensure that none of these problems interferes with comparing via memcmp, the proper way to compare two structures is to compare them member-by-member.
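
A member-by-member comparison for a struct shaped like the question's `type1_t` might look like the sketch below. Several things here are assumptions, since the question does not show them: the definitions of `option1_t` and `option2_t` are invented stand-ins, `version` is assumed to say which union member is active, and `type1_equal` is a hypothetical helper name. The demo also touches reason 3 from the list above: on IEEE 754 platforms, +0.0 and -0.0 compare equal with `==` but have different bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Invented stand-ins: the question does not show these types. */
typedef struct { int id; }       option1_t;
typedef struct { double ratio; } option2_t;

typedef struct type1 {
    int version; /* ASSUMPTION: 1 means opt1 is active, 2 means opt2 */
    union {
        option1_t opt1;
        option2_t opt2;
    } union_t;
} type1_t;

/* Compare only the members that are meaningful for the active version. */
bool type1_equal(const type1_t *a, const type1_t *b)
{
    if (a->version != b->version)
        return false;

    switch (a->version) {
    case 1:  return a->union_t.opt1.id == b->union_t.opt1.id;
    case 2:  return a->union_t.opt2.ratio == b->union_t.opt2.ratio;
    default: return false; /* unknown version: treat as not equal */
    }
}

/* Returns true when two objects are equal member-wise yet differ under
   memcmp (here because +0.0 and -0.0 are encoded differently). */
bool equal_values_different_bytes(void)
{
    type1_t a, b;

    memset(&a, 0, sizeof a);
    memset(&b, 0, sizeof b);
    a.version = b.version = 2;
    a.union_t.opt2.ratio = 0.0;
    b.union_t.opt2.ratio = -0.0;

    assert(type1_equal(&a, &b));
    return memcmp(&a, &b, sizeof a) != 0;
}
```

The cost of this approach is that the comparison function has to be kept in sync with the struct definition; the benefit is that padding, inactive union bytes, and alternative encodings no longer affect the result.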

Eric Postpischil