Assume I have a sample source file, test.c, which I am compiling like so:

$ gcc -O3 -Wall test.c

test.c looks something like this:

/// CMP128(x, y)
//
// arguments
//  x - any pointer to a 128-bit int
//  y - any pointer to a 128-bit int
//
// returns -1, 0, or 1 if x is less than, equal to, or greater than y
//
#define CMP128(x, y) // magic goes here

// example usages

uint8_t  A[16];
uint16_t B[8];
uint32_t C[4];
uint64_t D[2];
struct in6_addr E;
uint8_t* F;

// use CMP128 on any combination of pointers to 128-bit ints, e.g.

CMP128(A, B);
CMP128(&C[0], &D[0]);
CMP128(&E, F);

// and so on

Let's also say I accept the restriction that if you pass in two overlapping pointers, you get undefined results.

I've tried something like this (imagine these macros are properly formatted with backslash-escaped newlines at the end of each line):

#define CMP128(x, y) ({
    uint64_t* a = (void*)x;
    uint64_t* b = (void*)y;

  // compare a[0] with b[0], a[1] with b[1]
})

But when I dereference a in the macro (a[0] < b[0]), I get "dereferencing breaks strict-aliasing rules" errors from gcc.

I had thought that you were supposed to use unions to properly refer to a single place in memory in two different ways, so next I tried something like:

#define CMP128(x, y) ({
    union {
        typeof(x) a;
        typeof(y) b;
        uint64_t* c;
    }   d = { .a = (x) }
        , e = { .b = (y) };

    // compare d.c[0] with e.c[0], etc
})

Except that I get the exact same errors from the compiler about strict-aliasing rules.

So: is there some way to do this without breaking strict-aliasing, short of actually COPYING the memory?

(may_alias doesn't count, it just allows you to bypass the strict-aliasing rules.)

EDIT: use memcmp to do this. I got caught up on the aliasing rules and didn't think of it.

Vilhelm Gray
Todd Freed
  • To use unions in a standard-conforming way, you are only allowed to read from the member that you last wrote to. – Kerrek SB Jun 26 '11 at 21:26
  • @Kerrek: not true - C99 allows type-punning through unions, a footnote which mentions this explicitly was added with TC3; however, Todd's code is still incorrect... – Christoph Jun 26 '11 at 21:32
  • @Christoph: C1x is even better, appendix J (UB) is fixed, to allow reading the bytes corresponding to the last written member (which obviously was the purpose in C99, but apparently appendix J was overlooked then). – ninjalj Jun 29 '11 at 01:20
  • `typeof(x)` is a GCC extension. We're not discussing standard C here, so it's useless to reason from a "standards-conforming" perspective. –  Oct 14 '16 at 13:36

1 Answer

The compiler is correct: the aliasing rules are determined by the so-called 'effective type' of the object (i.e. the memory location) you're accessing, regardless of any pointer magic. In this case, type-punning the pointers with a union is no different from an explicit cast. Using the cast is actually preferable, because the standard does not guarantee that arbitrary pointer types have compatible representations, so the union version unnecessarily depends on implementation-defined behaviour.

If you want to conform to the standard, you need to copy the data to new variables or use a union during the declaration of the original variables.
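Both options can be sketched roughly as follows. The helper names `cmp128_copy` and `cmp128_union` and the type `u128_t` are made up for illustration, and the sketch assumes the value is laid out as two native 64-bit words with the most significant word first, which the question does not actually specify.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy both operands into properly typed locals,
 * then compare. Assumes word 0 is the most significant 64-bit word. */
static int cmp128_copy(const void *x, const void *y)
{
    uint64_t a[2], b[2];

    /* memcpy accesses the bytes as characters, so no aliasing rule is
     * broken; optimising compilers usually turn these small fixed-size
     * copies into plain loads. */
    memcpy(a, x, sizeof a);
    memcpy(b, y, sizeof b);

    if (a[0] != b[0])
        return a[0] < b[0] ? -1 : 1;
    if (a[1] != b[1])
        return a[1] < b[1] ? -1 : 1;
    return 0;
}

/* Alternative: declare the storage as a union from the start, so every
 * view of the 16 bytes is an ordinary member access. */
typedef union {
    uint8_t  u8[16];
    uint16_t u16[8];
    uint32_t u32[4];
    uint64_t u64[2];
} u128_t;

static int cmp128_union(const u128_t *x, const u128_t *y)
{
    if (x->u64[0] != y->u64[0])
        return x->u64[0] < y->u64[0] ? -1 : 1;
    if (x->u64[1] != y->u64[1])
        return x->u64[1] < y->u64[1] ? -1 : 1;
    return 0;
}
```

The copy version accepts any of the pointer types from the question, since they all convert to `const void *`. The union version only helps when you control how the variables are declared.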

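If your 128-bit integers are stored big-endian (most significant byte first), you could also use memcmp() directly. If they are stored little-endian, do a byte-wise comparison yourself, starting from the most significant (last) byte; negating memcmp()'s return value is not enough, because it still compares the lowest address first. Neither approach works for mixed-endian layouts. Both are legal because access through pointers of character type is an exception to the aliasing rule.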
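A minimal sketch of the character-type approach, with hypothetical helper names `cmp128_be` and `cmp128_le`. Note that memcmp() only promises a negative, zero, or positive result, so it is normalised to -1/0/1 here. For little-endian storage the bytes have to be walked from the highest address down, since memcmp() always starts at the lowest address.

```c
#include <string.h>

/* Big-endian storage (most significant byte at the lowest address):
 * memcmp already compares in the right order. */
static int cmp128_be(const void *x, const void *y)
{
    int r = memcmp(x, y, 16);
    return (r > 0) - (r < 0);   /* normalise to -1, 0, or 1 */
}

/* Little-endian storage (most significant byte at the highest address):
 * compare byte by byte from the top down. Reading through unsigned char
 * pointers is always permitted by the aliasing rules. */
static int cmp128_le(const void *x, const void *y)
{
    const unsigned char *a = x;
    const unsigned char *b = y;

    for (int i = 15; i >= 0; i--) {
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    }
    return 0;
}
```

With either function, a macro such as `#define CMP128(x, y) cmp128_le((x), (y))` would accept all the pointer types listed in the question without any casts.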

Christoph