Context
“Strict aliasing”, named after the GCC optimization, is an assumption by the compiler that a value in memory will not be accessed through an lvalue of a type (the “declared type”) very different from the type the value was written with (the “effective type”). This assumption allows code transformations that would be incorrect if the compiler had to consider the possibility that writing through a pointer to float could modify a global variable of type int.
Both GCC and Clang, extracting the most meaning out of a standard description full of dark corners, and having a bias for performance of generated code in practice, assume that a pointer to the int first member of a struct thing does not alias a pointer to the int first member of a struct object:
struct thing { int a; };
struct object { int a; };
int e(struct thing *p, struct object *q) {
    p->a = 1;
    q->a = 2;
    return p->a;
}
Both GCC and Clang infer that the function always returns 1, that is, that p and q cannot be aliases for the same memory location:
e:
movl $1, (%rdi)
movl $1, %eax
movl $2, (%rsi)
ret
As long as one agrees with the reasoning for this optimization, it should be no surprise that p->t[3] and q->t[2] are also assumed to be disjoint lvalues in the following snippet (or rather, that the caller causes UB if they alias):
struct arr { int t[10]; };
int h(struct arr *p, struct arr *q) {
    p->t[3] = 1;
    q->t[2] = 2;
    return p->t[3];
}
GCC optimizes the above function h:
h:
movl $1, 12(%rdi)
movl $1, %eax
movl $2, 8(%rsi)
ret
So far so good: as long as one sees p->a or p->t[3] as somehow accessing a whole struct thing (resp. struct arr), it is possible to argue that making the locations alias would break the rules laid out in 6.5:6-7 of the C standard. An argument that this is GCC's approach is this message, part of a long thread that also discussed the role of unions in strict aliasing rules.
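For contrast, here is a minimal sketch (not taken from the thread; the name same_type is hypothetical) of the case where both parameters have the same type int *. The two lvalues are then allowed to designate the same object, so the compiler has to re-read *p after the store through q, and passing the same address twice is well defined.

```c
/* Hypothetical helper for illustration only: both parameters have
   type int *, so *p and *q are allowed to alias. */
int same_type(int *p, int *q) {
    *p = 1;
    *q = 2;
    return *p;   /* must be re-read: q may point to the same int as p */
}
```

With two distinct int objects the function returns 1; with same_type(&x, &x) it returns 2, which is the reload that the struct-based examples above are allowed to skip.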
Question
I have doubts, however, about the following example, in which there is no struct:
int g(int (*p)[10], int (*q)[10]) {
    (*p)[3] = 1;
    (*q)[4] = 2;
    return (*p)[3];
}
GCC versions 4.4.7 through the current version 7 snapshot on Matt Godbolt's useful website optimize function g as if (*p)[3] and (*q)[4] could not alias (or rather, as if the program had invoked UB if they did):
g:
movl $1, 12(%rdi)
movl $1, %eax
movl $2, 16(%rsi)
ret
Is there any reading of the standard that justifies this very strict approach to strict aliasing? If GCC's optimization here can be justified, would the arguments apply as well to the optimization of functions f and k, which are not optimized by GCC?
int f(int (*p)[10], int (*q)[9]) {
    (*p)[3] = 1;
    (*q)[3] = 2;
    return (*p)[3];
}
int k(int (*p)[10], int (*q)[9]) {
    (*p)[3] = 1;
    (*q)[2] = 2;
    return (*p)[3];
}
I'm willing to take this up with the GCC devs, but I should first decide whether I am reporting a correctness bug for function g or a missed optimization for f and k.
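To make the question concrete, here is a sketch of a caller for which (*p)[3] and (*q)[4] designate the same int. The name call_g_overlapping and the use of malloc are assumptions made for illustration; they are not part of the examples above. Both stores go through lvalues of type int into allocated storage, and whether the call is nonetheless undefined is precisely what is being asked, so no particular result is claimed.

```c
#include <stdlib.h>

/* Function g exactly as in the question. */
int g(int (*p)[10], int (*q)[10]) {
    (*p)[3] = 1;
    (*q)[4] = 2;
    return (*p)[3];
}

/* Hypothetical caller: two overlapping "arrays of 10 int" carved out of
   one malloc'd block of 11 ints. p starts one element after q, so
   (*p)[3] and (*q)[4] both refer to buf[4]. */
int call_g_overlapping(void) {
    int *buf = malloc(11 * sizeof *buf);
    if (buf == NULL)
        return -1;                      /* allocation failed */
    int (*q)[10] = (int (*)[10])buf;
    int (*p)[10] = (int (*)[10])(buf + 1);
    int r = g(p, q);
    free(buf);
    return r;
}
```

A compiler that re-reads (*p)[3] makes this return 2, while one that applies the transformation shown in the assembly for g makes it return 1.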