If one is interested in knowing which constructs the clang and gcc optimizers will process reliably without -fno-strict-aliasing, rather than assuming that everything defined by the Standard will be processed meaningfully, one should be aware that both clang and gcc will sometimes ignore changes to active/effective types made by operations, or sequences of operations, that would not affect the bit pattern in a region of storage.
As an example:
    #include <limits.h>
    #include <string.h>

    #if LONG_MAX == LLONG_MAX
    typedef long long longish;
    #elif LONG_MAX == INT_MAX
    typedef int longish;
    #endif

    __attribute((noinline))
    long test(long *p, int index, int index2, int index3)
    {
        if (sizeof (long) != sizeof (longish))
            return -1;
        p[index] = 1;
        ((longish*)p)[index2] = 2;
        longish temp2 = ((longish*)p)[index3];
        p[index3] = 5; // This should modify p[index3] and set its active/effective type
        p[index3] = temp2; // Shouldn't (but seems to) reset effective type to longish
        long temp3;
        memmove(&temp3, p+index3, sizeof (long));
        memmove(p+index3, &temp3, sizeof (long));
        return p[index];
    }

    #include <stdio.h>
    int main(void)
    {
        long arr[1] = {0};
        long temp = test(arr, 0, 0, 0);
        printf("%ld should equal %ld\n", temp, arr[0]);
    }
While gcc happens to process this code correctly on 32-bit ARM (even if one uses the flag -mcpu=cortex-m3 to avoid the call to memmove), clang processes it incorrectly on both 32-bit ARM and x64. Interestingly, while clang makes no attempt to reload p[index], gcc does reload it on both platforms, but the code it generates for test on x64 is:
    test(long*, int, int, int):
            movsx   rsi, esi
            movsx   rdx, edx
            lea     rax, [rdi+rsi*8]
            mov     QWORD PTR [rax], 1
            mov     rax, QWORD PTR [rax]
            mov     QWORD PTR [rdi+rdx*8], 2
            ret
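This code writes the value 1 to p[index], then reads p[index], stores 2 to p[index2], and returns the value just read from p[index]. Because the load happens before the store through p[index2], the function returns 1 when index and index2 are equal, as they are in main, even though arr[0] holds 2 afterwards.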
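To make the mismatch concrete, here is a rough C model of the two behaviours. It is only a sketch, not compiler output: test_as_compiled, test_reference and check_models are hypothetical names introduced for illustration. Every access uses plain long, so the sketch itself involves no type punning. test_as_compiled mirrors the order of the three memory instructions in the x64 listing, and test_reference performs the operations of test in source order, leaving out the memmove round trip on the assumption that it leaves the stored bits unchanged.

```c
#include <assert.h>

/* Hypothetical rendering of the x64 listing: the load of p[index]
   is performed BEFORE the store through p[index2]. */
long test_as_compiled(long *p, int index, int index2)
{
    p[index] = 1;            /* mov QWORD PTR [rax], 1         */
    long loaded = p[index];  /* mov rax, QWORD PTR [rax]       */
    p[index2] = 2;           /* mov QWORD PTR [rdi+rdx*8], 2   */
    return loaded;           /* value read before the second store */
}

/* Hypothetical reference: the operations of test() in source order,
   with all accesses done as long (assumption: the memmove round trip
   does not change the stored value, so it is omitted here). */
long test_reference(long *p, int index, int index2, int index3)
{
    p[index] = 1;
    p[index2] = 2;
    long temp2 = p[index3];
    p[index3] = 5;
    p[index3] = temp2;       /* restores the value read above */
    return p[index];         /* read AFTER all of the stores */
}

/* Self-check for the call made in main: all indices are zero. */
void check_models(void)
{
    long a[1] = {0};
    long b[1] = {0};
    assert(test_as_compiled(a, 0, 0) == 1);  /* stale value returned */
    assert(a[0] == 2);                       /* storage really holds 2 */
    assert(test_reference(b, 0, 0, 0) == 2); /* value the source order yields */
    assert(b[0] == 2);
}
```

Calling check_models() from a main function exercises both models for the all-zero index case. The first model returns 1 while the storage holds 2, which is the discrepancy the printf in the original program would reveal.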
It's possible that memmove will scrub active/effective type on all implementations that correctly handle all of the corner cases mandated by the Standards, but it's not necessary on the -fno-strict-aliasing dialects of clang and gcc, and it's insufficient on the -fstrict-aliasing dialects processed by those compilers.