With GCC 9.4.0, the code
static inline double k(double const *restrict a) {
    return a[-1] + a[0] + a[1];
}

static void f(double const *restrict a, double *restrict b, int n) {
    for (int i = 0; i < n; ++i) {
        b[i] = k(a);
    }
}

static inline void swap(double **a, double **b) {
    double *tmp = *a;
    *a = *b;
    *b = tmp;
}

void g(double * a, double * b, int n, int nt) {
    for (int t = 0; t < nt; ++t) {
        f(a, b, n);
        swap(&a, &b);
    }
}
results in "loop versioned for vectorization because of possible aliasing" for the loop in f when compiled using gcc -march=native -Ofast -fopt-info-vec, suggesting that the restrict qualifiers in f are being ignored.
If I instead manually inline k into f:
static void f(double const *restrict a, double *restrict b, int n) {
    for (int i = 0; i < n; ++i) {
        b[i] = a[-1] + a[0] + a[1];
    }
}
the loop is vectorized without this versioning.
In both cases GCC reports that all of the functions are automatically inlined into each other.
The same occurs if I instead use __restrict and compile with g++.
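For anyone wanting to reproduce the comparison, here is a minimal sketch that puts both variants in one translation unit and checks that they compute the same values, so that the only difference left is what the vectorizer reports. The names f_helper, f_manual, g_helper, g_manual, variants_agree and MAX_N are introduced here purely for illustration and are not part of the original code; the one-element padding on each side of the buffers is an assumption needed because a[-1] and a[1] are read.

```c
/* Hypothetical upper bound on n for the fixed-size test buffers. */
enum { MAX_N = 64 };

/* Helper-based variant, as in the first listing. */
static inline double k(double const *restrict a) {
    return a[-1] + a[0] + a[1];
}

static void f_helper(double const *restrict a, double *restrict b, int n) {
    for (int i = 0; i < n; ++i) {
        b[i] = k(a);
    }
}

/* Manually inlined variant, as in the second listing. */
static void f_manual(double const *restrict a, double *restrict b, int n) {
    for (int i = 0; i < n; ++i) {
        b[i] = a[-1] + a[0] + a[1];
    }
}

static inline void swap(double **a, double **b) {
    double *tmp = *a;
    *a = *b;
    *b = tmp;
}

static void g_helper(double *a, double *b, int n, int nt) {
    for (int t = 0; t < nt; ++t) {
        f_helper(a, b, n);
        swap(&a, &b);
    }
}

static void g_manual(double *a, double *b, int n, int nt) {
    for (int t = 0; t < nt; ++t) {
        f_manual(a, b, n);
        swap(&a, &b);
    }
}

/* Returns 1 if both variants leave identical buffers, 0 otherwise
 * (including when n is outside 1..MAX_N). */
int variants_agree(int n, int nt) {
    if (n < 1 || n > MAX_N) {
        return 0;
    }
    /* One padding element before and after the n payload elements,
     * because the kernel reads a[-1] and a[1]. */
    double a1[MAX_N + 2], b1[MAX_N + 2], a2[MAX_N + 2], b2[MAX_N + 2];
    for (int i = 0; i < n + 2; ++i) {
        /* Small integer values keep every sum exact for small nt,
         * even under -Ofast reassociation. */
        a1[i] = a2[i] = (double)(i + 1);
        b1[i] = b2[i] = 0.0;
    }
    g_helper(a1 + 1, b1 + 1, n, nt);
    g_manual(a2 + 1, b2 + 1, n, nt);
    for (int i = 0; i < n + 2; ++i) {
        if (a1[i] != a2[i] || b1[i] != b2[i]) {
            return 0;
        }
    }
    return 1;
}
```

Calling variants_agree(8, 3) from a small main and compiling with the same gcc -march=native -Ofast -fopt-info-vec invocation should allow the vectorizer messages for the two loops to be compared side by side; which of them is versioned may well depend on the GCC version.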
Why does this happen? Is there a cleanly portable (using standard language features) way of causing GCC to respect the restrict qualifiers in the first case so that I can benefit from vectorization without versioning, without separating f and g into different translation units?