I know that "why is my compiler doing this" questions aren't the best kind, but this one is really bizarre to me and I'm thoroughly confused.
I had thought that std::min() was the same as the handwritten ternary (with maybe some compile-time template stuff), and it seems to compile down to the same operation when used normally. However, when I try to make a "min and sum" loop autovectorize, the two don't seem to be the same, and I would love it if someone could help me figure out why. Here is a small code example that reproduces the issue:
#pragma GCC target ("avx2")
#pragma GCC optimize ("O3")
#include <cstdio>
#include <cstdlib>
#include <algorithm>
#define N (1<<20)
char a[N], b[N];
int main() {
    // Fill both arrays with small non-negative values.
    for (int i=0; i<N; ++i) {
        a[i] = rand()%100;
        b[i] = rand()%100;
    }
    int ans = 0;
    #pragma GCC ivdep
    for (int i=0; i<N; ++i) {
        //ans += std::min(a[i], b[i]);
        ans += a[i]<b[i] ? a[i] : b[i];  // handwritten min
    }
    printf("%d\n", ans);
}
I compile this with gcc 9.3.0, using the command g++ -o test test.cpp -ftree-vectorize -fopt-info-vec-missed -fopt-info-vec-optimized -funsafe-math-optimizations.
With the code above as is, the compiler reports:
test.cpp:19:17: optimized: loop vectorized using 32 byte vectors
In contrast, if I comment out the ternary and uncomment the std::min line, I get this:
test.cpp:19:17: missed: couldn't vectorize loop
test.cpp:20:35: missed: statement clobbers memory: _9 = std::min<char> (_8, _7);
So std::min() seems to be doing something unusual that prevents gcc from recognizing that it is just a min operation. Is this behavior required by the standard? Or is it an implementation shortcoming? Or is there some compile flag that would make this work?
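To make the comparison concrete, here is a minimal sketch of the two loop bodies pulled out into standalone functions. The function names (sum_min_std, sum_min_ternary, self_check) are hypothetical and only for illustration; they are not in my test program. The one difference I am aware of is that std::min takes its arguments by const reference and returns a const reference, while the ternary works directly on values. I am only assuming that this might be relevant here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: sums element-wise minima using std::min,
// which takes const char& arguments and returns const char&.
int sum_min_std(const char* a, const char* b, std::size_t n) {
    int ans = 0;
    for (std::size_t i = 0; i < n; ++i) {
        ans += std::min(a[i], b[i]);
    }
    return ans;
}

// Hypothetical helper: the same reduction with a handwritten ternary,
// which operates on the char values directly.
int sum_min_ternary(const char* a, const char* b, std::size_t n) {
    int ans = 0;
    for (std::size_t i = 0; i < n; ++i) {
        ans += a[i] < b[i] ? a[i] : b[i];
    }
    return ans;
}

// Sanity check that both versions compute the same value on a tiny input.
void self_check() {
    const char x[3] = {4, 1, 7};
    const char y[3] = {2, 5, 7};
    assert(sum_min_std(x, y, 3) == sum_min_ternary(x, y, 3));
}
```

Both functions return the same result for the same input, so the difference I am asking about is purely in how gcc treats the loop during vectorization, not in what it computes.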