I'm developing some parallel C++ simulation code which I want to vectorise as effectively as possible. This is why I use both template parameters and OpenMP SIMD directives:
- Template parameters let the compiler resolve some of the conditions that could occur inside the most critical loops at compile time, removing the corresponding branches altogether.
- OpenMP SIMD directives force the compiler to generate vectorised code.
A (stupid) example of what I mean could be as follows:
    #include <iostream>

    template< bool checkNeeded >
    int ratio( double *res, double *num, double *denom, int n ) {
        #pragma omp simd
        for ( int i = 0; i < n; i++ ) {
            if ( checkNeeded ) { // dead code removed by the compiler when template is false
                if ( denom[i] == 0 ) { // test the current element, not the pointer
                    std::cout << "Houston, we've got a problem\n";
                    return i;
                }
            }
            res[i] = num[i] / denom[i];
        }
        return n;
    }
Overall, this works great. The trouble is that in the (very rare) cases where I want to use the `ratio<true>()` version of the code, that version has also been vectorised by the compiler because of the `#pragma omp simd` directive and, due to the tests, printing and early exits from the loop, it ends up way slower than the non-vectorised version.
So what I'd need is an `if` clause on my `simd` directive, telling the compiler when to obey the directive. That would give something like this:

    #pragma omp simd if( checkNeeded == false )
Unfortunately, although such `if` clauses are supported for numerous OpenMP directives, they are not for the `simd` one. I don't think my request is completely stupid, so I wonder why this is so, and whether it is likely to be supported in the future.
Does anybody know about that?
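
For reference, the effect I am after can be approximated by hand with two separate loops, as in the minimal sketch below. The name `ratio_dispatch` is made up for this illustration and the `const` qualifiers on the inputs are my addition; it assumes the compiler discards the dead branch in each instantiation, as it does for the check in the example above.

```cpp
#include <iostream>

// Hypothetical helper: hand-written equivalent of the wished-for
// "#pragma omp simd if( checkNeeded == false )".
template< bool checkNeeded >
int ratio_dispatch( double *res, const double *num, const double *denom, int n ) {
    if ( checkNeeded ) {
        // Checked version: plain scalar loop, no simd directive,
        // so the test, the printing and the early exit are allowed.
        for ( int i = 0; i < n; i++ ) {
            if ( denom[i] == 0 ) {
                std::cout << "Houston, we've got a problem\n";
                return i; // index of the first zero denominator
            }
            res[i] = num[i] / denom[i];
        }
    } else {
        // Unchecked version: vectorisation is requested explicitly.
        #pragma omp simd
        for ( int i = 0; i < n; i++ ) {
            res[i] = num[i] / denom[i];
        }
    }
    return n;
}
```

This behaves as wanted, but it duplicates the loop body, which is exactly what an `if` clause on the directive would avoid. With C++17, `if constexpr` would state the compile-time dispatch more explicitly.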