The current OpenMP standard says the following about the declare simd directive for C/C++:
The use of a declare simd construct on a function enables the creation of SIMD versions of the associated function that can be used to process multiple arguments from a single invocation in a SIMD loop concurrently.
More details are given in the chapter, but there seems to be no restriction on the kind of function the directive can be applied to.
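For illustration, here is a minimal sketch of how the directive is typically combined with a simd loop. The names scale_and_shift and sum_scaled are hypothetical and exist only for this example. The function is deliberately not inline, so this shows the ordinary case rather than the one asked about.

```cpp
// Hypothetical function, for illustration only.
// The directive asks the compiler to also generate SIMD versions
// of this function, in addition to the scalar one.
#pragma omp declare simd
double scale_and_shift( double x ) {
    return 2.0 * x + 1.0;
}

// Hypothetical caller: the simd loop may call a SIMD version of
// scale_and_shift and process several elements of v per call.
double sum_scaled( const double *v, int len ) {
    double sum = 0;
    #pragma omp simd reduction( +: sum )
    for ( int i = 0; i < len; i++ ) {
        sum += scale_and_shift( v[i] );
    }
    return sum;
}
```

Without -fopenmp (or an equivalent flag) the pragmas are simply ignored and the code runs as plain scalar C++, so the result does not depend on whether the SIMD versions are actually used.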
So my question is: can this directive be applied safely to an inline function?
I'm asking for two reasons:
- An inline function is a rather unusual function, since it is normally inlined directly at the place where it is called. It is therefore likely never compiled as a standalone function, which makes its declare simd aspect largely redundant with a possible simd directive at the enclosing loop's level.
- I have code with such inline declare simd functions, and sometimes, for reasons I don't understand, GCC complains about their multiple definitions at link time (with names mangled with extra characters suggesting that these are vectorised versions). But if I remove the declare simd directive, it compiles and links fine.
So far I hadn't thought too much about it, but now I'm puzzled. Is this a bug of mine (i.e. using declare simd for inline functions), or is it a problem in GCC, which generates binary vectorised versions of inline functions and then fails to sort them out at link time?
EDIT:
There is a GCC compiler option that makes a difference. When inlining is enabled (with -O3 for example), the code compiles and links fine. But when it is compiled with -O0 or with -O3 -fno-inline, inlining is disabled and linking fails with a "multiple definition of" error for the function decorated with the omp declare simd directive.
EDIT 2:
Thanks to @Zboson's questions regarding the compiler flags, I managed to create a reproducer. Here it is:
foobar.h:
#ifndef FOOBAR_H_
#define FOOBAR_H_
#include <cmath>
#pragma omp declare simd
inline double foo( double d ) {
    return sin( cos( exp( d ) ) );
}
double bar( double *v, int len );
#endif
foobar.cc:
#include "foobar.h"
double bar( double *v, int len ) {
    double sum = 0;
    for ( int i = 0; i < len; i++ ) {
        sum += foo( v[i] );
    }
    return sum;
}
simd.cc:
#include <iostream>
#include "foobar.h"
int main() {
    const int len = 100;
    double *v = new double[len];
    for ( int i = 0; i < len; i++ ) {
        v[i] = i;
    }
    double sum = 0;
    #pragma omp simd reduction( +: sum )
    for ( int i = 0; i < len; i++ ) {
        sum += foo( v[i] );
    }
    std::cout << sum << " " << bar( v, len ) << std::endl;
    delete[] v;
    return 0;
}
compilation:
> g++ -fopenmp -g simd.cc foobar.cc
/tmp/ccI4e7ip.o: In function `_ZGVbN2v__Z3food':
foobar.h:7: multiple definition of `_ZGVbN2v__Z3food'
/tmp/cc4U8Qyu.o:foobar.h:7: first defined here
/tmp/ccI4e7ip.o: In function `_ZGVbM2v__Z3food':
foobar.h:7: multiple definition of `_ZGVbM2v__Z3food'
/tmp/cc4U8Qyu.o:foobar.h:7: first defined here
/tmp/ccI4e7ip.o: In function `_ZGVcN4v__Z3food':
foobar.h:7: multiple definition of `_ZGVcN4v__Z3food'
/tmp/cc4U8Qyu.o:foobar.h:7: first defined here
/tmp/ccI4e7ip.o: In function `_ZGVcM4v__Z3food':
foobar.h:7: multiple definition of `_ZGVcM4v__Z3food'
foobar.h:7: first defined here
/tmp/ccI4e7ip.o: In function `_ZGVdN4v__Z3food':
foobar.h:7: multiple definition of `_ZGVdN4v__Z3food'
foobar.h:7: first defined here
/tmp/ccI4e7ip.o: In function `_ZGVdM4v__Z3food':
foobar.h:7: multiple definition of `_ZGVdM4v__Z3food'
foobar.h:7: first defined here
collect2: error: ld returned 1 exit status
> c++filt _ZGVdM4v__Z3food
_ZGVdM4v__Z3food
> c++filt _Z3food
foo(double)
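c++filt does not decode the prefixed names, but they appear to follow the x86 vector function ABI naming scheme: the prefix _ZGV, one letter for the instruction set (b, c and d seem to correspond to SSE, AVX and AVX2), M or N for masked or unmasked, the number of lanes, one letter per parameter (v for a vector argument), then an underscore and the scalar function's mangled name. The sketch below splits a name along those lines. It assumes that layout; SimdVariant and parse_simd_variant are hypothetical helpers written for this illustration, not part of GCC or any library.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper type, for illustration only.
struct SimdVariant {
    bool        ok = false;      // true if the name matched the assumed layout
    char        isa = 0;         // ISA letter ('b', 'c' or 'd' in the output above)
    bool        masked = false;  // 'M' = masked variant, 'N' = unmasked
    int         lanes = 0;       // number of elements processed per call
    std::string params;          // one code per parameter, 'v' = vector
    std::string scalar_name;     // mangled name of the scalar function
};

// Splits a name such as "_ZGVdM4v__Z3food" into its assumed fields.
SimdVariant parse_simd_variant( const std::string &name ) {
    SimdVariant r;
    const std::string prefix = "_ZGV";
    if ( name.compare( 0, prefix.size(), prefix ) != 0 ) return r;
    std::size_t pos = prefix.size();
    if ( pos + 2 > name.size() ) return r;
    r.isa = name[pos++];
    const char mask = name[pos++];
    if ( mask != 'M' && mask != 'N' ) return r;
    r.masked = ( mask == 'M' );
    // Read the decimal lane count.
    while ( pos < name.size() &&
            std::isdigit( static_cast<unsigned char>( name[pos] ) ) ) {
        r.lanes = r.lanes * 10 + ( name[pos++] - '0' );
    }
    if ( r.lanes == 0 ) return r;
    // Everything up to the next underscore describes the parameters;
    // everything after it is the scalar function's own mangled name.
    const std::size_t sep = name.find( '_', pos );
    if ( sep == std::string::npos ) return r;
    r.params = name.substr( pos, sep - pos );
    r.scalar_name = name.substr( sep + 1 );
    r.ok = !r.scalar_name.empty();
    return r;
}
```

Under that assumption, _ZGVdM4v__Z3food would be the masked, 4-lane variant for ISA d of _Z3food, i.e. of foo(double), which is consistent with these symbols being the vectorised versions of the inline function.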
GCC versions 4.9.2 and 5.1.0 both exhibit the very same problem, while the Intel compiler version 15.0.3 compiles it just fine.
Final edit:
Hristo Iliev's comment and Z boson's question reassure me that my code is OpenMP compliant and that this is a bug in GCC. I'll run further tests with the most up-to-date version I can find, and report it if needed.