The following question is condensed from a much larger code base, so some expressions may look like overkill or seem unnecessary, but they are crucial to the original code.
Consider a struct that contains compile-time constants, and a simple container class:
template<typename T> struct CONST
{
    static constexpr T ONE()
    {
        return static_cast<T>( 1 );
    }
};

template<typename T> class Container
{
public:
    using value_type = T;
    T value;
};
Now consider a template function that has a "specialization" for types offering a value_type:
template<typename T> void doSomething( const typename T::value_type& rhs )
{}
I would expect the following to work:
template<typename T> class Tester
{
public:
    static constexpr T ONE = CONST<T>::ONE();

    void test()
    {
        doSomething<Container<T>>( ONE );
    }
};
The interesting point is that the compiler does not complain about the definition of Tester<T>::ONE, but about its usage. Furthermore, it does not complain if I use CONST<T>::ONE() or even static_cast<T>( ONE ) instead of ONE in the function call. However, both should be known at compile time and therefore be usable.
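The two variants that do build can be sketched as follows. This is only an illustration, not the original code: passThrough, viaCast, viaCall and checkVariants are hypothetical names introduced here so that the result is observable. In both variants the reference parameter binds to a temporary, so ONE itself is only read as a value.

```cpp
#include <cassert>

template<typename T> struct CONST
{
    static constexpr T ONE()
    {
        return static_cast<T>( 1 );
    }
};

template<typename T> class Container
{
public:
    using value_type = T;
    T value;
};

// Hypothetical stand-in for doSomething that returns its argument,
// so the caller can check what was passed.
template<typename T>
typename T::value_type passThrough( const typename T::value_type& rhs )
{
    return rhs;
}

template<typename T> class Tester
{
public:
    static constexpr T ONE = CONST<T>::ONE();

    // The cast yields a temporary copy of ONE; the reference binds to that copy.
    T viaCast() const
    {
        return passThrough<Container<T>>( static_cast<T>( ONE ) );
    }

    // The reference binds to the temporary returned by the constexpr function.
    T viaCall() const
    {
        return passThrough<Container<T>>( CONST<T>::ONE() );
    }
};

// Small self-check: both variants deliver the constant's value.
inline void checkVariants()
{
    Tester<int> tester;
    assert( tester.viaCast() == 1 );
    assert( tester.viaCall() == 1 );
}
```

Calling checkVariants() from main is enough to exercise both paths; neither of them passes ONE itself by reference.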
So my first question is: in the cases where it works, does the compiler actually do the calculations at compile time?
I checked it with the g++-5, g++-6 and clang-3.8 compilers using the -std=c++14 flag. They all complain:
undefined reference to `Tester<int>::ONE'
although all features used are, as far as I know, in the standard and should therefore be supported. Interestingly, the build succeeds as soon as I add an optimization flag (-O1, -O2 or -O3). So my second question is: is there a strategy by which the compiler does compile-time calculations only if optimization flags are active? I would have expected that at least things declared as compile-time constants are always evaluated!
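For reference, one commonly cited reading of this linker error, offered here as an assumption rather than something established in this question, is that binding ONE to a const reference odr-uses it, and that under C++14 rules an odr-used static constexpr data member needs a definition at namespace scope in addition to its in-class declaration. Under that reading, optimization would merely remove the reference to the symbol rather than change what is evaluated at compile time. A minimal sketch of the extra definition follows; passThrough, viaMember and checkMember are hypothetical names introduced only for the illustration.

```cpp
#include <cassert>

template<typename T> struct CONST
{
    static constexpr T ONE()
    {
        return static_cast<T>( 1 );
    }
};

template<typename T> class Container
{
public:
    using value_type = T;
    T value;
};

// Hypothetical stand-in for doSomething that returns its argument.
template<typename T>
typename T::value_type passThrough( const typename T::value_type& rhs )
{
    return rhs;
}

template<typename T> class Tester
{
public:
    // In-class declaration with initializer.
    static constexpr T ONE = CONST<T>::ONE();

    // Passes ONE itself by const reference, as in the failing call.
    T viaMember() const
    {
        return passThrough<Container<T>>( ONE );
    }
};

// Namespace-scope definition without an initializer. Under C++14 rules this
// provides the object that the reference binds to.
template<typename T> constexpr T Tester<T>::ONE;

// Small self-check: the member can now be passed by reference directly.
inline void checkMember()
{
    Tester<int> tester;
    assert( tester.viaMember() == 1 );
}
```

Since C++17, static constexpr data members are implicitly inline, so this separate definition is only needed for earlier standards. Whether it has any effect in nvcc device code is not covered by this sketch.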
The last part of my question covers the NVIDIA nvcc compiler (version 8.0). As I can only pass -std=c++11 to it, some features may not be covered at all. However, with either of the host compilers above, it complains:
error: identifier "Tester<int> ::ONE" is undefined in device code
even if an optimization flag is passed! This is obviously the very same problem as above. However, while the questions above are more academic (because I can simply use an optimization flag to get rid of the problem), here it is a real problem: I do not know what is done at compile time when I use the workarounds mentioned above, and they are also uglier. So my third question is: is there a way of using the optimizations in device code as well?
The following code is an MWE for pure host and also for the nvcc compiler:
#include <iostream>
#include <cstdlib>

#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

template<typename T> struct CONST
{
    HD static constexpr T ONE()
    {
        return static_cast<T>( 1 );
    }
};

template<typename T> class Container
{
public:
    using value_type = T;
    T value;
};

template<typename T> HD void doSomething( const typename T::value_type& rhs ) {}

template<typename T> class Tester
{
public:
    static constexpr T ONE = CONST<T>::ONE();

    HD void test()
    {
        doSomething<Container<T>>( ONE );
        // doSomething<Container<T>>( static_cast<T>( ONE ) );
        // doSomething<Container<T>>( CONST<T>::ONE() );
    }
};

int main()
{
    using t = int;
    Tester<t> tester;
    tester.test();
    return EXIT_SUCCESS;
}
Thanks in advance!