
The following question is condensed from a much larger code base. Some expressions may therefore look like overkill or unnecessary, but they are crucial in the original code.

Consider a struct that provides compile-time constants, and a simple container class:

template<typename T> struct CONST
{
    static constexpr T ONE()
    {
        return static_cast<T>( 1 );
    }
};

template<typename T> class Container
{
public:
    using value_type = T;
    T value;
};

Now consider a function template that acts as a "specialization" for types offering a value_type:

template<typename T> void doSomething( const typename T::value_type& rhs )
{}

I would expect the following to work:

template<typename T> class Tester
{
public:
    static constexpr T ONE = CONST<T>::ONE();

    void test()
    {
        doSomething<Container<T>>( ONE );
    }
};

The interesting point is that the compiler does not complain about the definition of Tester<T>::ONE, only about its usage. It also does not complain if I pass CONST<T>::ONE() or even static_cast<T>( ONE ) instead of ONE in the function call. However, both should be known at compile time and therefore be usable. So my first question is: in the cases where it works, does the compiler actually do the calculations at compile time?

I checked it with g++-5, g++-6 and clang-3.8, using the -std=c++14 flag. They all complain:
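One way to check whether an expression can really be evaluated at compile time is to use it where the language requires a constant expression, for example in a static_assert, as a template argument, or as the initializer of a constexpr variable. The following is only a minimal sketch that reuses CONST from above; the names OneAsType and one_checked are made up for this illustration:

```cpp
#include <type_traits>

template<typename T> struct CONST
{
    static constexpr T ONE()
    {
        return static_cast<T>( 1 );
    }
};

// The operand of static_assert must be a constant expression, so this line
// only compiles if CONST<int>::ONE() can be evaluated at compile time.
static_assert( CONST<int>::ONE() == 1, "ONE() is not a constant expression" );

// Non-type template arguments must be constant expressions as well.
// OneAsType is a hypothetical alias that exists only for this check.
using OneAsType = std::integral_constant<int, CONST<int>::ONE()>;

// Initializing a constexpr variable also forces compile-time evaluation.
// one_checked is a hypothetical name.
constexpr double one_checked = CONST<double>::ONE();
```

In an ordinary function argument such as doSomething<Container<T>>( CONST<T>::ONE() ), the compiler is allowed, but not required, to fold the call at compile time. The checks above only show that it is able to.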

undefined reference to `Tester<int>::ONE'

although all features used are, as far as I know, part of the standard and should therefore be supported. Interestingly, compilation succeeds as soon as I add an optimization flag (-O1, -O2 or -O3). So my second question is: is it a strategy of the compiler to do compile-time calculations only when optimization flags are active? I would have expected that at least things declared as compile-time constants are always evaluated at compile time!

The last part of my question concerns the NVIDIA nvcc compiler (version 8.0). As I can only pass -std=c++11 to it, some features may not be supported at all. However, with any of the host compilers above, it complains:

error: identifier "Tester<int> ::ONE" is undefined in device code

even if an optimization flag is passed! This is obviously the very same problem as above. But while the questions above are rather academic (because I can simply use an optimization flag to get rid of the problem), here it is a real problem: I do not know what is done at compile time when I use the workarounds mentioned above, and they are also uglier. So my third question is: is there a way to get these optimizations in device code as well?

The following code is an MWE that works for pure host compilation as well as for nvcc:

#include <iostream>
#include <cstdlib>

#ifdef __CUDACC__
    #define HD __host__ __device__
#else
    #define HD
#endif


template<typename T> struct CONST
{
    HD static constexpr T ONE()
    {
        return static_cast<T>( 1 );
    }
};


template<typename T> class Container
{
public:
    using value_type = T;
    T value;
};


template<typename T> HD void doSomething( const typename T::value_type& rhs ) {}


template<typename T> class Tester
{
public:
    static constexpr T ONE = CONST<T>::ONE();

    HD void test()
    {
        doSomething<Container<T>>( ONE );
        // doSomething<Container<T>>( static_cast<T>( ONE ) );
        // doSomething<Container<T>>( CONST<T>::ONE() );
    }
};


int main()
{
    using t = int;

    Tester<t> tester;
    tester.test();

    return EXIT_SUCCESS;
}

Thanks in advance!

Barry
marlam
  • regarding the first part of your question see http://stackoverflow.com/questions/8452952/c-linker-error-with-class-static-constexpr – m.s. Sep 22 '16 at 15:07

1 Answer


The difference between this:

doSomething<Container<T>>( ONE );

as opposed to these two:

doSomething<Container<T>>( static_cast<T>( ONE ) );
doSomething<Container<T>>( CONST<T>::ONE() );

is that in the first case you're binding a reference directly to ONE, and in the other two you are not. More specifically, you are odr-using ONE in the first case but not in the other two. When you odr-use an entity, it needs a definition, and ONE is currently declared but not defined.
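The distinction can be sketched in isolation. This is only an illustration under C++14 rules; Holder, byValue and byReference are hypothetical names that do not appear in the question:

```cpp
struct Holder   // hypothetical stand-in for Tester<T>
{
    // In-class declaration with an initializer. Before C++17 this is
    // a declaration only, not a definition.
    static constexpr int ONE = 1;
};

// Takes its argument by value: only the value of the argument is needed.
constexpr int byValue( int x ) { return x; }

// Takes its argument by const reference: the reference has to bind to an object.
constexpr int byReference( const int& x ) { return x; }

// Not an odr-use: the value of Holder::ONE is read immediately
// (lvalue-to-rvalue conversion), so no definition is required.
constexpr int a = byValue( Holder::ONE );

// Not an odr-use either: the cast produces a temporary, and the reference
// binds to that temporary rather than to Holder::ONE itself.
constexpr int b = byReference( static_cast<int>( Holder::ONE ) );

// byReference( Holder::ONE ) would bind the reference directly to Holder::ONE.
// That is an odr-use, so before C++17 it needs an out-of-class definition.
```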

You need to define it:

template<typename T>
class Tester
{
public:
    // declaration
    static constexpr T ONE = CONST<T>::ONE();
    // ..
};

// definition
template <typename T>
constexpr T Tester<T>::ONE;
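Put together with the types from the question, the fix might look like the sketch below. As an assumption made purely so the result can be checked, doSomething and test() are constexpr and return the value here, whereas the originals return void:

```cpp
template<typename T> struct CONST
{
    static constexpr T ONE()
    {
        return static_cast<T>( 1 );
    }
};

template<typename T> class Container
{
public:
    using value_type = T;
    T value;
};

// Illustrative variant of doSomething: it still takes a const reference,
// but returns the referenced value (the original returns void).
template<typename T>
constexpr typename T::value_type doSomething( const typename T::value_type& rhs )
{
    return rhs;
}

template<typename T> class Tester
{
public:
    // declaration
    static constexpr T ONE = CONST<T>::ONE();

    // Illustrative variant of test(): binds the reference directly to ONE,
    // which odr-uses it.
    constexpr T test() const
    {
        return doSomething<Container<T>>( ONE );
    }
};

// definition, required before C++17 because ONE is odr-used
template<typename T>
constexpr T Tester<T>::ONE;
```

Since C++17, static constexpr data members are implicitly inline, so the in-class declaration is already a definition and the out-of-class line becomes redundant (it is still accepted, but deprecated).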
Barry
  • Thx. This is definitely the answer for the CPU case. But unless I am doing something terribly wrong, it still does not work with the `nvcc` compiler... The situation should be the same, shouldn't it? – marlam Sep 22 '16 at 15:31
  • I just realized: why did it work with optimization flags? Are they just guessing right (perhaps potentially also wrong in more complicated cases?) or is something else happening? – marlam Sep 22 '16 at 15:59
  • @marlam Can't help you with nvcc, don't know. It shouldn't just ever work, and could actually be ill-formed NDR like most other odr violations. – Barry Sep 22 '16 at 16:44
  • For the CUDA case, I think this is a real parser problem. The constexpr is not getting correctly propagated into the device compilation trajectory. But the point regarding the definition in this answer is perfectly correct. – talonmies Sep 23 '16 at 15:23
  • I am marking this answer as accepted, despite the unresolved nvcc problem. But as these turned out to be different problems (which I was not aware of when asking), it will be best to re-ask the nvcc question with a clearer focus on that particular problem. Thanks again for your fast and nice answer! – marlam Sep 26 '16 at 08:43