
I have a project that requires C++11, so I separate the files into two categories: those that use C++11, and those that use C++03 and hence are compatible with the nvcc compiler. When a kernel is not a template function, it is easy to load the module with cuModuleLoadDataEx and look up the function by name with cuModuleGetFunction. However, when the kernel is a template, the function name is mangled after explicit instantiation. This makes it difficult to obtain a handle to the function after loading the module using the CUDA Driver API. For example, consider this function.

template <class T, class SizeType>
__global__ void
vector_add(const T* a, const T* b, T* c, const SizeType dim)
{
    const SizeType i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < dim) { c[i] = a[i] + b[i]; }
}

After I compile it into PTX code, the mangled name is _Z10vector_addIfjEvPKT_S2_PS0_T0_. How can I find and load template kernel functions easily from my host code, without manually finding them in the file and copying their names?

Nicol Bolas
void-pointer
  • I suppose you can create wrapper functions that explicitly instantiate each version of the template that you need and put the types into the function names of the wrapper functions. – Roger Dahl May 03 '12 at 20:04
  • That's true, but then don't I lose the ability to do JIT compilation with the PTX code? I would still need to know the name of the function to retrieve a handle to it after using `cuModuleLoadDataEx`. – void-pointer May 03 '12 at 20:09
  • Reply to myself: No, you don't. You can use the `--ptxas-options` flag to specify the same JIT options during the build process. I still wonder if there's a more elegant solution, though. – void-pointer May 03 '12 at 20:13
  • @RogerDahl Yes I know, but the most important reason I am using templates is because I don't know the types over which the functions will be parametrized (and they are not going to be simple primitives). – void-pointer Jul 02 '12 at 17:47
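
One possible way to avoid copying mangled names by hand is to scan the PTX text for `.entry` symbols, demangle each one, and pick the symbol whose readable signature matches the instantiation you want. The following is only a minimal sketch under assumptions: it relies on `abi::__cxa_demangle` from `<cxxabi.h>` (available with GCC and Clang), it assumes the PTX names follow the same Itanium-style mangling as the name quoted in the question, and `demangle` and `find_kernel` are hypothetical helper names introduced here for illustration.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <sstream>
#include <string>

// Hypothetical helper: demangle an Itanium-ABI symbol.
// Returns an empty string if the symbol cannot be demangled.
std::string demangle(const std::string& mangled)
{
    int status = 0;
    char* raw = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    std::string result = (status == 0 && raw != nullptr) ? raw : "";
    std::free(raw);  // the buffer is malloc-allocated; free(nullptr) is a no-op
    return result;
}

// Hypothetical helper: walk the PTX text token by token, and for every
// token that follows ".entry", return the first mangled symbol whose
// demangled form contains `wanted`. Returns an empty string if none match.
std::string find_kernel(const std::string& ptx, const std::string& wanted)
{
    std::istringstream in(ptx);
    std::string token;
    while (in >> token) {
        if (token != ".entry") continue;
        std::string symbol;
        if (!(in >> symbol)) break;
        // The parameter list usually opens right after the name: "name("
        symbol = symbol.substr(0, symbol.find('('));
        if (demangle(symbol).find(wanted) != std::string::npos) {
            return symbol;
        }
    }
    return "";
}
```

With the PTX for the kernel above loaded into a string, `find_kernel(ptx, "vector_add<float, unsigned int>")` would be expected to return `_Z10vector_addIfjEvPKT_S2_PS0_T0_`, which can then be passed to `cuModuleGetFunction`. Matching on a substring of the demangled signature is a simplification; a stricter comparison may be needed when several instantiations share a prefix.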

1 Answer

> I have a project that requires C++11.

This must be a joke: your program requires a prototype compiler ... You did not mention which compiler you are using, but it looks like gcc.

Know your compiler

I'm pretty sure the CUDA part of your project does not require C++11. Put it alongside the C++03 files and build as usual, and if necessary package it as a library that links with the executable generated by the C++11 prototype compiler. That is the state of the art.

Gold