CUDA and C++, how to link cpp and cu files?

Question

//Header file A.h
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
class A
{
public:
   __host__ __device__ void move();
};

//cu file A.cu
#include "A.h"
__host__ __device__ void A::move()
{
   ...
}

When I call the method defined in A.cu from another .cu file, I get the following error:

External calls are not supported (found non-inlined call to ...)

I'm using the sm_10 compile option.

Does this post [CUDA External calls not supported](http://stackoverflow.com/questions/5994005/cuda-external-calls-not-supported) already provide you with useful information? — Vitality, Sep 19 '13 at 08:36
After setting -rdc=true and sm_20 I'm getting: Undefined reference to '_ZNmoveEv' — Max, Sep 19 '13 at 09:18
You cannot use compilation for compute capability `2.0` with relocatable code when you have a card with compute capability `1.0`... — Vitality, Sep 19 '13 at 09:22
I checked my card info and you are right: my card doesn't support compute capability 2.0. Thanks. — Max, Sep 19 '13 at 09:29
@JackOLantern can you summarize the assistance you gave into an answer? I would upvote it. — Robert Crovella, Sep 19 '13 at 12:20
I presume A.cpp is actually A.cu. The host C++ compiler can't compile `__device__` and `__host__` decorators; only nvcc understands how to parse those. — talonmies, Sep 19 '13 at 13:44
@talonmies You are right. I have edited the post and fixed the misprint. — Vitality, Sep 19 '13 at 13:51
A device method cannot be defined in a .cpp file unless specific compile behavior override switches are passed to nvcc. The posting is still unclear on this point. — Robert Crovella, Sep 19 '13 at 13:58
This is a cpp file! If I implement the function in the header it works, so why can't I implement the function in a cpp file? — Max, Sep 19 '13 at 15:55
nvcc takes (by default) .cpp files and hands them directly off to the host compiler (i.e. gcc, cl.exe, etc.) The host compiler knows nothing about `__device__` and `__host__` and should emit a syntax error when you attempt to compile. I'm simply repeating here the comment that @talonmies has already made above. The easiest solution is to define your device functions in .cu files. — Robert Crovella, Sep 19 '13 at 16:00
I guess my compiler is smarter than yours ;) I left the implementation in the .h file and got no syntax error; it even runs :) Thank you. — Max, Sep 19 '13 at 16:08
Sure. A .h file included in a .cu file is fine. A .h file with device code included in a .cpp file would ordinarily be a problem. — Robert Crovella, Sep 19 '13 at 20:19

score 3 · Accepted Answer · answered Sep 19 '13 at 13:49

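You need separate compilation (relocatable device code), so that a `__device__` function defined in one .cu file can be called from another. Separate compilation requires a card with compute capability 2.0 or higher and CUDA 5.0 or later.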

Quoting the CUDA 5.0 Release Highlights:

All `__device__` functions can now be separately compiled and linked using NVCC. This allows creation of closed-source static libraries of `__device__` functions and the ability for these libraries to call user-defined `__device__` callback functions. The linker support is considered to be a BETA feature in this release.

Separate compilation is described in Chapter 7 of the CUDA COMPILER DRIVER NVCC Reference Guide.
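As a rough illustration of what separate compilation looks like for the layout in the question, here is a minimal sketch. The three files are shown in one block, separated by comments. The data member `x`, the body of `move()`, the kernel name `step`, and the output name `app` are hypothetical and are not from the original post. The build lines assume a GPU of compute capability 2.0; newer toolkits have dropped `sm_20`, so substitute the architecture of your own card.

```cuda
// ---- A.h ----
#pragma once
#include <cuda_runtime.h>

class A
{
public:
    float x;                          // hypothetical state, for illustration only
    __host__ __device__ void move();  // declared here, defined in A.cu
};

// ---- A.cu ----
#include "A.h"

__host__ __device__ void A::move()
{
    x += 1.0f;                        // placeholder body (assumption)
}

// ---- main.cu ----
#include <cstdio>
#include "A.h"

// Kernel in a different translation unit than A::move().
// Without relocatable device code this call is what triggers
// "External calls are not supported".
__global__ void step(A* a)
{
    a->move();
}

int main()
{
    A h;
    h.x = 0.0f;

    A* d = NULL;
    cudaMalloc((void**)&d, sizeof(A));
    cudaMemcpy(d, &h, sizeof(A), cudaMemcpyHostToDevice);

    step<<<1, 1>>>(d);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(&h, d, sizeof(A), cudaMemcpyDeviceToHost);
    cudaFree(d);

    printf("x = %f\n", h.x);
    return 0;
}

// ---- build (assumed command lines; adjust -arch to your GPU) ----
//   nvcc -arch=sm_20 -dc A.cu main.cu     # -dc is shorthand for -rdc=true -c
//   nvcc -arch=sm_20 A.o main.o -o app    # nvcc also performs the device link
```

If the objects are compiled with `-rdc=true` but the final link is done by the host linker alone, the device link step is skipped, which is one common way to end up with undefined-reference errors for device functions.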

For those interested, there is a very good thread on separate compilation in the NVIDIA forum: "How to create a static lib for device functions using cuda 5.0?"
