I am using CUDA for my current project and need to maintain a CPU and a GPU kernel with a single implementation. I can tag a function with

__device__ __host__

but that doesn't let me split the code where I need device-only features. So I have come up with the following solution:

template <bool IsOnDevice>
#if IsOnDevice
    __device__
#else
    __host__
#endif
...the rest of the function header

Now, I would like to wrap this part in a #define, such as

//Macro:
#define DEVICE_FUNCTION \
template <bool IsOnDevice> \
#if IsOnDevice \
        __device__ \
#else \
        __host__ \
#endif 

//Example function:
DEVICE_FUNCTION
    ...the rest of the function header

However, this doesn't compile, since preprocessor directives cannot appear inside a macro definition. I also tried

#define DEVICE_FUNCTION_true __device__
#define DEVICE_FUNCTION_false __host__
#define DEVICE_FUNCTION_RESOLVER(flag) DEVICE_FUNCTION_##flag

#define DEVICE_FUNCTION \
template <bool IsOnDevice> \
DEVICE_FUNCTION_RESOLVER(IsOnDevice)

with no luck: the token gets pasted as DEVICE_FUNCTION_IsOnDevice, because the preprocessor runs before compilation and never sees template arguments, even though IsOnDevice is known at compile time. Is there any way to encapsulate code containing #ifs in a macro (or anything else, really)?

Brian Tompsett - 汤莱恩

1 Answer

You could use the __CUDA_ARCH__ predefined macro to distinguish whether the code is being compiled for the device or for the host. On the host side, the macro is not defined.

Here is an example:

__device__ __host__ void foo()
{
#ifdef __CUDA_ARCH__
    __syncthreads();
#else
    // do something else on host side
#endif
}
Grzegorz Szpetkowski