
Does LLVM's NVPTX backend (contributed by NVIDIA) support the Dynamic Parallelism feature introduced with CUDA 5 on Compute Capability 3.5 devices?

n00b101

1 Answer


I found some information in the CUDA Dynamic Parallelism Programming Guide, in the section titled "Device-Side Launch from PTX." It seems that a function called cudaLaunchDevice() is accessible from PTX: the user just has to declare this function in PTX and then call it. The declaration given there is:

// When .address_size is 64
.extern .func(.param .b32 func_retval0) cudaLaunchDevice
(
    .param .b64 func,
    .param .b64 parameterBuffer,
    .param .align 4 .b8 gridDimension[12],
    .param .align 4 .b8 blockDimension[12],
    .param .b32 sharedMemSize,
    .param .b64 stream
)
;

So I suppose that the answer is to declare this function in LLVM IR with a matching signature and then call it. I have not tested this solution.
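For comparison, here is a rough sketch of the same launch path written in CUDA C++ rather than PTX or LLVM IR. It relies on the device runtime functions cudaGetParameterBuffer() and cudaLaunchDevice() as I understand them from the same guide. The kernels `child` and `parent`, the `ChildArgs` struct, and the `run_demo()` helper are hypothetical names made up for this illustration, and I am assuming the parameter buffer can be filled through a struct whose members sit at their natural alignment. Like the PTX approach, treat it as an untested outline; it would need relocatable device code (`-rdc=true`) and linking against `cudadevrt`, and details may differ between toolkit versions.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical child kernel: records its argument so the host can check that it ran.
__global__ void child(int *out, int value) {
    *out = value;
}

// Hypothetical mirror of child()'s parameter list, each member at its natural
// alignment (assumption: this matches the layout the parameter buffer expects).
struct ChildArgs {
    int *out;
    int value;
};

// Parent kernel: launches child() through the device runtime calls instead of
// the <<<...>>> syntax.
__global__ void parent(int *out) {
    void *buf = cudaGetParameterBuffer(alignof(ChildArgs), sizeof(ChildArgs));
    if (buf == nullptr) return;  // no parameter buffer available, skip the launch

    ChildArgs *args = static_cast<ChildArgs *>(buf);
    args->out = out;
    args->value = 42;

    // One block of one thread, no dynamic shared memory, NULL stream.
    cudaLaunchDevice((void *)child, buf, dim3(1, 1, 1), dim3(1, 1, 1), 0, 0);
}

// Host helper: runs parent() and returns the value the child wrote (-1 on
// allocation failure).
int run_demo() {
    int *d_out = nullptr;
    int h_out = 0;
    if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) return -1;
    cudaMemset(d_out, 0, sizeof(int));

    parent<<<1, 1>>>(d_out);
    cudaDeviceSynchronize();  // the parent grid completes only after its child

    cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return h_out;
}

int main() {
    printf("value written by child: %d\n", run_demo());
    return 0;
}
```

The six arguments passed to cudaLaunchDevice() here line up one-to-one with the six `.param` entries in the PTX declaration, so an LLVM IR declaration would presumably need the same shape: two pointers, two 12-byte structs with 4-byte alignment, a 32-bit integer, and a stream pointer.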

n00b101