Does LLVM's NVPTX backend (contributed by NVIDIA) support the Dynamic Parallelism feature introduced with CUDA 5 for Compute Capability 3.5 devices?
1 Answer
I found some information in the CUDA Dynamic Parallelism Programming Guide, in the section titled "Device-Side Launch from PTX." It seems that a function called cudaLaunchDevice() is accessible from PTX; the user only has to declare it in PTX and then call it:
// When .address_size is 64
.extern .func(.param .b32 func_retval0) cudaLaunchDevice
(
    .param .b64 func,
    .param .b64 parameterBuffer,
    .param .align 4 .b8 gridDimension[12],
    .param .align 4 .b8 blockDimension[12],
    .param .b32 sharedMemSize,
    .param .b64 stream
)
;
So I suppose that the answer is to declare this function in LLVM IR and then simply call it. I have not tested this solution.
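As an equally untested sketch, the PTX prototype above might be mirrored in LLVM IR roughly as follows. This assumes a recent LLVM with opaque pointers (`ptr`) and assumes that the NVPTX backend lowers a 12-byte `byval` struct with 4-byte alignment to a `.param .align 4 .b8 name[12]` parameter, which is what the PTX declaration expects. The names `%struct.dim3` and `@launch_child` are hypothetical and exist only for illustration.

```llvm
; Untested sketch: an LLVM IR declaration intended to match the PTX
; prototype of cudaLaunchDevice for a 64-bit address size.
target triple = "nvptx64-nvidia-cuda"

; Hypothetical stand-in for CUDA's dim3: three 32-bit fields, 12 bytes total.
%struct.dim3 = type { i32, i32, i32 }

; func, parameterBuffer, gridDimension, blockDimension, sharedMemSize, stream
declare i32 @cudaLaunchDevice(ptr, ptr,
                              ptr byval(%struct.dim3) align 4,
                              ptr byval(%struct.dim3) align 4,
                              i32, ptr)

; Hypothetical wrapper: launches %child with an already-filled parameter
; buffer, no dynamic shared memory, and a null stream handle.
define i32 @launch_child(ptr %child, ptr %params, ptr %grid, ptr %block) {
entry:
  %err = call i32 @cudaLaunchDevice(ptr %child, ptr %params,
                                    ptr byval(%struct.dim3) align 4 %grid,
                                    ptr byval(%struct.dim3) align 4 %block,
                                    i32 0, ptr null)
  ret i32 %err
}
```

If this works at all, the generated PTX would presumably still have to target sm_35 and be linked against the CUDA device runtime library so that the cudaLaunchDevice symbol resolves.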

n00b101