Dearth of CUDA 5 Dynamic Parallelism Examples

Question

I've been googling around and have only been able to find a trivial example of the new dynamic parallelism in Compute Capability 3.0 in one of their Tech Briefs linked from here. I'm aware that the HPC-specific cards probably won't be available until this time next year (after the nat'l labs get theirs). And yes, I realize that the simple example they gave is enough to get you going, but the more the merrier.

Are there other examples I've missed?

To save you the trouble, here is the entire example given in the tech brief:

__global__ void ChildKernel(void* data){
    //Operate on data
}
__global__ void ParentKernel(void *data){
    ChildKernel<<<16, 1>>>(data);
}
// In Host Code
ParentKernel<<<256, 64>>>(data);

// Recursion is also supported
__global__ void RecursiveKernel(void* data){
    // continueRecursion is a placeholder for a termination condition
    if(continueRecursion == true)
        RecursiveKernel<<<64, 16>>>(data);
}

EDIT: The GTC talk New Features In the CUDA Programming Model focused mostly on the new Dynamic Parallelism in CUDA 5. The link has the video and slides. Still only toy examples, but a lot more detail than the tech brief above.

What exactly is your question? Maybe I'm missing something here. What is it you want us to answer? — Bart, Jun 01 '12 at 16:48
Is "Are there other examples I've missed?" not an appropriate question? If you don't think this is the appropriate forum for my question then just downvote it. — maxywb, Jun 01 '12 at 16:54
At the GTC, one of the demos they showed for Kepler was a simulation of galaxies colliding. I think they were using dynamic parallelism there. Maybe the code for that is available somewhere. — Roger Dahl, Jun 01 '12 at 17:53

score 6 · Accepted Answer · answered Nov 02 '12 at 04:05

Here is what you need, the Dynamic parallelism programming guide. Full of details and examples: http://docs.nvidia.com/cuda/pdf/CUDA_Dynamic_Parallelism_Programming_Guide.pdf
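For readers who want something self-contained before opening the guide, here is a minimal sketch of a parent kernel launching a child kernel. It is not taken from the guide: the names `childKernel`, `parentKernel`, and `blocksFor` are made up for illustration, and it assumes a GPU with compute capability 3.5 or higher and a build with relocatable device code, for example `nvcc -arch=sm_35 -rdc=true example.cu -lcudadevrt` (or any later architecture your toolkit supports).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: number of blocks needed to cover n elements.
__host__ __device__ int blocksFor(int n, int threads) {
    return (n + threads - 1) / threads;
}

// Child grid: each thread doubles one element of a global-memory array.
__global__ void childKernel(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2;
}

// Parent grid: a single thread launches the child grid from the device.
__global__ void parentKernel(int *data, int n) {
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        const int threads = 64;
        childKernel<<<blocksFor(n, threads), threads>>>(data, n);
    }
}

int main() {
    const int n = 256;
    int host[n];
    for (int i = 0; i < n; ++i) host[i] = i;

    int *dev = nullptr;
    cudaMalloc(&dev, n * sizeof(int));
    cudaMemcpy(dev, host, n * sizeof(int), cudaMemcpyHostToDevice);

    parentKernel<<<1, 32>>>(dev, n);

    // The parent grid is not complete until its child grids finish,
    // so this host-side sync waits for both.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        cudaFree(dev);
        return 1;
    }

    cudaMemcpy(host, dev, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    // Verify that every element was doubled by the child grid.
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (host[i] != 2 * i) ++mismatches;
    }
    printf("mismatches: %d\n", mismatches);
    return mismatches == 0 ? 0 : 1;
}
```

The sketch passes only a global-memory pointer to the child, since a child grid cannot be given pointers into the parent's shared or local memory.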

W.Sun


score 5 · Answer 2 · answered Dec 02 '12 at 12:11

Just to confirm: dynamic parallelism is only supported on GPUs with compute capability 3.5 and upwards.

I have a 3.0 GPU with CUDA 5.0 installed. I compiled the Dynamic Parallelism examples with

    nvcc -arch=sm_30 test.cu

and received the following compile error:

    test.cu(10): error: calling a global function("child_launch") from a global function("parent_launch") is only allowed on the compute_35 architecture or above.

GPU info

    Device 0: "GeForce GT 640"
      CUDA Driver Version / Runtime Version: 5.0 / 5.0
      CUDA Capability Major/Minor version number: 3.0
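The same check can be made at run time before attempting a device-side launch. The following is a rough sketch, assuming only the standard runtime calls `cudaGetDeviceCount` and `cudaGetDeviceProperties`; the helper `supportsDynamicParallelism` is a hypothetical name introduced here, and it simply encodes the 3.5 threshold from the compiler error above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: true when the compute capability is 3.5 or higher,
// the minimum the compiler reports for kernel launches from kernels.
bool supportsDynamicParallelism(int major, int minor) {
    return major > 3 || (major == 3 && minor >= 5);
}

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        printf("No CUDA device found\n");
        return 1;
    }

    // Report the capability of every visible device.
    for (int d = 0; d < count; ++d) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, d) != cudaSuccess) continue;
        printf("Device %d: \"%s\", compute capability %d.%d, dynamic parallelism %s\n",
               d, prop.name, prop.major, prop.minor,
               supportsDynamicParallelism(prop.major, prop.minor)
                   ? "supported" : "not supported");
    }
    return 0;
}
```

This host-only program builds with a plain `nvcc check.cu` (the file name is arbitrary), so it also runs on cards like the GT 640 that cannot compile the dynamic parallelism examples.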

hope this helps

score 4 · Answer 3 · answered Jun 03 '12 at 23:36

I edited the question title to "...CUDA 5...", since Dynamic Parallelism is new in CUDA 5, not CUDA 4. We don't have any public examples available yet, because we don't have public hardware available that can run them. CUDA 5.0 will support dynamic parallelism but only on Compute Capability 3.5 and later (GK110, for example). These will be available later in the year.

We will release some examples with a CUDA 5 release candidate closer to the time the hardware is available.

harrism


Should we think of a kernel launch inside the parent kernel as an entirely new kernel launch (as if it were called from the host)? Namely, there are issues related to shared memory, preventing divergence, etc. It would be interesting to read more about it, especially which threads may launch another kernel, what it costs, etc. Cool. – phoad Dec 18 '12 at 09:34

score 0 · Answer 4 · answered Jun 02 '12 at 14:04

I think compute capability 3.0 doesn't include dynamic parallelism. It will be included in the GK110 architecture (aka "Big Kepler"); I don't know what compute capability number it will be assigned (3.1, maybe?). Those cards won't be available until late this year (I'm waiting sooo much for those). As far as I know, 3.0 corresponds to the GK104 chips like the GTX690 or the GT640M for laptops.

Vicentito


There is no 3.1 compute capability! 3.5 is a real capability of GK110. – geek Jun 03 '12 at 07:40

score 0 · Answer 5 · answered Oct 10 '12 at 23:32

Just wanted to check in with you all given that the CUDA 5 RC was released recently. I looked in the SDK examples and wasn't able to find any dynamic parallelism there. Someone correct me if I'm wrong. I searched for kernel launches within kernels by grepping for "<<<" and found nothing.

imagineerThat


I've still only been able to find the tech brief. From my understanding of the release schedule, Nat'l labs will get first crack at Big Kepler cards and the rest of us will get them sometime early next year. Although that still doesn't explain the lack of examples. – maxywb Oct 15 '12 at 15:24
