61

Does CUDA support recursion?

Michael Petrotta
JuanPablo

12 Answers

52

It does, on NVIDIA hardware with compute capability 2.0 and with CUDA 3.1 or later:

New language features added to CUDA C / C++ include:

Support for function pointers and recursion make it easier to port many existing algorithms to Fermi GPUs

http://developer.nvidia.com/object/cuda_3_1_downloads.html

Function pointers: http://developer.download.nvidia.com/compute/cuda/sdk/website/CUDA_Advanced_Topics.html#FunctionPointers

Recursion: I can't find a code sample on NVIDIA's website, but someone posted this on the forum:

__device__ int fact(int f)
{
  if (f == 0)
    return 1;
  else
    return f * fact(f - 1);
}
elmattic
  • Having "recent" hardware is not enough. Not all recent cards are Fermi (aka compute capability 2.0). At present, there are no Fermi mobile GPUs. – Mark Borgerding Sep 07 '10 at 14:38
  • You're right, I updated my answer. But what about the GeForce GTX 480M? There's a Fermi chip inside. – elmattic Sep 07 '10 at 15:06
  • There are laptops with Fermi; Geforce 480M and Quadro FX5000M have been out for a little while. – Tom Sep 09 '10 at 08:53
  • Indeed. And new GeForce GT 415M, 420M, 425M, 435M, 445M and GTX 460M, 470M are coming! All have compute capability 2.0. – elmattic Sep 09 '10 at 09:07
  • I don't think these are real recursive calls, since nvcc inlines every function marked __device__ by default. However the result is the same. – jopasserat Apr 11 '11 at 15:36
  • I wouldn't recommend using recursion in CUDA: the stack for each CUDA thread is very small, and recursion enlarges the stack that every thread needs. – TripleS May 30 '12 at 15:05
  • @jHackTheRipper: How could it inline the code if f is not known at compile time? (It may unroll it, but it couldn't inline it.) – Maciej Piechotka Sep 07 '12 at 00:07
13

Yes, see the NVIDIA CUDA Programming Guide:

__device__ functions only support recursion in device code compiled for devices of compute capability 2.0.

You need a Fermi card to use them.

Dr. Snoopy
9

Even though it only supports recursion for specific chips, you can sometimes get away with "emulated" recursion: see how I used compile-time recursion for my CUDA raytracer.
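A minimal sketch of what compile-time recursion can look like, assuming the recursion depth is known when the code is compiled. This is not the raytracer's code: `power` and the `CUDA_HD` macro are hypothetical names chosen for illustration. Each template instantiation calls the next lower one, so the compiler sees a chain of distinct functions rather than one function calling itself.

```cpp
// CUDA_HD is a hypothetical helper macro: under nvcc it marks a function as
// callable from host and device; in a plain C++ build it expands to nothing.
#ifdef __CUDACC__
#define CUDA_HD __host__ __device__
#else
#define CUDA_HD
#endif

// General case: power<N> calls power<N - 1>, a different instantiation.
template <int N>
CUDA_HD int power(int base) {
    return base * power<N - 1>(base);
}

// Base case: the explicit specialization ends the chain at N == 0.
template <>
CUDA_HD inline int power<0>(int /*base*/) {
    return 1;
}
```

Calling `power<3>(2)` expands to `2 * power<2>(2)` and so on down to `power<0>`, so no runtime call stack for recursion is required. The trade-off is that the depth has to be a compile-time constant.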

ttsiodras
7

As of the CUDA 4.1 release, recursion is supported only for __device__ functions, not for __global__ functions.
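The split can be sketched as follows, assuming a device of compute capability 2.0 or higher: the kernel is a non-recursive entry point, and each thread delegates to a recursive helper. The names `fact`, `fact_kernel`, and the `CUDA_HD` macro are illustrative assumptions; the macro only exists so the same file also builds as plain C++.

```cpp
// CUDA_HD is a hypothetical helper macro: host + device under nvcc,
// empty in a plain C++ build.
#ifdef __CUDACC__
#define CUDA_HD __host__ __device__
#else
#define CUDA_HD
#endif

// Recursive helper. The recursion lives here, not in the kernel.
CUDA_HD int fact(int f) {
    if (f <= 0)
        return 1;  // also stops on negative input instead of recursing forever
    return f * fact(f - 1);
}

#ifdef __CUDACC__
// The kernel never calls itself; it only calls the recursive helper
// once per thread.
__global__ void fact_kernel(const int* in, int* out, int count) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count)
        out[i] = fact(in[i]);
}
#endif
```

On the host, `fact(5)` returns 120; on the device, each thread computes the same function for its own element of `in`.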

talonmies
username_4567
5

Only on devices of compute capability 2.0 and above.

Arturo Garcia
3

Sure it does, but it requires the Kepler architecture to do so. Check out their latest example on the classic quick sort.

http://blogs.nvidia.com/2012/09/how-tesla-k20-speeds-up-quicksort-a-familiar-comp-sci-code/

As far as I know, only the latest Kepler GK110 supports dynamic parallelism, which allows this kind of recursive call and the spawning of new threads from within a kernel. Before Kepler GK110, this was not possible. Note that not all Kepler chips support it; only GK110 does.

If you need recursion, you probably need the Tesla K20. I'm not sure whether Fermi supports it; I've never read that it does. :\ But Kepler sure does. =)

Hong Zhou
3

Any recursive algorithm can be implemented with a stack and a loop. It's way more of a pain, but if you really need recursion, this can work.
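As a rough illustration of the stack-and-loop approach, here is a depth-first sum over a binary tree that would normally be written recursively. The `Node` layout (children stored as array indices), the `tree_sum` name, the `CUDA_HD` macro, and the bound of 64 pending entries are all assumptions made for this sketch.

```cpp
// CUDA_HD is a hypothetical helper macro: host + device under nvcc,
// empty in a plain C++ build.
#ifdef __CUDACC__
#define CUDA_HD __host__ __device__
#else
#define CUDA_HD
#endif

// Tree stored in a flat array; children are indices, -1 means "no child".
struct Node {
    int value;
    int left;
    int right;
};

// Sums all values reachable from root without any recursive call.
// Returns false if the fixed-size explicit stack would overflow.
CUDA_HD bool tree_sum(const Node* nodes, int root, int* out) {
    const int kMaxPending = 64;  // assumed bound on pending nodes
    int pending[kMaxPending];    // explicit stack replaces the call stack
    int top = 0;
    int sum = 0;

    if (root >= 0)
        pending[top++] = root;

    while (top > 0) {
        const Node& n = nodes[pending[--top]];  // "return" to a saved node
        sum += n.value;
        if (n.left >= 0) {
            if (top == kMaxPending) return false;
            pending[top++] = n.left;            // "call" on the left child
        }
        if (n.right >= 0) {
            if (top == kMaxPending) return false;
            pending[top++] = n.right;           // "call" on the right child
        }
    }
    *out = sum;
    return true;
}
```

The fixed-size array makes the memory cost per thread explicit, which is the main thing lost when the hardware call stack is used instead.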

dicroce
2

CUDA 3.1 supports recursion

Jan C
1

If your algorithm involves a lot of recursion, then whether it is supported or not, it is not designed for GPUs. Either redesign your algorithm or get a better CPU; either way it will be better (I bet in many cases orders of magnitude better) than doing recursion on GPUs.

user0002128
0

Yeah, it is supported in the current version. But even though recursive functions can be executed, keep in mind that the stack memory they need cannot be predicted in advance (the recursive function must be executed to know the true depth of the recursion), so the stack may turn out to be too small for your purposes and you may need to increase the default stack size manually.
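A hedged sketch of raising the stack size by hand. It assumes you can bound the recursion depth yourself and can estimate the bytes used per call; `stack_bytes_needed` and `ensure_stack` are hypothetical helpers, and the per-frame size is your own estimate, not a value CUDA reports here. The runtime calls `cudaDeviceGetLimit` and `cudaDeviceSetLimit` with `cudaLimitStackSize` are only compiled when building with nvcc.

```cpp
#include <cstddef>
#ifdef __CUDACC__
#include <cuda_runtime.h>
#endif

// Hypothetical helper: worst-case per-thread stack use for a recursion of
// at most max_depth calls, each assumed to use frame_bytes of stack.
inline std::size_t stack_bytes_needed(std::size_t max_depth,
                                      std::size_t frame_bytes) {
    return max_depth * frame_bytes;
}

#ifdef __CUDACC__
// Raises the per-thread stack limit only if the current limit is smaller
// than the requested size. Call this before launching the kernel.
inline cudaError_t ensure_stack(std::size_t bytes) {
    std::size_t current = 0;
    cudaError_t err = cudaDeviceGetLimit(&current, cudaLimitStackSize);
    if (err != cudaSuccess)
        return err;
    if (current >= bytes)
        return cudaSuccess;
    return cudaDeviceSetLimit(cudaLimitStackSize, bytes);
}
#endif
```

For example, a recursion bounded at 100 levels with an estimated 64 bytes per call would ask for `stack_bytes_needed(100, 64)`, that is 6400 bytes per thread. The limit applies to every thread, so a generous value multiplies across the whole launch.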

0

Yes, it does support recursion. However, it is not a good idea to do recursion on a GPU, because every thread is going to do it.

  • A citation (docs, etc.) would make this answer more complete. For reference, it looks like this was added [in CUDA 3.1](http://developer.nvidia.com/object/cuda_3_1_downloads.html): "New language features added to CUDA C / C++ include: Support for function pointers and recursion make it easier to port many existing algorithms to Fermi GPUs" – s3cur3 Dec 18 '18 at 19:38
  • That is right. This feature was added to CUDA C in toolkit 3.1. The latest version of the CUDA programming guide implicitly indicates that recursive device functions are supported. However, `__global__` functions do not support recursion. Please refer to F.3.9.6. Function Recursion in the programming guide. – palebluedot Dec 19 '18 at 05:28
-2

I just tried this on my PC with an NVIDIA GPU of compute capability 1.1. It says recursion is not yet supported, so this depends on the hardware itself rather than on the runtime.

sp497