7

In C++ AMP, kernel functions or lambdas are marked with restrict(amp), which imposes severe restrictions on the allowed subset of C++ (listed here). Does CUDA allow any more freedom on the subset of C or C++ in kernel functions?

Kate Gregory
  • 18,808
  • 8
  • 56
  • 85
Eugene
  • 6,194
  • 1
  • 20
  • 31
  • 1
    May be related to the following question. http://stackoverflow.com/questions/4899425/what-are-the-real-c-language-constructs-supported-by-cuda-device-code – Pavan Yalamanchili Mar 12 '12 at 20:11
  • Good question, though I'm afraid it's not really comparable (perhaps migrate to programmers.SE?): nvcc doesn't support C++11 at all yet, so when talking about lambdas you obviously don't get very far there! On the other hand, AMP has completely different restrictions, starting with that it's microsoft; that (or, perhaps more correctly, the current lack of a non-DirectX-implementation) makes it completely unusable for many e.g. scientific applications. But I suppose you mean only _language_ restrictions? – leftaroundabout Mar 12 '12 at 20:37
  • @leftaroundabout: Yes, I am only talking about _language_ restrictions, and I am fine to stay within C++03. I have mentioned lambdas only because it is the prescribed mechanism for launching kernel code with C++ AMP. – Eugene Mar 12 '12 at 20:59

1 Answers1

18

As of Visual Studio 11 and CUDA 4.1, restrict(amp) functions are more restrictive than CUDA's analogous __device__ functions. Most noticeably, AMP is more restrictive about how pointers can be used. This is a natural consequence of AMP's DirectX11 computational substrate, which disallows pointers in HLSL (graphics shader) code. By constrast, CUDA's lower-level IR is PTX, which is more general purpose than HLSL.

Here's a line by line comparison:

| VS 11 AMP restrict(amp) functions     | CUDA 4.1 sm_2x __device__ functions  |
|------------------------------------------------------------------------------|
|* can only call functions that have    |* can only call functions that have   |
|  the restrict(amp) clause             |  the __device__ decoration           |
|* The function must be inlinable       |* need not be inlined                 |
|* The function can declare only        |* Class types are allowed             |
|  POD variables                        |                                      |
|* Lambda functions cannot              |* Lambdas are not supported, but      |
|  capture by reference and             |  user functors can hold pointers     |
|  cannot capture pointers              |                                      |
|* References and single-indirection    |* References and multiple-indirection |
|  pointers are supported only as       |  pointers are supported              |
|  local variables and function         |                                      |
|* No recursion                         |* Recursion OK                        |
|* No volatile variables                |* Volatile variables OK               |
|* No virtual functions                 |* Virtual functions OK                |
|* No pointers to functions             |* Pointers to functions OK            |
|* No pointers to member functions      |* Pointers to member functions OK     |
|* No pointers in structures            |* Pointers in structures OK           |
|* No pointers to pointers              |* Pointers to pointers OK             |
|* No goto statements                   |* goto statements OK                  |
|* No labeled statements                |* Labeled statements OK               |
|* No try, catch, or throw statements   |* No try, catch, or throw statements  |
|* No global variables                  |* Global __device__ variables OK      |
|* Static variables through tile_static |* Static variables through __shared__ |
|* No dynamic_cast                      |* No dynamic_cast                     |
|* No typeid operator                   |* No typeid operator                  |
|* No asm declarations                  |* asm declarations (inline PTX) OK    |
|* No varargs                           |* No varargs                          |

You can read more about restrict(amp)'s restrictions here. You can read about C++ support in CUDA __device__ functions in Appendix D of the CUDA C Programming Guide.

Jared Hoberock
  • 11,118
  • 3
  • 40
  • 76
  • IIRC there's a discussion here about features that C++ AMP could have enabled but didn't, sometimes based on explicit choices to encourage good practices in parallel computing: http://channel9.msdn.com/Shows/Going+Deep/C-AMP-The-Development-Team-Technical-Roundtable – lightw8 Aug 17 '12 at 22:04