In C++ AMP, kernel functions or lambdas are marked with restrict(amp), which imposes severe restrictions on the allowed subset of C++ (listed here). Does CUDA allow any more freedom on the subset of C or C++ in kernel functions?
Asked
Active
Viewed 954 times
7
-
1May be related to the following question. http://stackoverflow.com/questions/4899425/what-are-the-real-c-language-constructs-supported-by-cuda-device-code – Pavan Yalamanchili Mar 12 '12 at 20:11
-
Good question, though I'm afraid it's not really comparable (perhaps migrate to programmers.SE?): nvcc doesn't support C++11 at all yet, so when talking about lambdas you obviously don't get very far there! On the other hand, AMP has completely different restrictions, starting with that it's microsoft; that (or, perhaps more correctly, the current lack of a non-DirectX-implementation) makes it completely unusable for many e.g. scientific applications. But I suppose you mean only _language_ restrictions? – leftaroundabout Mar 12 '12 at 20:37
-
@leftaroundabout: Yes, I am only talking about _language_ restrictions, and I am fine to stay within C++03. I have mentioned lambdas only because it is the prescribed mechanism for launching kernel code with C++ AMP. – Eugene Mar 12 '12 at 20:59
1 Answers
18
As of Visual Studio 11 and CUDA 4.1, restrict(amp)
functions are more restrictive than CUDA's analogous __device__
functions. Most noticeably, AMP is more restrictive about how pointers can be used. This is a natural consequence of AMP's DirectX11 computational substrate, which disallows pointers in HLSL (graphics shader) code. By constrast, CUDA's lower-level IR is PTX, which is more general purpose than HLSL.
Here's a line by line comparison:
| VS 11 AMP restrict(amp) functions | CUDA 4.1 sm_2x __device__ functions |
|------------------------------------------------------------------------------|
|* can only call functions that have |* can only call functions that have |
| the restrict(amp) clause | the __device__ decoration |
|* The function must be inlinable |* need not be inlined |
|* The function can declare only |* Class types are allowed |
| POD variables | |
|* Lambda functions cannot |* Lambdas are not supported, but |
| capture by reference and | user functors can hold pointers |
| cannot capture pointers | |
|* References and single-indirection |* References and multiple-indirection |
| pointers are supported only as | pointers are supported |
| local variables and function | |
|* No recursion |* Recursion OK |
|* No volatile variables |* Volatile variables OK |
|* No virtual functions |* Virtual functions OK |
|* No pointers to functions |* Pointers to functions OK |
|* No pointers to member functions |* Pointers to member functions OK |
|* No pointers in structures |* Pointers in structures OK |
|* No pointers to pointers |* Pointers to pointers OK |
|* No goto statements |* goto statements OK |
|* No labeled statements |* Labeled statements OK |
|* No try, catch, or throw statements |* No try, catch, or throw statements |
|* No global variables |* Global __device__ variables OK |
|* Static variables through tile_static |* Static variables through __shared__ |
|* No dynamic_cast |* No dynamic_cast |
|* No typeid operator |* No typeid operator |
|* No asm declarations |* asm declarations (inline PTX) OK |
|* No varargs |* No varargs |
You can read more about restrict(amp)
's restrictions here. You can read about C++ support in CUDA __device__
functions in Appendix D of the CUDA C Programming Guide.

Jared Hoberock
- 11,118
- 3
- 40
- 76
-
IIRC there's a discussion here about features that C++ AMP could have enabled but didn't, sometimes based on explicit choices to encourage good practices in parallel computing: http://channel9.msdn.com/Shows/Going+Deep/C-AMP-The-Development-Team-Technical-Roundtable – lightw8 Aug 17 '12 at 22:04