Using ellipsis in cuda device function

Question

I am trying to port some C code to a cuda kernel. The code I am porting uses ellipsis prevalently. When I try to use an ellipsis in a device function like below, I get an error saying that ellipsis are not allowed in device functions.

__device__ int add(int a, ...){}

However, cuda supports using printf in both host and device functions and uses ellipsis in their own code as below in common_functions.h.

extern "C"
{
extern _CRTIMP __host__ __device__ __device_builtin__ __cudart_builtin__ int     __cdecl printf(const char*, ...);
extern _CRTIMP __host__ __device__ __device_builtin__ __cudart_builtin__ int     __cdecl fprintf(FILE*, const char*, ...);
extern _CRTIMP __host__ __device__ __cudart_builtin__ void*   __cdecl malloc(size_t) __THROW;
extern _CRTIMP __host__ __device__ __cudart_builtin__ void    __cdecl free(void*) __THROW;

}

Is there a way to use ellipsis in a device function?

I would not like to hard code a max number of parameters and then change all the calls.
I also would not like to code a custom variadic function method.

I also tried creating a PTX file that I could use to replace the ellipsis usage as the ISA PTX documentation appears to have facilities for handling variable parameters (Note that the documentation says it does not support them and then provides a paragraph with supporting functions and examples. Perhaps, there is a typo?). I got a simple PTX file all the way through the process defined below but got stuck on the executable question in the last comment. I plan to read the nvcc compiler document to try and understand that.

How can I call a ptx function from CUDA C?

I am using a GTX660 which I believe is level 3.0 and cuda 5.0 toolkit on Ubuntu 12.04.

Update regarding the "magic" referred to below:

It looks to me like there must be something special happening in the compiler to restrict ellipsis usage and do something special. When I call printf as below:

printf("The result = %i from adding %i numbers.", result, 2);

I was surprised to find this in the ptx:

.extern .func  (.param .b32 func_retval0) vprintf
(
.param .b64 vprintf_param_0,
.param .b64 vprintf_param_1
)

and later

    add.u64     %rd2, %SP, 0;
st.u32  [%SP+0], %r5;
mov.u32     %r6, 2;
st.u32  [%SP+4], %r6;
// Callseq Start 1
{
.reg .b32 temp_param_reg;
.param .b64 param0;
st.param.b64    [param0+0], %rd1;
.param .b64 param1;
st.param.b64    [param1+0], %rd2;
.param .b32 retval0;
call.uni (retval0), 
vprintf, 
(
param0, 
param1
);

It appears to me that the compiler accepts ellipsis for printf but then swaps a call to vprintf and creates a va_list on the fly manually. va_list is a valid type in device functions.

`printf` is magic; `__device__` functions cannot have ellipsis otherwise. You should just overload `add` to have many parameters, or give all of `add`'s parameters default values of `0`. — Jared Hoberock, Feb 25 '14 at 22:57

score 1 · Accepted Answer · answered Feb 26 '14 at 15:06

1

As @JaredHoberock stated (I think he will not mind if I create an answer):

__device__ functions cannot have ellipsis parameters; that is why you are receiving the compiler error message.

The built-in printf function is a special case, and does not indicate general support for ellipsis.

There are some alternatives that could be mentioned, but none that I am aware of allow truly general variable arguments support. For example, as Jared stated you could simply define a number of parameters, some/most of which have default values specified, so they do not need to be passed explicitly.

You could also play games with templating as is done in the cuPrintf sample code to try and simulate variable arguments, but this will also not be arbitrarily extensible, I don't think.

answered Feb 26 '14 at 15:06

Robert Crovella

143,785
11
213
257

It looks like a dead end unless I can figure out a way to substitute PTX in there that supports variadic functions as written in the documentation. Please add a comment to the other question above if you figured out how to create an executable from the output that was described. – jim Feb 26 '14 at 19:11
For me, anyway, the difficulty in the other linked question is importing raw PTX using the cuda runtime API. It's relatively straightforward to use PTX with the driver API, and driver API and runtime API functionality can be mixed in a single program. For example, see the [cuda ptxjit sample](http://docs.nvidia.com/cuda/cuda-samples/index.html#ptx-just-in-time-compilation). I'm not saying I know how to make ellipsis work with PTX, but you seem to have already gone down that path. – Robert Crovella Feb 26 '14 at 19:18
A double answer. Thanks. – jim Feb 26 '14 at 19:36
I was able to test the PTX directly. The built in variadic functions referred to in the documentation appear not to exist. I got the following error in ptxas: ptxas fatal : Unresolved extern function '%va_start' – jim Apr 03 '14 at 14:57

Using ellipsis in cuda device function

1 Answers1