How big is a cudaStream_t?

Question

I have inherited some code that basically does stuff like this:

void *stream;
cudaStreamCreate((cudaStream_t *)&stream);

Looking at targets/x86_64-linux/driver_types.h for CUDA 8, I see:

typedef __device_builtin__ struct CUStream_st *cudaStream_t;

As far as I understand it, the cast will work, but I worry about how future-proof this may be, and also if it is safe when the code is ported to ARM. How dangerous is the above code? Does the __device_builtin__ affect anything?

(Note: I plan to talk to the developer directly and tell them to use cudaStream_t throughout and #include <cuda_runtime.h>, so I am hoping to clarify the technical issues here.)

I don't see a problem with that code. It's kind of the same as `void * mem = malloc(5*sizeof(int));`. I would say it is actually rather safe because dereferencing a pointer to `void` will trigger a warning. The problem I see is verbosity because you now have to litter your code with casts to `cudaStream_t*`. — Henri Menke, Mar 05 '18 at 03:26
The cast is guaranteed to work by the standard: http://eel.is/c++draft/basic.compound#5 — Henri Menke, Mar 05 '18 at 03:30
@HenriMenke Does the `__device_builtin__` attribute affect things, though? — Ken Y-N, Mar 05 '18 at 03:31
That is just an attribute which influences code generation for `ptxas` (just like `__host__` and `__device__` for functions). — Henri Menke, Mar 05 '18 at 03:32
This of course depends on CUDA definining `cudaStream_t` to be a pointer type. If that should change, the code could break. It might be that CUDA provides a guarantee that `cudaStream_t` will always be a pointer type, but I couldn't put my finger on such a guarantee. I'm unsure why someone would choose this coding approach over `cudaStream_t stream;` — Robert Crovella, Mar 05 '18 at 04:28
@RobertCrovella "I'm unsure why someone would choose this coding approach" I often ask myself this with much of the code I get passed! In this case, I think the handles are cached in non-CUDA code, so perhaps they didn't want to pollute the header file, but then again there are already massive amounts of `#ifdef USE_CUDA`s. — Ken Y-N, Mar 05 '18 at 04:33

score 3 · Answer 1 · answered Mar 06 '18 at 15:46

How big is a cudaStream_t?

Like you've observed,

typedef __device_builtin__ struct CUStream_st *cudaStream_t;

So it's a pointer, and has the size of a pointer, i.e. 64 bits on typical architectures today but different sizes on other architectures. But do you really need to utilize that information? I would guess not.

As far as I understand it, the cast will work, but I worry about how future-proof this may be

Then make it:

cudaStream_t stream;
cudaStreamCreate(&stream);

or use the C++'ish API wrappers, e.g.:

auto device = cuda::device::current::get();
auto stream = device.create_stream(cuda::stream::sync);

where that's abstracted away, and stream_t is a wrapper, not a pointer, anyway (caveat: I'm the author of the wrapper library.)

What I'd worry about is not incompatibility, but rather avoiding invalid assumptions. And, indeed, you should not assume cudaStream_t is a pointer - just treat it as something opaque.

and also if it is safe when the code is ported to ARM. How dangerous is the above code?

It's dangerous, but not because of the porting, but like I said, because of the invalid assumption. It would be less dangerous with, say,

static_assert(sizeof(void*) == sizeof(cudaStream_t),
    "Unexpected size of cudaStream_t - not the same as void *");

but why are you insisting on the void *, really?

How big is a cudaStream_t?

1 Answers1