Questions tagged [cuda-driver]

A lower-level C-language API for managing computational work in the CUDA platform on NVIDIA GPU hardware.

This tag refers to the CUDA Driver API. It is a lower-level alternative to the much more common CUDA Runtime API. Both are part of the CUDA platform and offer different levels of abstraction when programming general-purpose GPU applications.

The Driver API resembles the OpenCL programming style in many respects. Unlike the Runtime API, it does not require the nvcc compiler: device code can be loaded at run time as PTX or cubin modules, and it can be compiled at run time by means of the NVRTC library.

Members of the CUDA Driver API are prefixed with cu, while members of the Runtime API are prefixed with cuda. For example: cudaGetErrorName (Runtime API) vs. cuGetErrorName (Driver API).
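As a rough illustration of the naming convention, here is a minimal sketch in plain C. The helper `classify_cuda_api` and the `cuda_api` enum are hypothetical names made up for this example; they are not part of CUDA. The check is a heuristic on the prefix only, so names from other NVIDIA libraries that also begin with "cu" (cuBLAS functions, for instance) would be misreported as Driver API members.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of CUDA: guess which API a function
   name belongs to, looking only at its prefix. */
enum cuda_api { API_UNKNOWN, API_DRIVER, API_RUNTIME };

enum cuda_api classify_cuda_api(const char *name)
{
    assert(name != NULL);

    /* Test "cuda" first: every Runtime API name also starts with "cu",
       so checking the shorter prefix first would mislabel it. */
    if (strncmp(name, "cuda", 4) == 0) {
        return API_RUNTIME;
    }
    if (strncmp(name, "cu", 2) == 0) {
        return API_DRIVER;
    }
    return API_UNKNOWN;
}
```

With this sketch, `classify_cuda_api("cuGetErrorName")` yields `API_DRIVER` and `classify_cuda_api("cudaGetErrorName")` yields `API_RUNTIME`. The order of the two prefix tests is the only design decision that matters here.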

See NVIDIA's documentation on the difference between the driver and runtime APIs.

Questions about the CUDA Driver API can be asked here on Stack Overflow, but if you have bugs to report, you should discuss them on the CUDA forums or report them via the registered developer portal. You may want to cross-link to any discussion here on SO.

46 questions
0
votes
1 answer

What should I set the flags field of CUDA_BATCH_MEM_OP_NODE_PARAMS?

The CUDA graph API exposes a function call for adding a "batch memory operations" node to a graph: CUresult cuGraphAddBatchMemOpNode ( CUgraphNode* phGraphNode, CUgraph hGraph, const CUgraphNode* dependencies, size_t numDependencies,…
einpoklum
  • 118,144
  • 57
  • 340
  • 684
0
votes
1 answer

What type should be pointed to for the result of cuDeviceGetGraphMemAttribute()?

cuDeviceGetGraphMemAttribute() takes a void pointer to a result variable. But - what type does it expect the pointed-to value to be? The documentation (for CUDA v12.0) doesn't say. I'm guessing it's an unsigned 64-bit type, but I want to make sure.
einpoklum
0
votes
1 answer

How can I tell whether a copy-node search failed, or whether my node or graph are invalid?

Consider the CUDA graphs API function cuGraphNodeFindInClone(). The documentation says that it: Returns: CUDA_SUCCESS, CUDA_ERROR_INVALID_VALUE. This seems problematic to me. How can I tell whether the search failed (e.g. because there is no copy of…
einpoklum
0
votes
1 answer

Does cuMemcpy "care" about the current context?

Suppose I have a GPU and driver version supporting unified addressing; two GPUs, G0 and G1; a buffer allocated in G1 device memory; and that the current context C0 is a context for G0. Under these circumstances, is it legitimate to cuMemcpy() from…
einpoklum
0
votes
2 answers

CUDA Virtual memory on Windows - what is the handle type?

From the CUDA driver API documentation: enum CUmemAllocationHandleType: flags for specifying particular handle types. Values: CU_MEM_HANDLE_TYPE_NONE = 0x0 (does not allow any export mechanism); CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR = 0x1 …
einpoklum
0
votes
2 answers

How do I check, programmatically, which targets are available in a cubin?

Suppose I have a cubin file, or perhaps to make it easier, a cubin file I loaded into memory (so that I have a void* to the data). Using the CUDA Driver API for modules, I can try loading the data into a module within the current context; and this…
einpoklum
0
votes
1 answer

What is cuEventRecord guaranteed to do if it gets the default-stream's handle?

Suppose I call cuEventRecord(my_event_handle, 0). cuEventRecord() requires the stream and the event to belong to the same context. Now, one can interpret the 0 as "the default stream in the appropriate context" - the requirements are satisfied and…
einpoklum
0
votes
1 answer

Must I keep a virtual address range reservation if it has active mappings?

CUDA's low-level virtual memory management mechanism involves: physical allocations; virtual address range reservations; and mappings between the above. Conveniently, if you map a physical allocation to some address range - you can "free" the physical…
einpoklum
0
votes
2 answers

Why is cuMemAddressReserve() failing with CUDA_ERROR_INVALID_VALUE?

Consider the following program (written in C syntax): #include #include #include int main() { CUresult result; unsigned int init_flags = 0; result = cuInit(init_flags); if (result != CUDA_SUCCESS) {…
einpoklum
0
votes
1 answer

How can I get the CUDA driver module handle for functions and globals in the compiled program?

The CUDA Runtime API has the functions cudaGetSymbolAddress() and cudaGetSymbolSize() for working with device-side globals from host-side code, using their names (source-code identifiers) as handles. In the Driver API, we have cuModuleGetGlobal(),…
einpoklum
0
votes
1 answer

How do the CUDA Runtime's current device and the driver context stack interact?

The CUDA Runtime has a notion of a "current device", while the CUDA Driver does not. Instead, the driver has a stack of contexts, where the "current context" is at the top of the stack. How do the two interact? That is, how do Driver API calls affect…
einpoklum
0
votes
1 answer

What are the types of these CUDA pointer attributes?

cuPointerGetAttribute() is passed a pointer to one of multiple types, filled according to the actual attribute requested. Some of those types are stated explicitly or may be deduced implicitly, but some - not so much. Specifically...…
einpoklum
0
votes
2 answers

Why are my 2D array copy parameters being rejected by the driver API?

I'm trying to use the CUDA Driver API to copy data into a 2D array, in the program listed below, but am getting an "invalid value" error when I pass my copy parameters. What value in them is wrong? #include #include #include…
einpoklum
0
votes
1 answer

Unexpected CUDA_ERROR_INVALID_VALUE from cuLaunchKernel()

I'm trying to launch a kernel using the CUDA driver API. Specifically I'm calling CUresult CUDAAPI cuLaunchKernel( CUfunction f, unsigned int gridDimX, unsigned int gridDimY, unsigned int gridDimZ, unsigned int blockDimX, unsigned int…
einpoklum
0
votes
1 answer

Do cuDevicePrimaryCtxReset() and cudaDeviceReset() do the same thing?

Reading the CUDA Runtime API and Driver API docs, it seems that the two functions: CUresult cuDevicePrimaryCtxReset ( CUdevice dev ); __host__ cudaError_t cudaDeviceReset ( void ); do the same thing (up to having to cudaSetDevice(dev) before the…
einpoklum