How can I mix the CUDA driver API with the CUDA runtime API?

Question

1. If a context is created and made current via the driver API, subsequent runtime calls will pick up this context instead of creating a new one.
2. If the runtime is initialized (implicitly as mentioned in CUDA Runtime), cuCtxGetCurrent() can be used to retrieve the context created during initialization. This context can be used by subsequent driver API calls.

I can get the first point to work: I create a context with the driver API and can then call runtime functions without calling cudaSetDevice(), which would implicitly create a new primary context.

However, I want to use the second option: initialize the runtime first, then call cuCtxGetCurrent() and use the returned context with the driver API. This does not work at all; I always get an error saying the context has been destroyed or is invalid. What am I doing wrong?

Here is my example code:

#define CUDA_DRIVER_API
#include <cuda.h>
#include <cuda_runtime.h>
#include <helper_cuda.h>
#include <iostream>
CUcontext check_current_ctx()
{
    CUcontext context{0};
    unsigned int api_ver;
    checkCudaErrors(cuCtxGetCurrent(&context));
    fprintf(stdout, "current context=%p\n", context);
    checkCudaErrors( cuCtxGetApiVersion(context, &api_ver));
    fprintf(stdout, "current context api version = %u\n", api_ver);
    return context;
}
auto inital_runtime_context()
{
    int current_device = 0;
    int device_count = 0;
    int devices_prohibited = 0;
    CUcontext current_ctx{0};

    cudaDeviceProp deviceProp;
    checkCudaErrors(cudaGetDeviceCount(&device_count));
    if (device_count == 0) {
        fprintf(stderr, "CUDA error: no devices supporting CUDA.\n");
        exit(EXIT_FAILURE);
    }

    // Find the GPU which is selected by Vulkan
    while (current_device < device_count) {
        cudaGetDeviceProperties(&deviceProp, current_device);
        if ((deviceProp.computeMode != cudaComputeModeProhibited)) {
            checkCudaErrors(cudaSetDevice(current_device));
            checkCudaErrors(cudaGetDeviceProperties(&deviceProp, current_device));
            printf("GPU Device %d: \"%s\" with compute capability %d.%d\n\n",
                current_device, deviceProp.name, deviceProp.major,
                deviceProp.minor);
            CUcontext current_ctx;
            cuCtxGetCurrent(&current_ctx);
            std::cout << "current_ctx=" << current_ctx << "\n";
            return current_device;

        } else {
            devices_prohibited++;
        }

        current_device++;
    }

    if (devices_prohibited == device_count) {
        fprintf(stderr,
            "CUDA error:"
            " No Vulkan-CUDA Interop capable GPU found.\n");
        exit(EXIT_FAILURE);
    }

    return -1;
}
void test_runtime_driver_op()
{
    inital_runtime_context();
    check_current_ctx();

}

It reports:

GPU Device 0: "GeForce RTX ..." with compute capability 7.5

current_ctx=0x6eb220
current context=0x6eb220
CUDA error at ... code=201(CUDA_ERROR_INVALID_CONTEXT) "cuCtxGetApiVersion(context, &api_ver)"

You might need to actually include an API call like `cudaFree(0)` to make the runtime API create a context. It is possible that your existing code isn't forcing lazy context creation — talonmies, Feb 09 '20 at 07:37
@talonmies Thanks a lot! This really works. But then is the documentation wrong? According to the documentation, cudaSetDevice() should already create the CUDA context. Would you please write this up as an answer? Then I will accept it. — Wang, Feb 09 '20 at 09:22
I don't think the documentation is wrong, but exactly when and how context creation happens in the runtime API has always been a bit ambiguous — talonmies, Feb 09 '20 at 09:26
Some further "light reading" for you: [How do the CUDA Runtime's current device and the driver context stack interact?](https://stackoverflow.com/questions/70189845/how-do-the-cuda-runtimes-current-device-and-the-driver-context-stack-interact?noredirect=1&lq=1) — einpoklum, Dec 01 '21 at 19:09

score 4 · Accepted Answer · answered Feb 09 '20 at 09:30

The reason you are getting an error is that, at least in this case, lazy runtime API context creation has not occurred when you try to bind to a context with the driver API. The canonical way to ensure you get a context created with the runtime has always been

cudaSetDevice(current_device);
cudaFree(0);

The documentation has always been ambiguous on this point, and the semantics seem to have changed subtly over time, but that invocation has always worked for me.

einpoklum · Answer 2 · 2021-12-06T21:32:11.407

@talonmies' answer is correct, in the sense that it will work, but if you want to address this from the driver side, you might try something like:

// ...
CUcontext check_current_ctx()
{
    CUcontext context{0};
    checkCudaErrors( cuCtxGetCurrent(&context) );
    CUdevice device;
    checkCudaErrors( cuCtxGetDevice(&device) );
    CUcontext primary_context;
    // Retaining the primary context also activates it if it was not active yet
    checkCudaErrors( cuDevicePrimaryCtxRetain(&primary_context, device) );
    unsigned int flags;
    int active;
    CUresult primary_state_check_result =
        cuDevicePrimaryCtxGetState(device, &flags, &active);

    // etc. etc.
}

Now you'll be able to check:

Whether the current context for the current device is the primary one (= the Runtime API's context).
Whether that primary context has been initialized or not (by comparing primary_state_check_result to CUDA_ERROR_DEINITIALIZED and CUDA_ERROR_NOT_INITIALIZED).

and then try to get the API version.

I should also mention that I've written a C++ wrapper layer which covers both the driver and the runtime API, and allows seamless use of both of them; see this branch of the cuda-api-wrappers library. These wrappers are not released yet (as of the time of writing), but you are very welcome to give them a spin.
