
Whenever I call cudaMemPrefetchAsync() it returns the error code cudaErrorInvalidDevice. I am sure that I am passing the right device id (I have only one CUDA-capable GPU in my laptop, with id == 0).

I believe the code sample posted below is error-free, but at line 52 (the call to cudaMemPrefetchAsync()) I keep getting this error.


I tried:

  1. A clean driver installation (latest version).
  2. Searching Google for an answer, but I could not find any. (I managed only to find this.)

(I have no idea what else to try.)


System Spec:

OS: Microsoft Windows 8.1 x64 Home
IDE: Visual Studio 2015
CUDA toolkit: 8.0.61
NVIDIA GPU: GeForce GTX 960M
NVIDIA GPU driver: ver. 381.65 (latest)
Compute Capability: 5.0 (Maxwell)
Unified Memory: supported
Intel integrated GPU: Intel HD Graphics 4600


Code Sample:

/////////////////////////////////////////////////////////////////////////////////////////////////////////
// TEST AREA:
// -- INCLUDE: 
/////////////////////////////////////////////////////////////////////////////////////////////////////////

// Cuda Libs: ( Device Side ):
#include <cuda_runtime.h>
#include <device_launch_parameters.h>

// Std C++ Libs:
#include <cstdio>       // getchar()
#include <iostream>
#include <iomanip>
///////////





/////////////////////////////////////////////////////////////////////////////////////////////////////////
// TEST AREA:
// -- NAMESPACE:
/////////////////////////////////////////////////////////////////////////////////////////////////////////
using namespace std;
///////////





/////////////////////////////////////////////////////////////////////////////////////////////////////////
// TEST AREA:
// -- START POINT:
/////////////////////////////////////////////////////////////////////////////////////////////////////////
int main() {

    // Set cuda Device:
    if (cudaSetDevice(0) != cudaSuccess)
        cout << "ERROR: cudaSetDevice(***)" << endl;

    // Array:
    unsigned int size = 1000;
    double * d_ptr = nullptr;

    // Allocate unified memory:
    if (cudaMallocManaged(&d_ptr, size * sizeof(double), cudaMemAttachGlobal) != cudaSuccess)
        cout << "ERROR: cudaMallocManaged(***)" << endl;

    if (cudaDeviceSynchronize() != cudaSuccess)
        cout << "ERROR: cudaDeviceSynchronize(***)" << endl;

    // Prefetch:
    if(cudaMemPrefetchAsync(d_ptr, size * sizeof(double), 0) != cudaSuccess)
        cout << "ERROR: cudaMemPrefetchAsync(***)" << endl;

    // Exit:
    getchar();
}
///////////
  • The documentation says the device must have a non-zero cudaDevAttrConcurrentManagedAccess attribute. Have you checked that? – talonmies Apr 15 '17 at 19:26
  • Well, I have now, and it says that I don't have support for this: `(cudaDeviceProp)devProp.concurrentManagedAccess == 0`. But from what I understand, this feature was introduced with `compute capability == 6.0` (Pascal), while `cudaMemPrefetchAsync(***)` was first introduced for devices with `compute capability == 3.0` (Kepler). Therefore I should still be able to use it even though I am only on `5.0`. – PatrykB Apr 15 '17 at 19:37
  • @talonmies Could you be so kind as to point me to the documentation page where you found it? – PatrykB Apr 15 '17 at 19:40
  • http://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY.html#group__CUDART__MEMORY_1ge8dc9199943d421bc8bc7f473df12e42 – talonmies Apr 15 '17 at 19:59
  • Thank you. You were right, I was wrong. My GPU does not support this feature. I can still use unified memory, but without `cudaMemPrefetchAsync(***)`. – PatrykB Apr 15 '17 at 20:03
  • Please add a short answer to your question. – talonmies Apr 15 '17 at 20:04
  • This API call is for Pascal GPUs. You don't have a Pascal GPU, and the call would be redundant anyway because [pre-Pascal UM will automatically migrate managed data to the GPU](http://stackoverflow.com/questions/39782746/why-is-nvidia-pascal-gpus-slow-on-running-cuda-kernels-when-using-cudamallocmana/40011988#40011988) at kernel launch. – Robert Crovella Apr 15 '17 at 20:19

1 Answer


Thanks to talonmies I have realized that my GPU does not support the prefetch feature. In order to use cudaMemPrefetchAsync(***), the GPU must have a non-zero value in (cudaDeviceProp)deviceProp.concurrentManagedAccess.

See more here.
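For reference, here is a minimal sketch (not part of my original code, just how I do the check now) that queries concurrentManagedAccess first and only prefetches when the device actually supports it:

// Check device support for concurrent managed access before prefetching.
#include <cuda_runtime.h>
#include <iostream>

int main() {

    // Set cuda Device:
    int device = 0;
    if (cudaSetDevice(device) != cudaSuccess)
        std::cout << "ERROR: cudaSetDevice(***)" << std::endl;

    // Query the device properties; cudaDeviceGetAttribute() with
    // cudaDevAttrConcurrentManagedAccess would work just as well.
    cudaDeviceProp devProp;
    cudaGetDeviceProperties(&devProp, device);

    // Allocate unified memory:
    double * d_ptr = nullptr;
    size_t bytes = 1000 * sizeof(double);
    if (cudaMallocManaged(&d_ptr, bytes, cudaMemAttachGlobal) != cudaSuccess)
        std::cout << "ERROR: cudaMallocManaged(***)" << std::endl;

    if (devProp.concurrentManagedAccess != 0) {
        // Supported (Pascal and newer): prefetch the managed allocation to the GPU.
        cudaError_t err = cudaMemPrefetchAsync(d_ptr, bytes, device);
        std::cout << "cudaMemPrefetchAsync: " << cudaGetErrorString(err) << std::endl;
    } else {
        // Not supported (e.g. my GTX 960M): skip the prefetch; managed data
        // is migrated automatically at kernel launch on pre-Pascal GPUs.
        std::cout << "concurrentManagedAccess == 0, skipping prefetch" << std::endl;
    }

    cudaFree(d_ptr);
    return 0;
}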

  • Please remember to come back in a couple of days and accept this answer so your question falls off the unanswered list for the CUDA tag. – talonmies Apr 16 '17 at 10:48