
I have CUDA-based code and I want to incorporate OpenACC into some parts of it. However, the function I am trying to parallelize with OpenACC is sometimes called while CUDA work is in flight on the device and sometimes not.

My question is: how can I query the OpenACC runtime to see whether the device is busy or not? Are there any API calls for that?

Note: I am not completely familiar with CUDA, so the snippet below is only a rough sketch of the call pattern.

Sometimes the target function seq_function is called on the host while the device is busy with a computation, as below; at other times it is called when the device is idle.

cudaMalloc(&d_data, size);                   // allocate device memory
kernel<<<grid, block, 0, stream>>>(d_data);  // launch asynchronous CUDA work
...
// This is the function I am trying to parallelize with OpenACC
seq_function(...);
...
cudaStreamSynchronize(stream);               // wait for the launched work to finish
cudaFree(d_data);

So, I want to make my target function flexible:

  • if the device is busy with a CUDA-based computation => run on the host.
  • if the device is idle => run on the GPU through the OpenACC-enabled code.

Is there a way to find out whether the device is busy or not?
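
For the dispatch itself, OpenACC's standard `if` clause appears to cover the fallback: when its condition is false, the region executes on the host. A minimal sketch, assuming a device_busy flag filled in by whatever busy-query turns out to be available:

// device_busy is an assumed flag, set by whatever busy-query is available.
void seq_function(float *a, int n, int device_busy)
{
    // When device_busy is nonzero, the if clause is false and the loop
    // runs sequentially on the host; otherwise it is offloaded to the GPU.
    #pragma acc parallel loop if(!device_busy) copy(a[0:n])
    for (int i = 0; i < n; ++i)
        a[i] *= 2.0f;
}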

– mgNobody

1 Answer


I don't know of a way to programmatically get the device utilization. You can get the memory usage via cudaMemGetInfo, which you might be able to use to infer whether something is running on the GPU or not.
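
A minimal sketch of that heuristic, assuming the free-memory baseline is recorded before any CUDA work starts (record_baseline, device_seems_busy, and the 64 MiB slack are illustrative, not part of any API):

#include <cuda_runtime.h>

static size_t baseline_free = 0;

// Call once at startup, before any CUDA allocations or launches.
void record_baseline(void)
{
    size_t total;
    cudaMemGetInfo(&baseline_free, &total);
}

// Heuristic: if noticeably less memory is free now than at startup,
// assume CUDA work currently holds the device.
int device_seems_busy(void)
{
    size_t free_now, total;
    cudaMemGetInfo(&free_now, &total);
    return free_now + (64u << 20) < baseline_free;   // 64 MiB slack, arbitrary
}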

– Mat Colgrove
  • Thanks Mat. Although it is CUDA- and NVIDIA-specific and not the general method OpenACC promises, it seems to be the only way to find out. – mgNobody Jul 15 '16 at 23:08
  • 1
    PGI does offer a extensions to the OpenACC API, "acc_get_memory" and "acc_get_free_memory", which essentially does the same thing as cudaMemGetInfo. I only suggested cudaMemGetInfo given you were using CUDA elsewhere in your code. – Mat Colgrove Jul 18 '16 at 15:11
  • I am using OpenACC alongside CUDA (CUDA is used elsewhere in the code), adding a feature to previously written CUDA-based code. So having this extension helps a lot and makes my piece of code independent of CUDA. – mgNobody Jul 19 '16 at 15:28
  • But the value returned as "free" is different from the total value at the beginning (as discussed [here](http://stackoverflow.com/questions/8684770/how-is-cuda-memory-managed) too). Therefore, there is no way to do so unless we record the free (available) memory at the beginning and compare the current free memory with that. – mgNobody Jul 19 '16 at 19:20
  • 1
    If you're willing to use concepts specific to NVIDIA accelerators, the NVML api allows you to query device utilization data directly. Most of what you can retrieve with `nvidia-smi -a` can be retrieved through NVML, the library behind the [GPU deployment kit](https://developer.nvidia.com/gpu-deployment-kit). For example `nvmlDeviceGetUtilizationRates()` – Robert Crovella Jul 20 '16 at 22:14