If not, what is the standard way to free up cudaMalloc
ed memory when an exception is thrown? (Note that I am unable to use Thrust.)

mchen
- What would be throwing the exception? – talonmies May 12 '13 at 16:29
- Any function or class member - i.e. many things – mchen May 12 '13 at 16:43
- I mean, will the host code be throwing exceptions in response to an error from the CUDA runtime, or are these other error conditions not related to CUDA? – talonmies May 12 '13 at 16:52
- @MiloChen May I ask in which cases you need to free GPU memory after an exception? I guess it is when the exception does not abort the program, right? When does it occur in your case? – Vitality May 12 '13 at 17:23
- The corresponding free operation for [cudaMalloc()](http://docs.nvidia.com/cuda/cuda-runtime-api/index.html#group__CUDART__MEMORY_1g16a37ee003fcd9374ac8e6a5d4dee29e) is [cudaFree()](http://docs.nvidia.com/cuda/cuda-runtime-api/index.html#group__CUDART__MEMORY_1g02b08ab28cfc28c37976556044fb5335). It's also common to test that the pointer is not NULL before passing it to cudaFree, but this is not necessary in all cases. – Robert Crovella May 13 '13 at 01:04
- You could put the `cudaFree()` calls in a cleanup function that is called when catching the exceptions in your host code. However, the manner in which you should handle these exceptions really depends on what you are trying to achieve, and we won't be able to help you without some more details. – BenC May 13 '13 at 01:37
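That catch-and-free pattern can be sketched as follows (the variable names and error handling here are illustrative, not from the question):

```cpp
#include <cuda_runtime.h>
#include <stdexcept>

void run(size_t n) {
    float* d_data = nullptr;
    try {
        if (cudaMalloc(&d_data, n * sizeof(float)) != cudaSuccess)
            throw std::runtime_error("cudaMalloc failed");
        // ... host code that may throw ...
    } catch (...) {
        if (d_data) cudaFree(d_data);  // release device memory before propagating
        throw;                         // rethrow to the caller
    }
}
```

This works, but it has to be repeated at every call site that allocates, which is why the answers below prefer RAII.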
2 Answers
You can use the RAII idiom and put your `cudaMalloc()` and `cudaFree()` calls in the constructor and destructor of your object, respectively.
Once an exception is thrown, your destructor will be called, which will free the allocated memory.
If you wrap this object in a smart pointer (or make it behave like a pointer), you will get your CUDA smart pointer.
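A minimal sketch of such an RAII wrapper (the `DeviceBuffer` name and interface are illustrative, not from any library):

```cpp
#include <cuda_runtime.h>
#include <cstddef>
#include <stdexcept>

// RAII owner for a device allocation: cudaFree runs in the destructor,
// so the memory is released during stack unwinding when an exception is thrown.
template <typename T>
class DeviceBuffer {
public:
    explicit DeviceBuffer(size_t n) : ptr_(nullptr), n_(n) {
        if (cudaMalloc(&ptr_, n * sizeof(T)) != cudaSuccess)
            throw std::runtime_error("cudaMalloc failed");
    }
    ~DeviceBuffer() { if (ptr_) cudaFree(ptr_); }  // must never throw

    // Non-copyable (one owner per allocation), but movable.
    DeviceBuffer(const DeviceBuffer&) = delete;
    DeviceBuffer& operator=(const DeviceBuffer&) = delete;
    DeviceBuffer(DeviceBuffer&& o) noexcept : ptr_(o.ptr_), n_(o.n_) { o.ptr_ = nullptr; }

    T* get() const { return ptr_; }
    size_t size() const { return n_; }

private:
    T* ptr_;
    size_t n_;
};

void work() {
    DeviceBuffer<float> buf(1 << 20);    // device memory acquired here
    // ... kernel launches using buf.get() ...
    throw std::runtime_error("oops");    // buf's destructor still runs: no leak
}
```

Making the type non-copyable avoids double-free; a move constructor lets it be returned from factory functions.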

Sergey K.
You can use this custom `cuda::shared_ptr` implementation. As mentioned above, this implementation uses `std::shared_ptr` as a wrapper for CUDA device memory.
Usage example:

```cpp
std::shared_ptr<T[]> data_host = std::shared_ptr<T[]>(new T[n]);
// ...
// In host code:
fun::cuda::shared_ptr<T> data_dev;
data_dev->upload(data_host.get(), n);
// In .cu file:
// data_dev.data() points to device memory which contains data_host
```
The repository is in fact a single header file (`cudasharedptr.h`), so it is easy to adapt if necessary for your application.
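The core idea behind such a wrapper can be sketched with `std::shared_ptr` and a custom deleter (a sketch only; `make_device_array` is illustrative and not the actual `cudasharedptr.h` API):

```cpp
#include <cuda_runtime.h>
#include <cstddef>
#include <memory>
#include <stdexcept>

// Allocate n elements of device memory and wrap the pointer in a
// std::shared_ptr whose deleter calls cudaFree, so the allocation is
// released automatically when the last owner goes away - including
// during stack unwinding after an exception.
template <typename T>
std::shared_ptr<T> make_device_array(size_t n) {
    T* p = nullptr;
    if (cudaMalloc(&p, n * sizeof(T)) != cudaSuccess)
        throw std::runtime_error("cudaMalloc failed");
    return std::shared_ptr<T>(p, [](T* q) { cudaFree(q); });
}
```

Shared ownership also makes it safe to pass the handle between host-side components without tracking who frees it.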

Mahdi