If not, what is the standard way to free up cudaMalloc
ed memory when an exception is thrown? (Note that I am unable to use Thrust.)

mchen
- What would be throwing the exception? – talonmies May 12 '13 at 16:29
- Any function or class member - i.e. many things – mchen May 12 '13 at 16:43
- I mean, will the host code be throwing exceptions in response to an error from the CUDA runtime, or are these other error conditions not related to CUDA? – talonmies May 12 '13 at 16:52
- @MiloChen May I ask in which cases you need to free GPU memory after an exception? I guess it is when the exception does not abort the program, right? When does it occur in your case? – Vitality May 12 '13 at 17:23
- The corresponding free operation for [cudaMalloc()](http://docs.nvidia.com/cuda/cuda-runtime-api/index.html#group__CUDART__MEMORY_1g16a37ee003fcd9374ac8e6a5d4dee29e) is [cudaFree()](http://docs.nvidia.com/cuda/cuda-runtime-api/index.html#group__CUDART__MEMORY_1g02b08ab28cfc28c37976556044fb5335). It's also common to test that the pointer is not NULL before passing it to cudaFree, but this is not necessary in all cases. – Robert Crovella May 13 '13 at 01:04
- You could put the `cudaFree()` calls in a cleanup function that is called when catching the exceptions in your host code. However, the manner in which you should handle these exceptions really depends on what you are trying to achieve, and we won't be able to help you without some more details. – BenC May 13 '13 at 01:37
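That catch-and-free pattern can be sketched as follows (the variable names and error handling here are illustrative, not from the question):

```cpp
#include <cuda_runtime.h>
#include <stdexcept>

void run(size_t n) {
    float* d_data = nullptr;
    try {
        if (cudaMalloc(&d_data, n * sizeof(float)) != cudaSuccess)
            throw std::runtime_error("cudaMalloc failed");
        // ... host code that may throw ...
    } catch (...) {
        if (d_data) cudaFree(d_data);  // release device memory before propagating
        throw;                         // rethrow to the caller
    }
}
```

This works, but it has to be repeated at every call site that allocates, which is why the answers below prefer RAII.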
2 Answers
You can use the RAII idiom and put your `cudaMalloc()` and `cudaFree()` calls in the constructor and destructor of your object, respectively.
Once an exception is thrown, your destructor will be called, which will free the allocated memory.
If you wrap this object in a smart pointer (or make it behave like a pointer), you will get your CUDA smart pointer.
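A minimal sketch of such an RAII wrapper (the `DeviceBuffer` name and interface are illustrative, not from any library):

```cpp
#include <cuda_runtime.h>
#include <cstddef>
#include <stdexcept>

// RAII owner for a device allocation: cudaFree runs in the destructor,
// so the memory is released during stack unwinding when an exception is thrown.
template <typename T>
class DeviceBuffer {
public:
    explicit DeviceBuffer(size_t n) : ptr_(nullptr), n_(n) {
        if (cudaMalloc(&ptr_, n * sizeof(T)) != cudaSuccess)
            throw std::runtime_error("cudaMalloc failed");
    }
    ~DeviceBuffer() { if (ptr_) cudaFree(ptr_); }  // must never throw

    // Non-copyable (one owner per allocation), but movable.
    DeviceBuffer(const DeviceBuffer&) = delete;
    DeviceBuffer& operator=(const DeviceBuffer&) = delete;
    DeviceBuffer(DeviceBuffer&& o) noexcept : ptr_(o.ptr_), n_(o.n_) { o.ptr_ = nullptr; }

    T* get() const { return ptr_; }
    size_t size() const { return n_; }

private:
    T* ptr_;
    size_t n_;
};

void work() {
    DeviceBuffer<float> buf(1 << 20);    // device memory acquired here
    // ... kernel launches using buf.get() ...
    throw std::runtime_error("oops");    // buf's destructor still runs: no leak
}
```

Making the type non-copyable avoids double-free; a move constructor lets it be returned from factory functions.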

Sergey K.
You can use this custom `cuda::shared_ptr` implementation. As mentioned above, this implementation uses `std::shared_ptr` as a wrapper for CUDA device memory.
Usage example:

```cpp
std::shared_ptr<T[]> data_host = std::shared_ptr<T[]>(new T[n]);
// ...
// In host code:
fun::cuda::shared_ptr<T> data_dev;
data_dev->upload(data_host.get(), n);
// In .cu file:
// data_dev.data() points to device memory which contains data_host
```
The repository is in fact a single header file (`cudasharedptr.h`), so it is easy to adapt if necessary for your application.
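The core idea behind such a wrapper can be sketched with `std::shared_ptr` and a custom deleter (a sketch only; `make_device_array` is illustrative and not the actual `cudasharedptr.h` API):

```cpp
#include <cuda_runtime.h>
#include <cstddef>
#include <memory>
#include <stdexcept>

// Allocate n elements of device memory and wrap the pointer in a
// std::shared_ptr whose deleter calls cudaFree, so the allocation is
// released automatically when the last owner goes away - including
// during stack unwinding after an exception.
template <typename T>
std::shared_ptr<T> make_device_array(size_t n) {
    T* p = nullptr;
    if (cudaMalloc(&p, n * sizeof(T)) != cudaSuccess)
        throw std::runtime_error("cudaMalloc failed");
    return std::shared_ptr<T>(p, [](T* q) { cudaFree(q); });
}
```

Shared ownership also makes it safe to pass the handle between host-side components without tracking who frees it.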

Mahdi