
Is it valid to free device-allocated memory from a host function? I'm writing a C++ class that should be usable on both host and device. My constructor and destructor look something like this:

class myClass {
public:
__host__ __device__ myClass() {
#if defined(__CUDA_ARCH__)
  data = (char*)malloc(DATA_SIZE);
#else
  cudaMalloc(&data,DATA_SIZE);
#endif
}

__host__ __device__ ~myClass() {
#if defined(__CUDA_ARCH__)
  free(data);
#else
  cudaFree(data);
#endif
}

private:
  char* data;
};

The above code compiles, and I didn't get an error when I constructed an object on the device and freed it on the host. But this case is not documented in the CUDA developer documentation.

Thomas Berger
  • I think you have your cases backwards. If `__CUDA_ARCH__` is defined, wouldn't you then want to use the `cuda` functions? – Jonathan Grynspan Sep 29 '12 at 20:25
  • If `__CUDA_ARCH__` is defined, the code is compiled for the device. On the device, I have to use `malloc` and `free`. Only on the host is there a need to use `cudaMalloc` and `cudaFree`. – Thomas Berger Sep 29 '12 at 20:36
  • As long as malloc and cudaMalloc and free and cudaFree are operating on the same heap, I would think it would be ok. – grieve Sep 29 '12 at 21:23
  • What makes you think that you've created a class on the device and then freed it on the host? The [CUDA C programming guide v4.2](http://developer.download.nvidia.com/compute/DevZone/docs/html/C/doc/CUDA_C_Programming_Guide.pdf) states in section B.17: "Memory allocated via malloc() cannot be freed using the runtime (i.e. by calling any of the free memory functions from Sections 3.2.2)." That statement refers to device-malloc'ed memory (on the device heap). – Robert Crovella Sep 29 '12 at 22:17
  • Also, if you think you're not getting an error on a particular cudaFree call, you should probably be explicitly checking for CUDA errors after that cudaFree call. Maybe you are, but I don't see it in the sample code you posted. – Robert Crovella Sep 29 '12 at 22:46
  • Thanks @Robert for the pointer to the guide. I somehow missed this part on my last read-through. I'm using CUDA 5 RC2, so maybe something has changed that is not yet documented? I create the object in a kernel call and destruct it from the host. I do check the return code of cudaFree; I just stripped the example code down too much. But I think your comment is the correct answer. Could you please post it so I can accept it? – Thomas Berger Sep 29 '12 at 23:12
  • Switching between host and device code by only checking whether `__CUDA_ARCH__` is defined fails in some cases; you should also check that it is not zero. More details [here](http://stackoverflow.com/questions/16313434/check-whether-the-code-is-running-on-the-gpu-or-cpu). – avtomaton Dec 03 '14 at 21:09

2 Answers


For both CUDA 4.2 and CUDA 5.0 RC, the CUDA C Programmer's guide mentions in section B.17 that: "Memory allocated via malloc() cannot be freed using the runtime (i.e., by calling any of the free memory functions from Device Memory)." (This particular text is taken from the CUDA 5.0 RC document. In the original doc, Device Memory is a hyperlink to section 3.2.2.) The CUDA 4.2 document can be found here (it has similar wording).

I'm wondering if:

1.) Things are actually happening the way you think. Depending on how you allocate the variable in the kernel code, I think it could go out of scope at the completion of the kernel, which would implicitly call your (device-side) destructor.

2.) A call to cudaFree with an invalid pointer (perhaps because it is a device pointer, or perhaps because it has already been freed) simply gets ignored.

Without your code to play with, it's all just speculation. But if you're checking for errors and not getting one, then it may be getting ignored.
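One way to test the second possibility is to inspect the status that cudaFree returns for a device-heap pointer, rather than assuming the call did anything. The following is only a minimal sketch: the names `CHECK`, `allocOnDevice`, `freeOnDevice` and `runDemo` are made up for illustration, and it assumes a device that supports in-kernel malloc() (compute capability 2.0 or later). What the call prints may depend on the CUDA version, so no expected output is given.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not part of CUDA): report a failed runtime call and bail out.
#define CHECK(call)                                                    \
    do {                                                               \
        cudaError_t e_ = (call);                                       \
        if (e_ != cudaSuccess) {                                       \
            std::fprintf(stderr, "%s failed: %s\n", #call,             \
                         cudaGetErrorString(e_));                      \
            return 1;                                                  \
        }                                                              \
    } while (0)

// Allocate from the device heap and publish the pointer through 'slot'.
__global__ void allocOnDevice(char** slot, size_t bytes) {
    *slot = static_cast<char*>(malloc(bytes));
}

// Release a device-heap pointer with in-kernel free().
__global__ void freeOnDevice(char* p) {
    free(p);
}

int runDemo() {
    char** d_slot = nullptr;   // cudaMalloc'ed cell that receives the heap pointer
    char*  heapPtr = nullptr;  // host copy of the device-heap pointer

    CHECK(cudaMalloc(&d_slot, sizeof(char*)));
    allocOnDevice<<<1, 1>>>(d_slot, 256);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());
    CHECK(cudaMemcpy(&heapPtr, d_slot, sizeof(char*), cudaMemcpyDeviceToHost));

    if (heapPtr == nullptr) {  // cudaFree(nullptr) would trivially succeed
        std::fprintf(stderr, "device malloc() returned a null pointer\n");
        return 1;
    }

    // The call in question: print its status instead of assuming success.
    cudaError_t status = cudaFree(heapPtr);
    std::printf("cudaFree(device-heap pointer) -> %s\n",
                cudaGetErrorString(status));
    (void)cudaGetLastError();  // reset the last-error state before continuing

    // Release the allocation the documented way, from device code.
    freeOnDevice<<<1, 1>>>(heapPtr);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());

    CHECK(cudaFree(d_slot));   // d_slot came from cudaMalloc, so cudaFree applies
    return 0;
}

int main() { return runDemo(); }
```

Even if the printed status happens to be "no error" on some setup, that alone would not show that the device-heap block was actually released.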

Robert Crovella

Unless NVIDIA has recently lifted this limitation, you must free memory allocated by in-kernel malloc() by calling in-kernel free(); i.e., you cannot call cudaFree() to free memory that was allocated within a kernel with malloc().

It may not be returning an error, but it also may be leaking the memory.
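A pattern that stays within this rule is to let device code own the whole lifetime of the object, with the host only launching kernels. The sketch below is an illustration, not the asker's actual class: `DeviceBuffer`, `createBuffer`, `destroyBuffer` and `runDeviceOwnedLifetime` are hypothetical names, and it assumes a device that supports in-kernel malloc() and device-side `new`/`delete` (compute capability 2.0 or later).

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical device-only variant of the class from the question:
// both the constructor and the destructor run in device code.
class DeviceBuffer {
public:
    __device__ explicit DeviceBuffer(size_t bytes)
        : data(static_cast<char*>(malloc(bytes))) {}   // device-heap allocation
    __device__ ~DeviceBuffer() { free(data); }         // matching in-kernel free()

    // Copying is disabled so the buffer is never freed twice.
    DeviceBuffer(const DeviceBuffer&) = delete;
    DeviceBuffer& operator=(const DeviceBuffer&) = delete;

private:
    char* data;
};

// Construct the object on the device heap and publish its address.
__global__ void createBuffer(DeviceBuffer** slot, size_t bytes) {
    *slot = new DeviceBuffer(bytes);
}

// Destroy the object from device code, which runs the device-side destructor.
__global__ void destroyBuffer(DeviceBuffer** slot) {
    delete *slot;
    *slot = nullptr;
}

int runDeviceOwnedLifetime() {
    DeviceBuffer** d_slot = nullptr;
    if (cudaMalloc(&d_slot, sizeof(DeviceBuffer*)) != cudaSuccess) return 1;

    createBuffer<<<1, 1>>>(d_slot, 256);
    if (cudaDeviceSynchronize() != cudaSuccess) return 1;

    // ... other kernels could use the object through d_slot here ...

    destroyBuffer<<<1, 1>>>(d_slot);
    if (cudaDeviceSynchronize() != cudaSuccess) return 1;

    // Only the cudaMalloc'ed slot is released through the runtime API.
    if (cudaFree(d_slot) != cudaSuccess) return 1;
    return 0;
}

int main() {
    int rc = runDeviceOwnedLifetime();
    std::printf("device-owned lifetime demo %s\n", rc == 0 ? "completed" : "failed");
    return rc;
}
```

The host never touches the device-heap pointer here; it only manages `d_slot`, which it allocated itself with cudaMalloc.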

ArchaeaSoftware