
Is it valid to call cudaFree right after some asynchronous calls? For example:

int* dev_a;

// prepare dev_a...

// launch a kernel to process dev_a (asynchronously)

cudaFree(dev_a);

In this case, since the kernel launch is asynchronous, the kernel may not have finished running by the time cudaFree is reached. Will the cudaFree(dev_a) call immediately after the launch destroy the data while the kernel is still using it?

shaoyl85

2 Answers


As per Jared's comment, I am about 99% certain that the CUDA driver free/malloc pair is implemented as blocking calls that synchronize the context on which they operate before they execute.
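To make the pattern from the question concrete, here is a minimal sketch. The kernel `increment` and the helper `launch_then_free` are hypothetical names used only for illustration. If cudaFree blocks as described above, the explicit cudaDeviceSynchronize() call should not be needed for correctness; it is included only to make the ordering visible and to report any error raised while the kernel ran.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel standing in for "process dev_a".
__global__ void increment(int* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

// Launches the kernel, then frees the buffer.
// Returns the first CUDA error encountered, or cudaSuccess.
cudaError_t launch_then_free(int n) {
    int* dev_a = nullptr;
    cudaError_t err = cudaMalloc(&dev_a, n * sizeof(int));
    if (err != cudaSuccess) return err;

    err = cudaMemset(dev_a, 0, n * sizeof(int));
    if (err != cudaSuccess) { cudaFree(dev_a); return err; }

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    increment<<<blocks, threads>>>(dev_a, n);  // asynchronous: returns at once

    // Optional explicit wait; also reports errors raised during execution.
    cudaError_t sync_err = cudaDeviceSynchronize();

    cudaError_t free_err = cudaFree(dev_a);
    return sync_err != cudaSuccess ? sync_err : free_err;
}

int main() {
    cudaError_t err = launch_then_free(1 << 20);
    std::printf("%s\n", cudaGetErrorString(err));
    return err == cudaSuccess ? 0 : 1;
}
```

Checking the return value of cudaFree is worthwhile here because errors from earlier asynchronous work can surface on later runtime calls.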

talonmies
  • Thank you! How about the "free" function inside the kernel? If I have a kernel launch immediately proceeding it, does this work? – shaoyl85 Jan 18 '14 at 00:24

CUDA now provides functions for asynchronous, stream-based memory management: cudaMallocAsync, cudaFreeAsync, and cudaMemcpyAsync.

A short introduction is available here
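As a rough sketch of how these stream-ordered calls fit together (assuming CUDA 11.2 or newer and a device that supports memory pools), the allocation, kernel, copy, and free below are all queued on one stream, so the free is ordered after the work that uses the buffer. The kernel `increment` and the helper `stream_ordered_roundtrip` are hypothetical names chosen for this example.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: adds 1 to every element.
__global__ void increment(int* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

// Allocates, processes, copies back, and frees on a single stream.
// Returns true if all calls succeeded and every element equals 1.
bool stream_ordered_roundtrip(int n) {
    cudaStream_t stream;
    if (cudaStreamCreate(&stream) != cudaSuccess) return false;

    const size_t bytes = static_cast<size_t>(n) * sizeof(int);
    void* raw = nullptr;
    if (cudaMallocAsync(&raw, bytes, stream) != cudaSuccess) {
        cudaStreamDestroy(stream);
        return false;
    }
    int* dev_a = static_cast<int*>(raw);

    std::vector<int> host(n, -1);
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    cudaMemsetAsync(dev_a, 0, bytes, stream);
    increment<<<blocks, threads, 0, stream>>>(dev_a, n);
    cudaMemcpyAsync(host.data(), dev_a, bytes, cudaMemcpyDeviceToHost, stream);
    cudaFreeAsync(dev_a, stream);  // queued after the copy on the same stream

    // Wait for everything queued on the stream before reading host data.
    bool ok = cudaStreamSynchronize(stream) == cudaSuccess;
    cudaStreamDestroy(stream);

    for (int v : host) ok = ok && (v == 1);
    return ok;
}

int main() {
    std::printf("%s\n", stream_ordered_roundtrip(1 << 20) ? "ok" : "failed");
    return 0;
}
```

The host only waits once, at cudaStreamSynchronize, rather than at the free itself.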

pixelou