In my Python script I make extensive use of fft and ifft. To speed things up on my GTX 1060 6GB I use the CuPy library. After running into out-of-memory errors, I discovered that a memory leak was the cause.

I created the following code to investigate the problem. After calling cupy.fft.fft, more memory is allocated than the size of the output. When I delete that output, only the output's share of the memory is released, and I don't know how to release the rest. Is this a bug, or am I overlooking something?

import cupy as cp


t = cp.linspace(0, 1, 1000)
print("t      :", cp.get_default_memory_pool().used_bytes()/1024, "kB")

a = cp.sin(4 * t*2*3.1415)

print("t+a    :", cp.get_default_memory_pool().used_bytes()/1024, "kB")

fft = cp.fft.fft(a)

print("fft    :", fft.nbytes/1024, "kB")
print("t+a+fft:", cp.get_default_memory_pool().used_bytes()/1024, "kB")

del fft
cp.get_default_memory_pool().free_all_blocks()
cp.get_default_pinned_memory_pool().free_all_blocks()

print("t+a    :", cp.get_default_memory_pool().used_bytes()/1024, "kB")

del t,a
print("       :", cp.get_default_memory_pool().used_bytes()/1024, "kB")

Output:

t      : 8.0 kB
t+a    : 16.0 kB
fft    : 15.625 kB
t+a+fft: 48.0 kB
t+a    : 32.0 kB
       : 16.0 kB

I am using cupy-cuda101 version 8.1.0.

MazzMan

1 Answer

Hi @MazzMan, I didn't notice your SO question until now. As I replied in your ticket, this is not a bug but expected behavior: starting with v8.0, we cache cuFFT plans by default. Each plan holds some memory as its workspace, so that memory stays allocated until the plan is deleted or the cache is cleared. You can refer to CuPy's documentation on the plan cache and try disabling the cache, for example.

In your case, you can also run the following lines after your script to confirm the memory is freed after clearing the cache.

>>> cache = cp.fft.config.get_plan_cache()
>>> cache.clear()
>>> print("after clearing cache:", cp.get_default_memory_pool().used_bytes()/1024, "kB")
after clearing cache: 0.0 kB
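
If you would rather not have plans cached for a particular stretch of code, one option is to set the cache size to 0 temporarily and restore it afterwards. The following is only a sketch: `plan_cache_disabled` and `demo` are hypothetical helper names (not part of CuPy), and it assumes the cache object returned by `cp.fft.config.get_plan_cache()` provides `get_size()` and `set_size()`, with a size of 0 disabling caching as described in the plan cache documentation.

```python
from contextlib import contextmanager


@contextmanager
def plan_cache_disabled(cache):
    """Hypothetical helper: set the plan cache size to 0, then restore it.

    `cache` is assumed to expose get_size() and set_size(), like the
    object returned by cupy.fft.config.get_plan_cache().
    """
    old_size = cache.get_size()
    cache.set_size(0)  # a size of 0 is assumed to turn caching off
    try:
        yield cache
    finally:
        # Restore the previous size even if the body raised.
        cache.set_size(old_size)


def demo():
    """Needs CuPy and a CUDA device; call it manually to try the helper."""
    import cupy as cp

    pool = cp.get_default_memory_pool()
    cache = cp.fft.config.get_plan_cache()
    cache.clear()  # drop any plans cached by earlier FFT calls

    a = cp.sin(cp.linspace(0, 1, 1000))
    with plan_cache_disabled(cache):
        out = cp.fft.fft(a)
        del out
    del a

    pool.free_all_blocks()
    print("used after FFT with cache disabled:", pool.used_bytes() / 1024, "kB")
```

Calling `demo()` on a machine with a GPU prints the pool usage after an uncached FFT. Keep in mind that disabling the cache means every FFT call has to build its plan again, so this trades speed for memory.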
Leo Fang