I am having trouble with high memory usage when performing ffts with scipy's fftpack. Example obtained with the module memory_profiler:
Line # Mem usage Increment Line Contents
================================================
4 50.555 MiB 0.000 MiB @profile
5 def test():
6 127.012 MiB 76.457 MiB a = np.random.random(int(1e7))
7 432.840 MiB 305.828 MiB b = fftpack.fft(a)
8 891.512 MiB 458.672 MiB c = fftpack.ifft(b)
9 585.742 MiB -305.770 MiB del b, c
10 738.629 MiB 152.887 MiB b = fftpack.fft(a)
11 891.512 MiB 152.883 MiB c = fftpack.ifft(b)
12 509.293 MiB -382.219 MiB del a, b, c
13 547.520 MiB 38.227 MiB a = np.random.random(int(5e6))
14 700.410 MiB 152.891 MiB b = fftpack.fft(a)
15 929.738 MiB 229.328 MiB c = fftpack.ifft(b)
16 738.625 MiB -191.113 MiB del a, b, c
17 784.492 MiB 45.867 MiB a = np.random.random(int(6e6))
18 967.961 MiB 183.469 MiB b = fftpack.fft(a)
19 1243.160 MiB 275.199 MiB c = fftpack.ifft(b)
My attempt at understanding what is going on here:
The amount of memory allocated by both fft and ifft on lines 7 and 8 is more than what they need to allocate to return a result. For the call
b = fftpack.fft(a)
, 305 MiB is allocated. The amount of memory needed for theb
array is16 B/value * 1e7 values = 160 MiB
(16 B per value as the code is returningcomplex128
). It seems that fftpack is allocating some type of workspace, and that the workspace is equal in size to the output array (?).On lines 10 and 11 the same procedure is run again, but the memory usage is less this time, and more in line with what I expect. It therefore seems that fftpack is able to reuse the workspace.
On lines 13-15 and 17-19 ffts with different, smaller input sizes are performed. In both of these cases more memory than what is needed is allocated, and memory does not seem to be reused.
The memory usage reported above agrees with what windows task manager reports (to the accuracy I am able to read those graphs). If I write such a script with larger input sizes, I can make my (windows) computer very slow, indicating that it is swapping.
A second example to illustrate the problem of the memory allocated for workspace:
factor = 4.5
a = np.random.random(int(factor * 3e7))
start = time()
b = fftpack.fft(a)
c = fftpack.ifft(b)
end = time()
print("Elapsed: {:.4g}".format(end - start))
del a, b, c
print("Finished first fft")
a = np.random.random(int(factor * 2e7))
start = time()
b = fftpack.fft(a)
c = fftpack.ifft(b)
end = time()
print("Elapsed: {:.4g}".format(end - start))
del a, b, c
print("Finished first fft")
The code prints the following:
Elapsed: 17.62
Finished first fft
Elapsed: 38.41
Finished first fft
Filename: ffttest.py
Notice how the second fft, which has the smaller input size, takes more than twice as long to compute. I noticed that my computer was very slow (likely swapping) during the execution of this script.
Questions:
Is it correct that the fft can be calculated inplace, without the need for extra workspace? If so, why does not fftpack do that?
Is there a problem with fftpack here? Even if it needs extra workspace, why does it not reuse its workspace when the fft is rerun with different input sizes?
EDIT:
Old, but possibly related: https://mail.scipy.org/pipermail/scipy-dev/2012-March/017286.html
Is this the answer? https://github.com/scipy/scipy/issues/5986