I have a large 16K*16K matrix, and the GPU's global memory is not enough. How can I calculate the two-dimensional FFT of this matrix?
- In all likelihood you can't. There are out-of-core FFT algorithms. I am unaware of any GPU implementations. Certainly none in cuFFT – talonmies Jan 07 '21 at 04:37
- You should be able to do a `float` R2C or C2R in-place transform of that size on a 3GB GPU (using CUFFT). [Here](https://docs.nvidia.com/cuda/cufft/index.html#twod-complex-to-real-transforms) is the framework. I can do a 15Kx15K transform on a 2GB GPU. – Robert Crovella Jan 07 '21 at 05:09
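A rough idea of the memory involved: a 16384 x 16384 in-place R2C transform stores 16384 x 8193 complex floats, roughly 1 GB, plus cuFFT's work area, which is why it can fit on a 3 GB card. A minimal sketch of that in-place approach (untested; only the sizes come from the question, and error checking is omitted for brevity) could look like this:

```cpp
#include <cufft.h>
#include <cuda_runtime.h>

int main() {
    // Sizes from the question: 16K x 16K real input.
    const int NX = 16384, NY = 16384;
    const int NYH = NY / 2 + 1;             // complex columns after R2C

    // In-place R2C needs each real row padded to 2*(NY/2+1) floats,
    // so the whole buffer is NX * NYH complex values (~1.0 GB here).
    cufftReal *d_data = nullptr;
    cudaMalloc(&d_data, sizeof(cufftComplex) * (size_t)NX * NYH);

    // ... copy the real input in with a row pitch of 2*NYH floats ...

    cufftHandle plan;                        // error checking omitted for brevity
    cufftPlan2d(&plan, NX, NY, CUFFT_R2C);
    cufftExecR2C(plan, d_data, (cufftComplex *)d_data);
    cudaDeviceSynchronize();

    // d_data now holds the NX x NYH complex spectrum in place.
    cufftDestroy(plan);
    cudaFree(d_data);
    return 0;
}
```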
1 Answer
Perhaps oversubscribing GPU memory with unified memory works with cuFFT?
https://developer.nvidia.com/blog/unified-memory-cuda-beginners/
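A minimal sketch of that unified-memory route (untested; it assumes a Pascal-or-newer GPU under Linux, where managed allocations can exceed device memory, and it also moves cuFFT's work area into managed memory via `cufftSetAutoAllocation`/`cufftSetWorkArea`):

```cpp
#include <cufft.h>
#include <cuda_runtime.h>

int main() {
    const int N = 16384;

    // ~2.1 GB of complex data in managed (unified) memory; on Pascal or
    // newer under Linux this may exceed the GPU's physical memory.
    cufftComplex *data = nullptr;
    cudaMallocManaged(&data, sizeof(cufftComplex) * (size_t)N * N);
    // ... fill `data` with the input matrix (host code can write it directly) ...

    // Create the plan without letting cuFFT cudaMalloc its own work area,
    // so the work buffer can also live in managed memory.
    cufftHandle plan;                        // error checking omitted for brevity
    size_t workBytes = 0;
    cufftCreate(&plan);
    cufftSetAutoAllocation(plan, 0);
    cufftMakePlan2d(plan, N, N, CUFFT_C2C, &workBytes);

    void *work = nullptr;
    cudaMallocManaged(&work, workBytes);
    cufftSetWorkArea(plan, work);

    cufftExecC2C(plan, data, data, CUFFT_FORWARD);
    cudaDeviceSynchronize();

    cufftDestroy(plan);
    cudaFree(work);
    cudaFree(data);
    return 0;
}
```

Whether oversubscription actually covers the work area on your system, and how much the page migration costs, would have to be measured.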
You can also compute the 1D FFTs for the rows and the columns separately, moving the data to and from host memory in between.
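One possible shape of that row/column split, assuming the full ~2 GB complex matrix fits in host RAM (the chunk size and the naive host-side transpose are just placeholders to be tuned):

```cpp
#include <cufft.h>
#include <cuda_runtime.h>
#include <utility>
#include <vector>

const int N = 16384;     // matrix is N x N, complex single precision
const int CHUNK = 1024;  // rows per batch; one chunk here is ~134 MB on the GPU

// Run a 1-D FFT over every row of the host matrix, CHUNK rows at a time,
// staging each chunk through a small device buffer.
void fftRows(std::vector<cufftComplex> &host) {
    const size_t chunkBytes = sizeof(cufftComplex) * (size_t)CHUNK * N;
    cufftComplex *d_buf = nullptr;           // error checking omitted for brevity
    cudaMalloc(&d_buf, chunkBytes);

    cufftHandle plan;                        // batched 1-D C2C transforms of length N
    cufftPlan1d(&plan, N, CUFFT_C2C, CHUNK);

    for (int row = 0; row < N; row += CHUNK) {
        cufftComplex *src = host.data() + (size_t)row * N;
        cudaMemcpy(d_buf, src, chunkBytes, cudaMemcpyHostToDevice);
        cufftExecC2C(plan, d_buf, d_buf, CUFFT_FORWARD);
        cudaMemcpy(src, d_buf, chunkBytes, cudaMemcpyDeviceToHost);
    }
    cufftDestroy(plan);
    cudaFree(d_buf);
}

// Naive in-place transpose of the square host matrix (single-threaded).
void transposeOnHost(std::vector<cufftComplex> &host) {
    for (int i = 0; i < N; ++i)
        for (int j = i + 1; j < N; ++j)
            std::swap(host[(size_t)i * N + j], host[(size_t)j * N + i]);
}

int main() {
    std::vector<cufftComplex> host((size_t)N * N);  // ~2 GB of host RAM
    // ... fill `host` with the input matrix ...
    fftRows(host);           // 1-D FFTs along the rows
    transposeOnHost(host);   // columns become rows
    fftRows(host);           // 1-D FFTs along the original columns
    transposeOnHost(host);   // restore the original layout
    return 0;
}
```

Pinned host memory and overlapping the copies with the transforms would speed this up considerably.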
Do you need the full result matrix? How much memory do you have on CPU and on GPU? Are the inputs/outputs complex values? What precision do you need (is 16 bits enough)? Is the computation time-critical? Do you also want to process even larger matrices?

Sebastian