1

I use pyculib to perform 3D FFT on a matrix in Anaconda 3.5. I just followed the example code posted in the website. But I found something interesting and don't understand why.

Performing a 3D FFT on matrix with pyculib is correct only when using numpy.arange to create the matrix.

Here is the code:

from pyculib.fft.binding import Plan, CUFFT_C2C
import numpy as np
from numba import cuda
data = np.random.rand(26, 256, 256).astype(np.complex64)
orig = data.copy()
d_data = cuda.to_device(data)
fftplan = Plan.three(CUFFT_C2C, *data.shape)
fftplan.forward(d_data, d_data)
fftplan.inverse(d_data, d_data)
d_data.copy_to_host(data)
result = data / n
np.allclose(orig, result.real)

Finally, it turns out to be False. And the difference between orig and result is not a small number, not negligible. I tried some other data sets (not random numbers), and get the some wrong results.

Also, I test without inverse FFT:

from pyculib.fft.binding import Plan, CUFFT_C2C
import numpy as np
from numba import cuda
from scipy.fftpack import fftn,ifftn
data = np.random.rand(26,256,256).astype(np.complex64)
orig = data.copy()
orig_fft = fftn(orig)
d_data = cuda.to_device(data)
fftplan = Plan.three(CUFFT_C2C, *data.shape)
fftplan.forward(d_data, d_data)
d_data.copy_to_host(data)
np.allclose(orig_fft, data)

The result is also wrong.

The test code on website, they use numpy.arange to create the matrix. And I tried:

n = 26*256*256
data = np.arange(n, dtype=np.complex64).reshape(26,256,256)

And the FFT result of this matrix is right.

Could anyone help to point out why?

talonmies
  • 70,661
  • 34
  • 192
  • 269
billinair
  • 93
  • 1
  • 11

1 Answers1

1

I don't use CUDA, but I think your problem is numerical in nature. The difference lies in the two data sets you are using. random.rand has dynamic range of 0-1, and arange 0-26*256*256. The FFT attempts to resolve spatial frequency components on the order of range of values / number of points. For arange, this becomes unity, and the FFT is numerically accurate. For rand, this is 1/26*256*256 ~ 5.8e-7.

Just running FFT/IFFT on your numpy arrays without using CUDA shows similar differences.

Jim Parker
  • 1,095
  • 11
  • 16
  • I agree, it could be some numerical problem. Something interesting is that I just tried FFTW3 today, doing the same thing. And pyfftw and numpy.fft turns out to be same with a numpy.rand input. I am wondering if I did something wrong with cufft. – billinair Dec 20 '17 at 03:43