CUDA cufft library 2D FFT only the left half plane correct

Question

I am doing 2D FFT on 128 images of size 128 x 128 using CUFFT library. The way I used the library is the following:

unsigned int nx = 128; unsigned int ny = 128; unsigned int nz = 128;
// Make 2D fft batch plan
int n[2] = {nx, ny};
int inembed[] = {nx, ny};
int onembed[] = {nx, ny};

cufftPlanMany(&plan,
            2, // rank
            n, // dimension
            inembed,
            1, // istride
            nx * ny, // idist
            onembed,
            1, //ostride
            nx * ny, // odist
            CUFFT_D2Z,
            nz);
cufftSetCompatibilityMode(plan,CUFFT_COMPATIBILITY_NATIVE)

// Create output array
complex<double>* out_complex = new complex<double>[nx * ny * nz];
// Initialize output array
for (unsigned int i = 0; i < nx * ny * nz; i++) {
   out_complex[i].real(0);
   out_complex[i].imag(0);
}
cudaMalloc( (void**)&idata, sizeof(cufftDoubleReal) * nx * ny * nz );
cudaMalloc( (void**)&odata, sizeof(cufftDoubleComplex) * nx * ny * nz );
cudaMemcpy( idata, in_real, nx * ny  * nz * sizeof(cufftDoubleReal), 
                                  cudaMemcpyHostToDevice )  );
cudaMemcpy( odata, out_complex, nx * ny * nz *  sizeof(cufftDoubleComplex), 
                                  cudaMemcpyHostToDevice )  );

cufftExecD2Z( plan, idata, odata );

cudaMemcpy( out_complex, odata, nx * ny * nz * sizeof(cufftDoubleComplex),
                                  cudaMemcpyDeviceToHost ) );

The input in_real on the host is a big array holding the 3D images, which is a double array. I guess there should be no problem converting to/from double from/to cufftDoubleReal and complex from/to cufftDoubleComplex? I am a little suspicious about the way the plan was made and the parameters, which I tried to find some example online but they are not that helpful nor consistent. Then I just set the parameters via the programming guide using my own understanding.

As indicate by the title, the output is partially correct (the left half plane), with the right half plane zeros, which makes me so confused. I tried to set different types of compatibility mode but it was not that helpful. The version that I am comparing to is the MATLAB fft2().

talonmies · Accepted Answer · 2016-04-10T17:46:16.960

4

You need to (re)read the documentation for real to complex transforms. Quoting:

In many practical applications the input vector is real-valued. It can be easily shown that in this case the output satisfies Hermitian symmetry ( X k = X N − k ∗ , where the star denotes complex conjugation). The converse is also true: for complex-Hermitian input the inverse transform will be purely real-valued. cuFFT takes advantage of this redundancy and works only on the first half of the Hermitian vector

i.e. the output of a real to complex transform is symmetrical, and cuFFT exploits this by not calculating the redundant (symmetrical) coefficients. Thus, it is normal to only get "half" the output of transform, because the other "half" is identical. This is not unique to cuFFT, FFTW and most other high performance FFT libraries work this way for real to complex transforms and complex to real inverse transforms.

edited Apr 10 '16 at 17:46

answered Apr 10 '16 at 10:24

talonmies

70,661
34
192
269

Thank you @talonmies. I guess then I have a question not really about CUDA then. Suppose I transformed im1 and get im1_fft, and I have another image in Fourier domain, not symmetric. Then I dot product this two images and get a product image, which has half of it to be zeros due to the input. Then how will CUDA IFFT retain the half plane pixels information that have been zero out (not symmetric in this case)? Or do I, in this case, use full Z2Z transform for both images in CUDA? That sounds like CUDA Z2Z and D2Z operation doesn't really compatible with each other? – Da Teng Apr 11 '16 at 06:21
Yes, if you have an arbitrary array in the frequency domain then you should use C2C (or Z2Z for double precision) transforms; the C2R and R2C transforms assume Hermitian symmetry in the frequency domain, so if you pass in an input that lacks that symmetry then you'll get erroneous results – Michael Jan 25 '19 at 00:44

CUDA cufft library 2D FFT only the left half plane correct

1 Answers1