I'm trying to port some code that includes FFTs from CPU to GPU. In the CPU code, a complex array is transformed with fftw_plan_many_r2r,
applied separately to its real and imaginary parts. The function foo below is the R2R transform routine; bar calls it twice, once for each part of the complex array.
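// In-place 2D DST-I (FFTW_RODFT00 along both axes) on each of the nz slices of evg.
void foo(vector_double &evg) {
    const int nx = Dims[0], ny = Dims[1], nz = Dims[2];
    const int nxny[] = {ny, nx};  // transform size, slowest-varying dimension first
    const int n = nx*ny*nz;
    const fftw_r2r_kind kinds[] = {FFTW_RODFT00, FFTW_RODFT00};
    if (evg.size() != n)
        throw std::runtime_error("*** weird size of evg");
    // nz batched 2D transforms: stride 1 within a slice, nx*ny between slices
    fftw_plan p = fftw_plan_many_r2r(2, nxny, nz,
                                     &evg[0], nxny, 1, nx*ny,
                                     &evg[0], nxny, 1, nx*ny,
                                     kinds, FFTW_ESTIMATE);
    if (!p)
        throw std::runtime_error("*** fftw_plan_many_r2r failed");
    // actual FFT
    fftw_execute(p);
    fftw_destroy_plan(p);  // release the plan so repeated calls do not leak
}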
void bar(vector_complex &evg) {
    vector_double tmp;
    tmp = evg.real();
    foo(tmp);
    evg.real() = tmp;
    tmp = evg.imag();
    foo(tmp);
    evg.imag() = tmp;
}
So, how can I get the same results with CUDA, given that cuFFT has no direct equivalent of FFTW's R2R transforms? P.S. vector_double and vector_complex are Eigen vectors, if that helps.
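For reference, here is one identity that might be a starting point. It is a sketch under assumptions, not a tested cuFFT port. By FFTW's definition, RODFT00 (DST-I) of n points is Y[k] = 2 * sum_j X[j] * sin(pi*(j+1)*(k+1)/(n+1)), which equals minus the imaginary part of an ordinary DFT of length 2*(n+1) applied to the odd extension of the input. The snippet below checks this in 1D on the CPU with a naive O(N^2) DFT. The helper names dst1_direct and dst1_via_dft are made up for illustration; on the GPU the naive DFT loop would presumably be replaced by a cuFFT transform of the extended array.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper: unnormalized DST-I straight from FFTW's RODFT00 formula,
// Y[k] = 2 * sum_j X[j] * sin(pi*(j+1)*(k+1)/(n+1)).
std::vector<double> dst1_direct(const std::vector<double> &x) {
    assert(!x.empty());
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    std::vector<double> y(n, 0.0);
    for (std::size_t k = 0; k < n; ++k)
        for (std::size_t j = 0; j < n; ++j)
            y[k] += 2.0 * x[j] * std::sin(pi * (j + 1) * (k + 1) / (n + 1));
    return y;
}

// Hypothetical helper: the same transform through a plain complex DFT.
// The input is odd-extended to length N = 2*(n+1):
//   ext = [0, X0, ..., X(n-1), 0, -X(n-1), ..., -X0]
// and then Y[k] = -Im(DFT(ext)[k+1]).
std::vector<double> dst1_via_dft(const std::vector<double> &x) {
    assert(!x.empty());
    const std::size_t n = x.size();
    const std::size_t N = 2 * (n + 1);
    const double pi = std::acos(-1.0);

    std::vector<double> ext(N, 0.0);
    for (std::size_t j = 0; j < n; ++j) {
        ext[j + 1] = x[j];       // first half: the data itself
        ext[N - 1 - j] = -x[j];  // second half: mirrored and negated
    }

    std::vector<double> y(n, 0.0);
    for (std::size_t k = 0; k < n; ++k) {
        // Naive DFT bin k+1; this loop stands in for an FFT library call.
        std::complex<double> z(0.0, 0.0);
        for (std::size_t m = 0; m < N; ++m) {
            const double angle = -2.0 * pi * double(m) * double(k + 1) / double(N);
            z += ext[m] * std::polar(1.0, angle);
        }
        y[k] = -z.imag();
    }
    return y;
}
```

Since a multi-dimensional R2R transform is separable, the 2D case in foo should correspond to applying this 1D transform along x and then along y for each of the nz slices. The extended array is roughly twice as long per transformed axis, so this trades memory for the ability to use a standard FFT.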