Questions tagged [cufft]

cuFFT is an FFT library for CUDA-enabled GPUs, with capabilities similar to those of the FFTW library. It provides functions for various kinds of forward and inverse Fast Fourier Transforms, including multidimensional transforms and batched transforms.
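As a rough illustration of what "forward and inverse" and "batched" mean here, the sketch below runs a naive DFT over a small batch of signals in plain Python. It is a CPU stand-in, not cuFFT code: the helper names `dft` and `batched_dft` are made up for this example, and the O(n^2) sum replaces a real FFT. It does mirror one cuFFT behavior, namely that transforms are unnormalized, so a forward transform followed by an inverse returns the input scaled by the transform length.

```python
import cmath

def dft(x, inverse=False):
    """Naive unnormalized DFT of one sequence (O(n^2)); stands in for an FFT."""
    n = len(x)
    sign = 1 if inverse else -1  # forward uses exp(-i...), inverse uses exp(+i...)
    return [
        sum(x[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]

def batched_dft(batch, inverse=False):
    """Apply the same-length 1D transform to every row of a batch."""
    return [dft(row, inverse) for row in batch]

# Hypothetical input: a batch of two length-4 signals.
batch = [[1, 2, 3, 4], [0, 1, 0, -1]]

spectra = batched_dft(batch)                    # forward transforms
roundtrip = batched_dft(spectra, inverse=True)  # inverse, still unnormalized

# Divide by the transform length to recover the original samples.
n = len(batch[0])
recovered = [[(v / n).real for v in row] for row in roundtrip]
```

Forgetting the final division by `n` is a common reason a GPU result differs from a MATLAB or NumPy `ifft`, both of which normalize the inverse transform.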

146 questions
2
votes
3 answers

CUFFT - padding/initializing question

I am looking at the convolution FFT example (for large kernels) in the Nvidia SDK. I know the theory behind Fourier transforms and their FFT implementations (the basics at least), but I can't figure out what the following code does: const int …
Marco A.
  • 43,032
  • 26
  • 132
  • 246
2
votes
2 answers

the best way to conduct fft using GPU acceleration with cuda

In Python, what is the best way to run FFT using CUDA GPU computation? I am using pyfftw to accelerate the fftn, which is about 5x faster than numpy.fftn. I want to use pycuda to accelerate the FFT. I know there is a library called pyculib, but I always…
billinair
  • 93
  • 1
  • 11
2
votes
1 answer

How to: CUDA IFFT

In Matlab, when I enter a one-dimensional array of complex numbers, I get an output array of real numbers of the same size and dimension. I am trying to repeat this in CUDA C, but get different output. Can you please help? In Matlab, when I…
Talgat
  • 135
  • 11
2
votes
1 answer

Why does cuda-memcheck racecheck report errors with cufft?

The racecheck tool reported memory races with my application. I've isolated it to the CUFFT exec functions. Am I doing something wrong? If not, how can I make racecheck ignore this? Here is a minimal example that when run in cuda-memcheck --tool…
Mark Borgerding
  • 8,117
  • 4
  • 30
  • 51
2
votes
1 answer

cuFFT cannot recover after an error

I cannot find a way to start cuFFT processing after a previous unsuccessful launch. Here is a minimal example. The main idea is as follows: we create a simple cuFFT processor which can manage its resources (device memory and cuFFT plans). We check…
Grisha Kirilin
  • 282
  • 2
  • 12
2
votes
1 answer

cufftSetStream causes garbage output. Am I doing something wrong?

According to the docs, the cufftSetStream() function Associates a CUDA stream with a cuFFT plan. All kernel launches made during plan execution are now done through the associated stream [...until...] the stream is changed with another call to…
Mark Borgerding
  • 8,117
  • 4
  • 30
  • 51
2
votes
1 answer

How can I tell if cuda code is being compiled with relocatable device code?

In order to use CUFFT callbacks, one of the restrictions is that the code must be compiled with relocatable device code. When this condition is not met, bad things happen: silent failures, wrong answers, etc. I've got my current build…
Mark Borgerding
  • 8,117
  • 4
  • 30
  • 51
2
votes
2 answers

How to perform a Real to Complex Transformation with cuFFT

The following code has been adapted from here to apply to a single 1D transformation using cufftPlan1d. Ultimately I want to perform a batched in-place R2C transformation, but the code below performs a single transformation using a separate input and…
AlexS
  • 510
  • 2
  • 7
  • 23
2
votes
1 answer

CUFFT_INVALID_VALUE in cufftGetSize1d

What is the proper way to use cufftGetSize1d (or any of the cufftGetSize*) functions? I tried with: cufftHandle plan; size_t workSize; cufftResult result; cufftCreate(&plan); result = cufftGetSize1d(plan, 1000, CUFFT_C2C, 1, &workSize); However,…
user3452579
  • 413
  • 4
  • 14
2
votes
1 answer

CUFFT : How to calculate the fft when the input is a pitched array

I'm trying to find the fft of a dynamically allocated array. The input array is copied from host to device using cudaMemcpy2D. Then the fft is taken (cufftExecR2C) and the results are copied back from device to host. So my initial problem was how…
Optimus
  • 415
  • 4
  • 19
2
votes
2 answers

On plans reuse in cuFFT

This may seem like a simple question but cufft usage is not very clear to me. My question is: which one of the following implementations is correct? 1) // called in a loop cufftPlan3d (plan1, x, y, z) ; cufftexec (plan1, data1) ; cufftexec…
Sagar Masuti
  • 1,271
  • 2
  • 11
  • 30
2
votes
1 answer

running FFTW on GPU vs using CUFFT

I have a basic C++ FFTW implementation that looks like this: for (int i = 0; i < N; i++){ // declare pointers and plan fftw_complex *in, *out; fftw_plan p; // allocate in = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)…
tir38
  • 9,810
  • 10
  • 64
  • 107
1
vote
1 answer

torch fft with a GPU is much slower than fft with CPU

I'm running the following simple code on a strong server with a bunch of Nvidia RTX A5000/6000 with Cuda 11.8. For some reason, FFT with the GPU is much slower than with the CPU (200-800 times). Does anyone have an idea of why that might be? I tried…
MRm
  • 517
  • 2
  • 14
1
vote
0 answers

Fourier transform with cuFFT, are complex to complex more efficient?

I'm writing a code that integrates a PDE in time in Fourier space, and I'm doing so in CUDA/C++. There is one real valued array I need to evolve in time. I've written the code in two different ways, but following the exact same logic. In one version…
MyUserIsThis
  • 417
  • 1
  • 4
  • 17
1
vote
0 answers

2D FFT convolution or 1D?

For my research, I have a lot of different images A, which I want to convolve with kernel B as fast as possible. The images are (M x N) and the kernel is (M x P). In the normal convolution (which I have implemented right now) I slide them over each…
Taliebram
  • 91
  • 6