Highest Voted 'cufft' Questions

1

vote

1 answer

CUDA cufft library 2D FFT only the left half plane correct

I am doing 2D FFT on 128 images of size 128 x 128 using CUFFT library. The way I used the library is the following: unsigned int nx = 128; unsigned int ny = 128; unsigned int nz = 128; // Make 2D fft batch plan int n[2] = {nx, ny}; int inembed[] =…

cuda fft cufft

asked Apr 10 '16 at 06:12

Da Teng

551
4
21

1

vote

1 answer

Strategy - CUFFT computing 2D FFT on many images

I am using CUFFT for 2D FFT on 128 images. Each image is 128 x 128. In MATLAB, one 2D FFT takes 0.3 ms, and running the FFT on all 128 images takes roughly 128 times as long. Using CUFFT, the execution of the…

image matlab cuda cufft

asked Apr 06 '16 at 01:38

Da Teng

551
4
21

1

vote

0 answers

How do I fix an argument error in an fft function that uses skcuda.cufft?

I want to make a Python-wrapped GPU FFT function that can compute the transforms of arbitrarily sized inputs using scikits-cuda.cufft. (I tried PyFFT, which only accepts power-of-2 sizes.) I modeled my skcuda.cufft code on the CUDA code: __host__…

python fft cufft

asked Feb 16 '16 at 11:23

Joshua Santiago

21
1
4

1

vote

1 answer

Applying cuFFT to OpenGL Vertex Buffer Objects

So the cufftComplex type is an array with n structs with an x and a y-field, respectively representing the real and the imaginary parts of each complex number. On the other hand, if I want to create a vertex buffer object in OpenGL with an x- and…

opengl cuda glsl cufft

asked Feb 13 '16 at 20:50

Jan M.

489
2
5
21

1

vote

1 answer

Why cuFFT is "slow" on K40?

I've compared a simple 3D cuFFT program on both a GTX 780 and a Tesla K40 in double precision mode. On the GTX 780 I measured about 85 Gflops, while on the K40 I measured about 160 Gflops. These results baffled me: the GTX 780 has 166 Gflops of peak…

cuda fft cufft

asked Dec 16 '15 at 10:49

JohnWil

43
4

1

vote

1 answer

cuFFT wrong results only when starting from complex

I was helped before in this answer to realise an in-place transform and it works well but ONLY if I start with real data. If I start with complex data, the results after IFT+FFT are wrong, and this happens only in the in-place version, I have…

cuda cufft

asked Nov 06 '15 at 15:34

JohnWil

43
4

1

vote

1 answer

Wrong results cufft 3D in-place

I'm writing because I'm having problems with the cuFFT 3D in-place transform, while the out-of-place version works fine. I tried to follow Robert Crovella's answer here, but I'm not getting the correct results when I perform an FFT+IFT. This is…

cuda cufft

asked Oct 27 '15 at 13:59

JohnWil

43
4

1

vote

1 answer

Why cufftPlanMany() takes too long?

The first call to cufftPlanMany() takes about 0.7 sec, but all subsequent calls are fast. Any idea how to speed up the first call to cufftPlanMany()?

cuda gpu cufft

asked Sep 18 '15 at 21:40

Maghraby

11
3

1

vote

1 answer

How to view CUDA library function calls in profiler?

I am using the cuFFT library. How do I modify my code to see the function calls from this library (or any other CUDA library) in the NVIDIA Visual Profiler NVVP? I am using Windows and Visual Studio 2013. Below is my code. I convert my image and…

cuda cufft nvvp

asked Jul 13 '15 at 15:48

user8919

67
2
9

1

vote

1 answer

CUFFT is 1000x slower in VS2013/Cuda7.0 compared to VS2010/Cuda4.2

This simple CUFFT code was run in two IDEs: VS 2013 with Cuda 7.0 and VS 2010 with Cuda 4.2. I found that VS 2013 with Cuda 7.0 was approximately 1000 times slower. The code executed in 0.6 ms in VS 2010, and took 520 ms in VS 2013, both on an…

c++ visual-studio-2010 visual-studio-2013 cuda cufft

asked Jun 23 '15 at 20:32

The Vivandiere

3,059
3
28
50

1

vote

1 answer

CUDA cuFFT Undefined symbols for architecture x86_64

I'm trying to use the cuFFT library, but when I compile my project I get the error: Undefined symbols for architecture x86_64: "_cufftDestroy" ... "_cufftExecC2C" ... "_cufftPlan1d" ... ld: symbol(s) not found for architecture x86_64 clang: error:…

c++ c macos cuda cufft

asked Jun 12 '15 at 10:39

mary

305
3
12

1

vote

1 answer

CUDA FFT plan reuse across multiple 'overlapped' CUDA Stream launches

I'm trying to improve the performance of my code using asynchronous memory transfers overlapped with GPU computation. Formerly I had code where I created an FFT plan and then used it multiple times. In that situation the time invested…

cuda cufft cuda-streams

asked Mar 04 '15 at 13:33

Omar Valerio

43
7

1

vote

0 answers

Compute several FFT with GPU using Python multiprocessing and pyfft: how to avoid GPU memory leak?

I am trying to implement in Python the following pattern for multi-CPU and single-GPU computation using pycuda and pyfft packages. I would like to have several processes (e.g. launched with multiprocessing.Pool()), with each of them able to perform…

python fft python-multiprocessing pycuda cufft

asked Jan 21 '15 at 12:24

mtazzari

451
1
5
14

1

vote

1 answer

How can I get the full fft coefficients by cufft?

I am performing a two-dimensional FFT with cuFFT. The transform is real-to-complex, so the output array has size NX * (NY / 2 + 1), which is non-redundant. But I need the full set of coefficients, including the redundant ones. How can I get them all? Thanks…

cuda cufft

asked Nov 11 '14 at 08:02

Wang Wang

115
1
9
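
For readers of the question above: the redundant half of a real-input spectrum follows from Hermitian symmetry, X[kx][ky] = conj(X[(NX - kx) % NX][(NY - ky) % NY]). The following is a minimal host-side sketch in plain C, not code from the question or its answer. The names `complex_f` and `expand_r2c_2d` are hypothetical, `complex_f` is only a stand-in for `cufftComplex`, and a row-major NX * (NY / 2 + 1) layout is assumed.

```c
#include <stddef.h>

/* Hypothetical stand-in for cufftComplex: x = real part, y = imaginary part. */
typedef struct { float x; float y; } complex_f;

/* Expand a 2D real-to-complex result of shape nx * (ny/2 + 1), assumed
 * row-major, into the full nx * ny spectrum using Hermitian symmetry:
 *   X[kx][ky] = conj(X[(nx - kx) % nx][(ny - ky) % ny]).               */
void expand_r2c_2d(const complex_f *half, complex_f *full,
                   size_t nx, size_t ny)
{
    size_t nyh = ny / 2 + 1;  /* number of columns stored by the R2C transform */

    for (size_t kx = 0; kx < nx; ++kx) {
        for (size_t ky = 0; ky < ny; ++ky) {
            if (ky < nyh) {
                /* Stored coefficient: copy it unchanged. */
                full[kx * ny + ky] = half[kx * nyh + ky];
            } else {
                /* Redundant coefficient: conjugate of the mirrored entry. */
                size_t mx = (nx - kx) % nx;
                size_t my = ny - ky;          /* always in 1 .. nyh - 1 here */
                complex_f c = half[mx * nyh + my];
                full[kx * ny + ky].x = c.x;
                full[kx * ny + ky].y = -c.y;
            }
        }
    }
}
```

On the GPU the same index mapping could run in a small kernel with one thread per output element; the host loop is used here only to keep the sketch self-contained.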

1

vote

1 answer

Batched FFTs using cufftPlanMany

I want to perform 441 2D, 32-by-32 FFTs using the batched method provided by the cuFFT library. The parameters of the transform are the following: int n[2] = {32,32}; int inembed[] = {32,32}; int onembed[] =…

cuda batch-processing cufft

asked Apr 09 '14 at 05:03

Teller

175
2
9
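
As background to the question above: in cuFFT's advanced data layout, element (x, y) of batch b in a 2D transform is read from input[b * idist + (x * inembed[1] + y) * istride]. The sketch below only spells out that index arithmetic in plain C. The function name `batched_2d_offset` is hypothetical, and the values istride = 1 and idist = 32 * 32 used in the note are assumptions for tightly packed data, since the question's own parameters are truncated in the excerpt.

```c
#include <stddef.h>

/* Offset of element (x, y) of batch b for a 2D batched transform, following
 * the advanced-layout formula from the cuFFT documentation:
 *   b * idist + (x * inembed[1] + y) * istride
 * inembed1 is inembed[1], the padded length of the fastest-varying dimension. */
size_t batched_2d_offset(size_t b, size_t x, size_t y,
                         size_t inembed1, size_t istride, size_t idist)
{
    return b * idist + (x * inembed1 + y) * istride;
}
```

With the assumed packed layout, `batched_2d_offset(1, 0, 0, 32, 1, 32 * 32)` is 1024, so the second 32-by-32 image starts right after the first.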
