CUDA convolutionFFT2D example - I can't understand it

Question

I studied the Cooley-Tukey algorithm and understood it. I followed everything in the CUDA convolutionFFT2D example up to these kernels:

spProcess2D calls spProcess2D_kernel, which in turn makes many calls to spPostprocessC2C, mulAndScale and spPreprocessC2C.

Here's the complete code: http://nopaste.info/30c13e44fe.html (convolutionFFT2D.cu, here is the spProcess2D function) http://nopaste.info/78d22afac2.html (convolutionFFT2D.cuh, here are the other functions)

I have already read all the NVIDIA SDK papers, but I still can't figure out what these functions do (they use twiddles, but nothing in them looks like a Cooley-Tukey algorithm).

Please help me if you can, or at least point me to where I can find the answer.

Update: I found this link: http://cnx.org/content/m16336/latest/#uid38 Maybe these functions are performing a breadth-first algorithm? I can't say for sure yet, but the shape seems the same.

I have no idea what the spProcess2D, spPostprocessC2C and spPreprocessC2C functions do. I welcome suggestions or opinions too, since I can't figure anything out – Marco A. May 17 '11 at 07:29

score 1 · Accepted Answer · answered May 24 '11 at 20:41

It looks like the algorithm is doing something similar to the algorithm mentioned here. The preprocess step appears to repack the real input of size N (after padding) as complex input of size N/2. The postprocess step reorders the data to recover the FFT of the original input array.
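To make the packing idea concrete, here is a minimal host-side C++ sketch of the general "real FFT of length N via a complex FFT of length N/2" trick. It is not the SDK code: the names `dft` and `realDftViaHalfSize` are hypothetical, the naive O(M^2) `dft` only stands in for the complex-to-complex FFT, and the "preprocess"/"postprocess" labels in the comments are my loose mapping onto the description above.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <stdexcept>
#include <vector>

using cd = std::complex<double>;

// Naive O(M^2) DFT; a stand-in for the complex-to-complex FFT (assumption).
std::vector<cd> dft(const std::vector<cd>& in) {
    const std::size_t M = in.size();
    const double pi = std::acos(-1.0);
    std::vector<cd> out(M);
    for (std::size_t k = 0; k < M; ++k) {
        cd sum(0.0, 0.0);
        for (std::size_t n = 0; n < M; ++n)
            sum += in[n] * std::polar(1.0, -2.0 * pi * double(k * n) / double(M));
        out[k] = sum;
    }
    return out;
}

// Returns bins X[0..N/2] of the DFT of a real signal x of even length N,
// using only one complex DFT of length N/2.
std::vector<cd> realDftViaHalfSize(const std::vector<double>& x) {
    const std::size_t N = x.size();
    if (N < 2 || N % 2 != 0)
        throw std::invalid_argument("length must be even and at least 2");
    const std::size_t M = N / 2;
    const double pi = std::acos(-1.0);

    // "Preprocess": pack even samples into the real part and odd samples
    // into the imaginary part, z[n] = x[2n] + i * x[2n+1].
    std::vector<cd> z(M);
    for (std::size_t n = 0; n < M; ++n)
        z[n] = cd(x[2 * n], x[2 * n + 1]);

    const std::vector<cd> Z = dft(z);

    // "Postprocess": separate the spectra of the even and odd samples using
    // Hermitian symmetry, then combine them with one twiddle factor per bin.
    std::vector<cd> X(M + 1);
    for (std::size_t k = 0; k < M; ++k) {
        const cd Zk = Z[k];
        const cd Zc = std::conj(Z[(M - k) % M]);
        const cd E = 0.5 * (Zk + Zc);              // DFT of even samples
        const cd O = cd(0.0, -0.5) * (Zk - Zc);    // DFT of odd samples
        X[k] = E + std::polar(1.0, -2.0 * pi * double(k) / double(N)) * O;
    }
    // Nyquist bin: X[N/2] = E[0] - O[0], both of which are real.
    X[M] = cd(Z[0].real() - Z[0].imag(), 0.0);
    return X;
}
```

For a real input of length 8 this returns the five bins X[0..4]; the remaining bins follow from X[N-k] = conj(X[k]). The twiddle factor in the last step would explain why twiddles show up in such routines even though no Cooley-Tukey butterfly stage is being run there.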

Pavan Yalamanchili


Thank you so much, you're right, and it seems the algorithm is doing exactly what you've linked! Thank you again! – Marco A. May 25 '11 at 17:17

You may want to save the information somewhere. I had to use the link as the source because I could not find anything else that explained it in such detail. I found the link in my company's forums :) – Pavan Yalamanchili May 25 '11 at 23:04
Thank you, I'll save the information locally. Thank you again! – Marco A. May 26 '11 at 05:46

score 0 · Answer 2 · answered May 17 '11 at 08:16

spPostprocessC2C looks like a single FFT butterfly. The complexity in the calling routines just comes from fitting the FFT algorithm into a SIMT model for CUDA.

Perhaps if you explained what it is that you are trying to achieve (beyond just understanding how this particular FFT implementation works) then you might get some more specific answers.

Paul R


Thank you for your answer. The problem is that I am using this code for a thesis and I don't want to end up with my prof asking me, "What does this code do?" I need to be able to back myself up in case luck isn't with me. So I studied the Cooley-Tukey algorithm and the twiddle tricks that improve performance, and now I'm trying to understand this code, but I can't find those concepts in these routines – Marco A. May 17 '11 at 08:29
@Paul: unless your thesis is about FFT implementations I wouldn't have thought that this would matter - it's just a "black box" library routine that you're using to perform some task related to your research. – Paul R May 17 '11 at 08:32
That's what I thought too, but my prof doesn't think so. He's in charge :) – Marco A. May 17 '11 at 08:45
Beware: that's not an FFT butterfly. The FFT is already completely done by the time the spxxx functions are called. They seem to be signal pre- and post-processing before and after the point-wise multiplication. I don't see the point of such pre- and post-processing, though. – Marco A. May 18 '11 at 21:09
