CUDA FFT plan reuse across multiple 'overlapped' CUDA Stream launches

Question

I'm trying to improve the performance of my code by overlapping asynchronous memory transfers with GPU computation.

Formerly my code created an FFT plan once and then used it many times. In that situation the time spent creating the CUDA FFT plan is negligible because it is amortized over many executions, although according to this earlier post plan creation on its own can be quite expensive.

Now that I am moving to streams, I create the "same" plan "multiple times", once per stream, and then set the CUDA FFT stream on it. According to the answers given by some of you in this other post, this is wasteful. But is there any other way to do it?

NOTE: I'm acquiring the data in real time, so launching a "batch" CUDA FFT is out of the question. What I'm doing is to create and launch a new CUDA stream as a result of a complete pulse transmission.

NOTE 2: I was also considering using a "pool" of "CUDA Streams/FFT Plans" instead, but I don't think that would be an elegant, sensible solution. Any thoughts?

Otherwise, is there a way to "copy" an "existing" FFT plan before I assign the CUDA stream?

Thanks guys!/gals? Hopefully meet some of you in San Jose. =)

Omar

Robert Crovella · Accepted Answer · 2015-03-04T17:58:13.140


What I'm doing is to create and launch a new CUDA stream as a result of a complete pulse transmission.

Re-use the streams, rather than creating a new stream each time. Then you can re-use the plan created for that stream ahead of time, and you have no need to recreate the "same" plan on-the-fly.

Perhaps this is what you mean by the pool of streams method. Your criticism is that it is not "elegant" or "sensible". I have no idea what that means. Stream re-use in pipelined algorithms is a common tactic, if for no other reason than to avoid the cudaStreamCreate overhead (whatever it may be, large or small).

A cuFFT plan has a stream associated with it (set with cufftSetStream). You cannot copy a plan without the stream association. A plan is an opaque container.
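For illustration, the stream and plan re-use described above might look roughly like the following. This is a minimal sketch, not the asker's code: the pool size, FFT length, the `Slot` struct, and the `runPipeline` function are all hypothetical names and values, and a unit impulse stands in for the real pulse data. Each stream gets its own plan once, up front, and incoming pulses are assigned to slots round-robin.

```cuda
#include <cmath>
#include <cstdio>
#include <cuda_runtime.h>
#include <cufft.h>

// Hypothetical sizes, chosen only for illustration.
constexpr int kSlots   = 4;     // stream/plan pairs kept in the pool
constexpr int kFftSize = 1024;  // complex samples per pulse

// One pool entry: a stream, the plan bound to it, and its buffers.
struct Slot {
    cudaStream_t  stream;
    cufftHandle   plan;
    cufftComplex *hostBuf;  // pinned, so cudaMemcpyAsync can overlap with compute
    cufftComplex *devBuf;
    bool          busy;     // true once a pulse has been queued on this slot
};

// Returns true when every CUDA/cuFFT call succeeded and every result matched.
bool runPipeline(int numPulses) {
    const size_t bytes = kFftSize * sizeof(cufftComplex);
    Slot pool[kSlots];
    bool ok = true;

    // One-time setup: create each stream and bind one plan to it.
    for (Slot &s : pool) {
        ok &= cudaStreamCreate(&s.stream) == cudaSuccess;
        ok &= cufftPlan1d(&s.plan, kFftSize, CUFFT_C2C, 1) == CUFFT_SUCCESS;
        ok &= cufftSetStream(s.plan, s.stream) == CUFFT_SUCCESS;
        ok &= cudaMallocHost(reinterpret_cast<void **>(&s.hostBuf), bytes) == cudaSuccess;
        ok &= cudaMalloc(reinterpret_cast<void **>(&s.devBuf), bytes) == cudaSuccess;
        s.busy = false;
    }
    if (!ok) return false;  // sketch only: no cleanup on setup failure

    // Waits for a slot's queued work and checks the result of its last pulse.
    auto drain = [&](Slot &s) {
        if (!s.busy) return;
        ok &= cudaStreamSynchronize(s.stream) == cudaSuccess;
        // The FFT of a unit impulse at index 0 is 1+0i in every bin.
        for (int k = 0; k < kFftSize; ++k)
            ok &= std::fabs(s.hostBuf[k].x - 1.0f) < 1e-4f &&
                  std::fabs(s.hostBuf[k].y) < 1e-4f;
        s.busy = false;
    };

    // Per-pulse work: no stream or plan is created here, only re-used.
    for (int p = 0; p < numPulses; ++p) {
        Slot &s = pool[p % kSlots];
        drain(s);  // the slot must be idle before its buffers are overwritten

        // Stand-in for real acquisition: a unit impulse.
        for (int k = 0; k < kFftSize; ++k) {
            s.hostBuf[k].x = (k == 0) ? 1.0f : 0.0f;
            s.hostBuf[k].y = 0.0f;
        }
        ok &= cudaMemcpyAsync(s.devBuf, s.hostBuf, bytes,
                              cudaMemcpyHostToDevice, s.stream) == cudaSuccess;
        // The plan already carries its stream, so the FFT is queued on s.stream.
        ok &= cufftExecC2C(s.plan, s.devBuf, s.devBuf, CUFFT_FORWARD) == CUFFT_SUCCESS;
        ok &= cudaMemcpyAsync(s.hostBuf, s.devBuf, bytes,
                              cudaMemcpyDeviceToHost, s.stream) == cudaSuccess;
        s.busy = true;
    }

    // Finish outstanding work, then release everything once.
    for (Slot &s : pool) {
        drain(s);
        cufftDestroy(s.plan);
        cudaFree(s.devBuf);
        cudaFreeHost(s.hostBuf);
        cudaStreamDestroy(s.stream);
    }
    return ok;
}

int main() {
    const bool ok = runPipeline(16);
    std::printf("pipeline %s\n", ok ? "ok" : "failed");
    return ok ? 0 : 1;
}
```

Something like `nvcc pool.cu -lcufft` should build it (the file name is arbitrary). The pool size is a tuning knob: it only needs to be large enough that a slot has finished its previous pulse by the time the round-robin returns to it, otherwise the `cudaStreamSynchronize` call in `drain` will stall the acquisition loop.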

edited Mar 04 '15 at 17:58

answered Mar 04 '15 at 17:34


Hello Robert, thanks! And yes, you're right. I was just being lazy. Lazy programmers like me often think long lines of code are not "elegant". I was also wondering why the plan holds the stream as an attribute. I have seen that this is handled differently with Thrust transforms, where the stream is passed as a parameter together with the functor (execution plan). It would be cool to have a tighter integration of Thrust and CUDA FFT, don't you think? – Omar Valerio Mar 05 '15 at 08:07
