I have a loop that contains an FFT and an inverse FFT. Because the vector I transform in iteration k+1 depends on the data from iteration k, I cannot use a `gfor` loop to parallelize the program. Instead, I wanted to speed up the FFTs inside the loop, so I switched from MATLAB (which uses the FFTW library) to ArrayFire. However, it is not faster; it is actually slightly slower.

P.S.: I time the ArrayFire code with the `timeit` function. However, the elapsed time it returns is much shorter than the time I observe while watching the command prompt.

Does anyone know an explanation for these two problems? Thank you.

Zoltán Csáti
  • Which backend and version of ArrayFire are you using? Also, did you build it from scratch or download the installer from the website? – shehzan Jun 10 '15 at 15:00
  • I am using the CPU backend with the downloaded installer, so that I can compare it with my MATLAB code, which runs on my CPU. Also, how can it be that any benchmark using the `timeit` function reports a different (lower) execution time than the one I experience while waiting for the result to be printed? – Zoltán Csáti Jun 15 '15 at 06:02
  • `timeit` runs the given function many times and then takes the average; it is a much more accurate way to benchmark ArrayFire. Operations like memory allocation for input/output are not a part of the function and must not be counted. Running `timeit` eliminates this overhead because the memory manager can reuse the previously allocated memory for subsequent runs. The time from `timeit` is purely the function time. You can replicate this by running the function in a for-loop and then taking the average of the time. – shehzan Jun 16 '15 at 14:50
  • All right, I understand the essence of `timeit` now. But is there a way to speed up the calculations (i.e. I have a dependent loop in which I calculate `FFT`s and `iFFT`s)? I tried calling the `eval` function at the end of each iteration, as I saw in your `fractal` example, but it didn't help. Perhaps I should somehow decrease the overhead. – Zoltán Csáti Jun 17 '15 at 06:05
  • I don't quite understand what you mean by "speed-up the calculation" or the loop that you are using. The eval() function is essentially a blocker; it waits for all execution before that point to finish. – shehzan Jun 18 '15 at 15:45
  • Thanks for explaining the meaning of `eval()`. In the meantime I realised that I cannot make the loop iterations independent, so I must use the `for` loop instead of `gfor`. What consequences does that have? Does it mean that there is data transfer between the CPU and the GPU in every iteration? – Zoltán Csáti Jun 19 '15 at 06:00
  • If you do not ask for the data to be copied to the host, i.e. by using the host function, the data is always on the device. ArrayFire does not copy data between host and device unless explicit APIs are used. If you wish to talk in more detail about how ArrayFire works, I suggest you use the user forum https://groups.google.com/forum/#!forum/arrayfire-users – shehzan Jun 20 '15 at 16:40
  • "ArrayFire does not copy data between host and device unless explicit APIs are used." I was enlightened. And also thanks for the forum link. – Zoltán Csáti Jun 22 '15 at 08:17
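
To make the for-loop averaging described in the comments concrete, here is a minimal C++ sketch that uses only the standard library. It is not how `timeit` is implemented, and `average_seconds` is a hypothetical helper, not an ArrayFire function. It assumes that the callable returns only after its work has actually finished; with a library that evaluates lazily, the callable would have to force completion itself. The single warm-up call stands in for the one-time costs (such as the first memory allocations) that the averaged figure is meant to exclude.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>

// Hypothetical helper (not part of ArrayFire): returns the average
// wall-clock time in seconds of one call to fn, measured over `runs` calls.
double average_seconds(const std::function<void()>& fn, std::size_t runs) {
    assert(static_cast<bool>(fn));  // fn must hold a callable
    if (runs == 0) {
        return 0.0;  // nothing to measure
    }

    // Warm-up call, excluded from the measurement. It pays one-time costs
    // so that the timed calls below see a steady state.
    fn();

    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < runs; ++i) {
        fn();
    }
    const auto stop = std::chrono::steady_clock::now();

    // Total elapsed seconds divided by the number of timed calls.
    const std::chrono::duration<double> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(runs);
}
```

A call such as `average_seconds([&] { run_one_iteration(); }, 100)`, where `run_one_iteration` is a placeholder for one FFT/inverse-FFT step, would give a per-call figure. That figure is expected to be lower than the time of a single cold run observed at the command prompt, because the cold run also includes the one-time costs.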
