
I've developed C code for a 3-dimensional FFT (MKL interface) to run natively on an Intel MIC platform.

The data elements are double-precision complex, for a complex-to-complex transform. I'm using a padded leading dimension, mkl_malloc() with 64-byte alignment, and radix-2 (power-of-two) dimensions for the array. The performance I end up with is around 50 Gflop/s.

I can't find performance listings anywhere for similar types of transforms. Can anyone tell me whether this is a reasonable result (one to be satisfied with) on Xeon Phi?

Adel Khayata

1 Answer


Your result looks ok.

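An FFT tuning guide provided by Intel shows a peak performance of 100 Gflop/s for 2-D single-precision (float) data on Xeon Phi. Double precision typically runs at roughly half the single-precision rate, so 50 Gflop/s on double-precision data should be reasonable.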
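To compare numbers like these, it helps to know how a Gflop/s figure for an FFT is usually derived. The sketch below assumes the common benchmarking convention of 5 N log2(N) nominal flops for a complex transform of N total points; whether Intel's guide uses exactly this convention is an assumption here. The function names are hypothetical, and the integer log2 relies on the power-of-two dimensions mentioned in the question.

```c
#include <assert.h>
#include <stddef.h>

/* log2 of a power-of-two size, computed with integers only. */
unsigned ilog2_pow2(size_t n)
{
    assert(n > 0 && (n & (n - 1)) == 0);   /* must be a power of two */
    unsigned k = 0;
    while (n > 1) { n >>= 1; ++k; }
    return k;
}

/* Nominal flop count for a complex 3-D FFT of n1 x n2 x n3 points,
   using the 5 * N * log2(N) convention (an assumption, see text). */
double fft3d_nominal_flops(size_t n1, size_t n2, size_t n3)
{
    double points = (double)n1 * (double)n2 * (double)n3;
    double log2_points =
        (double)(ilog2_pow2(n1) + ilog2_pow2(n2) + ilog2_pow2(n3));
    return 5.0 * points * log2_points;
}

/* Nominal rate in Gflop/s for one transform that took `seconds`. */
double fft3d_gflops(size_t n1, size_t n2, size_t n3, double seconds)
{
    assert(seconds > 0.0);
    return fft3d_nominal_flops(n1, n2, n3) / seconds * 1e-9;
}
```

As an illustration (the size is not taken from the question), a 256 x 256 x 256 transform counts as 5 * 2^24 * 24, or about 2.01e9 nominal flops, so 50 Gflop/s corresponds to roughly 40 ms per transform.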

Besides the factors mentioned in your question, other things such as the choice of leading dimensions and padding, the number of threads, and thread affinity also have a large effect on performance.
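As a rough illustration of the padding point, here is a minimal sketch of choosing a padded leading dimension. It assumes the goal is a row pitch that is a whole number of 64-byte cache lines but not a multiple of a large power of two; the 4096-byte threshold used in the example is illustrative and not a value taken from the Intel guides, and `padded_row_elems` is a hypothetical helper, not an MKL function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns a padded row length, in elements, such that
   the row pitch in bytes is a multiple of line_bytes (the cache line size)
   but not a multiple of avoid_bytes (a large power-of-two pitch to avoid). */
size_t padded_row_elems(size_t n, size_t elem_bytes,
                        size_t line_bytes, size_t avoid_bytes)
{
    assert(elem_bytes > 0 && line_bytes % elem_bytes == 0);
    assert(avoid_bytes > line_bytes && avoid_bytes % line_bytes == 0);

    size_t per_line = line_bytes / elem_bytes;                 /* elements per cache line */
    size_t padded = (n + per_line - 1) / per_line * per_line;  /* round up to whole lines */

    if ((padded * elem_bytes) % avoid_bytes == 0)
        padded += per_line;                                    /* add one extra cache line */

    return padded;
}
```

For 16-byte double-precision complex elements and a row of 256 points, `padded_row_elems(256, 16, 64, 4096)` gives 260 elements, a pitch of 4160 bytes instead of 4096. The padded value would then be used as the leading dimension when allocating the array and setting the transform's strides.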

You could refer to these Intel docs for more info.

Documentation list for MKL on Xeon Phi:

http://software.intel.com/en-us/articles/intel-mkl-on-the-intel-xeon-phi-coprocessors

Performance tips for using MKL on Xeon Phi:

http://software.intel.com/en-us/articles/performance-tips-of-using-intel-mkl-on-intel-xeon-phi-coprocessor

Tuning the DFT functions on Xeon Phi:

http://software.intel.com/en-us/articles/tuning-the-intel-mkl-dft-functions-performance-on-intel-xeon-phi-coprocessors

BenMorel
kangshiyin