Is it possible to use Xeon Phi by just launching many threads, or is a special type of programming required?
2 Answers
Intel have some fairly good math libraries, IPP / MKL. Reading between the lines of what Xeon Phi seems to be, I imagine that Intel have a version of those libraries that exploits the very wide SIMD unit that appears to have become part of the architecture.
Intel's compiler will also put in multiple threads to execute for loops in parallel instead of in sequence. That would be one way of exploiting the large number of cores that Phi seems to have.
So it could be that, with the right compiler and libraries, programming for Phi is fairly normal, until you start needing routines that the libraries haven't got.
-
Do you have any sources that using ICC will automatically use multiple threads on Xeon Phi, even when not using parallel directives (like the ones I've mentioned in my answer)? I know it will vectorize (use the wide SIMD), but as far as I know it won't use multiple threads. Will be happy to change my vote if you can show it does :) – Oak Jun 27 '13 at 06:34
-
@Oak: http://software.intel.com/en-us/articles/automatic-parallelization-with-intel-compilers. It would be astonishing if, having developed this a long time ago, it didn't work on Phi also. The Sun C/C++ compilers on Solaris do the same trick (if you set the right command line options), and they're free. – bazza Jun 27 '13 at 17:58
-
Intel started doing this in 2010; Sun, I'm pretty sure, had it back in the 1990s! When looking for performance, don't neglect to search for a decent compiler. GCC is not the last word in compilers. – bazza Jun 27 '13 at 18:04
-
According to Wikipedia http://en.wikipedia.org/wiki/Intel_C%2B%2B_Compiler release 13 supports Phi. – bazza Jun 27 '13 at 18:32
-
Personally I have mixed feelings about auto parallelisation. If it works and produces code that meets whatever one's performance requirement is, then great! However, I prefer to tough it out and optimise my code to the n'th degree myself, which involves getting a real understanding about it and its performance. Auto parallelisation will bring improvements to the code you put into the compiler; it won't do anything about altering that code to make a beneficial structural change. Being forced to understand one's code means that one might spot opportunities for that structural change. – bazza Jun 27 '13 at 18:36
You can read these documents for more information on how to tap the many available threads on Xeon Phi:
- http://software.intel.com/en-us/articles/programming-and-compiling-for-intel-many-integrated-core-architecture
- http://software.intel.com/en-us/articles/choosing-the-right-threading-framework
- and more on http://software.intel.com/en-us/mic-developer
To summarize, either manage threads manually (via TBB / pthreads / etc.), or use one of the supported parallel programming models:
- OpenMP
- MPI
- Cilk Plus
- OpenCL
- OpenACC
Or use libraries that can automatically offload to the device, such as MKL or ArrayFire.
