I'm currently trying to compile software for the use on a HPC-Cluster using Intel compilers. The login-node, which is where I compile and prepare the computations uses Intel Xeon Gold 6148 Processors, while the compute nodes use either Haswell- (Intel Xeon E5-2660 v3 / Intel Xeon Processor E5-2680 v3) or Skylake-processors (Intel Xeon Gold 6138).
As far as I understand from the links above, my login-node supports Intel SSE4.2, Intel AVX, Intel AVX2, as well as Intel AVX-512 but my compute nodes only support either Intel AVX2 (Haswell) or Intel AVX-512 (Skylake)
If I compile with the option -xHost
on the login node, it should automatically use the highest instruction set available. But which one is the highest? And how can I ensure, that my program runs on both compute-systems with best performance? Do I have to compile two versions?
Bonus question: Which -march
do I have to specify in this case?