I'm trying to get the best performance out of a C++ algorithm (floating point, several loops, large data sets) compiled with the Intel C++ compiler (icpc) on some Ubuntu 12.04.5 Desktop machines. I've read that the -O3 optimization flag is usually the best choice, but I've also read that there are specific flags for each type of processor. Does anyone know which flags give the best performance on an Intel Xeon E3-1200 series CPU and on an Intel Xeon E5-2430 v2?
-
You'll want to take a look at the instruction set for the processor and tune it to those (for example SSE). – OMGtechy Dec 23 '14 at 12:17
-
`-march=native`, assuming you're compiling on the same (family of) processor and don't intend to ever run the binary on any other processor that doesn't support all its instructions. – eerorika Dec 23 '14 at 12:24
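One way to check what a host-specific flag actually enabled is to have the program report the SIMD level the compiler targeted. The snippet below is a minimal sketch, not part of the original thread: `targeted_simd` is a hypothetical helper name, and it relies on the predefined macros `__AVX2__`, `__AVX__`, `__SSE4_2__` and `__SSE2__`, which GCC-compatible compilers on Linux (icpc included) define when the matching instruction set is enabled. With icpc, the host-tuning flag is usually spelled `-xHost`; `-march=native` is the g++ spelling.

```cpp
#include <string>

// Hypothetical helper: reports the widest x86 SIMD extension the compiler
// was allowed to use for this translation unit. It only inspects
// compile-time macros; it does not query the CPU the binary runs on.
std::string targeted_simd() {
#if defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE4_2__)
    return "SSE4.2";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "none detected";
#endif
}
```

Printing the result of `targeted_simd()` from `main` once with plain `-O3` and once with the host-specific flag added shows whether the extra flag changed the target at all on a given machine.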
-
I recompile every time I run the algorithm, so I need custom optimization flags for each processor. – user1403546 Dec 23 '14 at 13:30
-
Didn't you ask the same question before? I've seen that E3-1200 before, and I recall that the earlier question didn't even mention floating point. Yes, these CPUs are both capable of using SSE4, and yes, the Intel compiler can use SSE4, but that's only a marginal improvement if you don't tackle the algorithm first. Is it even numerically optimal? Is it memory-limited or FLOPS-limited? You seem to expect some silver bullet. There isn't one. – MSalters Dec 23 '14 at 13:33
-
Dear @MSalters, the previous question was about g++ and this one is about icpc, so I've rewritten it. I don't know what you mean by "numerically optimal", and I don't understand what you mean by "memory-limited or FLOPS-limited" either. My IT vocabulary is very small, sorry... – user1403546 Dec 23 '14 at 14:02
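For readers who share the question about "memory-limited or FLOPS-limited": a common way to frame it is arithmetic intensity, the number of floating-point operations a loop performs per byte it moves to or from memory. The sketch below is an illustration only, not something from the thread. The function names are hypothetical, and the peak figures passed in are placeholders to be replaced with measured values, not specifications of either Xeon.

```cpp
// Arithmetic intensity: floating-point operations per byte of memory traffic.
double arithmetic_intensity(double flops_per_iter, double bytes_per_iter) {
    return flops_per_iter / bytes_per_iter;
}

// Rough classification: a loop tends to be memory-limited when it performs
// fewer FLOPs per byte than the machine can sustain (peak compute rate
// divided by peak memory bandwidth). Both peaks are caller-supplied
// assumptions; this function measures nothing itself.
bool is_memory_limited(double intensity,
                       double peak_gflops,
                       double peak_gbytes_per_s) {
    double machine_balance = peak_gflops / peak_gbytes_per_s;  // FLOPs per byte
    return intensity < machine_balance;
}
```

As a worked case, the loop `y[i] = a * x[i] + y[i]` on doubles does 2 FLOPs per iteration (one multiply, one add) and moves 24 bytes (read `x[i]`, read `y[i]`, write `y[i]`), so `arithmetic_intensity(2.0, 24.0)` is 1/12. With placeholder peaks of 100 GFLOP/s and 25 GB/s the machine balance is 4 FLOPs per byte, so such a loop would be classed as memory-limited, and wider SIMD instructions alone would be unlikely to speed it up much.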