Backward compatibility of code compiled and optimized for new instruction set extensions

Question

In order to narrow the scope of this question, let's consider projects in C / C++ only.

There is a whole array of new SIMD instruction set extensions for the x86 architecture, but to benefit from them a developer has to recompile the code with the appropriate optimization flag and perhaps modify it as well.

Since new instruction set extensions come out relatively frequently, it's unclear how backward compatibility can be maintained while still benefiting from the extensions that are available.

Does the resulting application stay compatible with older CPU models that don't support a new instruction set extension? If yes, could you elaborate on how such support is implemented?

As a question, why are you considering backward compatibility this way? Why not consider writing portable code that can be efficiently cross-compiled for any of the targets, rather than relying on the backward compatibility of the architecture? – Fantastic Mr Fox, Dec 21 '17 at 06:07
*"Since new instruction set extensions come out relatively frequently..."* - that feels like an overstatement. And even if true, you typically wait for the next release of compilers to support the new architectures. And these new compilers (e.g. Visual Studio, gcc) typically wouldn't generate such code without the developer passing a specific flag to indicate he wants newer architecture support only. – selbie, Dec 21 '17 at 06:19
Good example: https://blogs.msdn.microsoft.com/vcblog/2014/02/28/avx2-support-in-visual-studio-c-compiler/ – selbie, Dec 21 '17 at 06:20
@FantasticMrFox, that's a good point. Such a technique should certainly be considered during development, though it seems to result in several software distributions for a single architecture, which may confuse the end user about which one to pick. – Pavel, Dec 21 '17 at 06:25

score 3 · Answer 1 · answered Dec 21 '17 at 06:09

New CPU instructions require new hardware to execute. If you try to run them on older CPUs that don't support those instructions, your program will crash with an Invalid Opcode fault. Occasionally OSes will handle this condition, but usually not.

To run with the new instructions, you either need to require that they are supported in hardware, or (if the benefit is great enough) check at runtime to see if the new instructions you need are supported. If they are, you run a section of code that uses them. If they are not, you run a different section of code that does not use them.
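
A minimal sketch of such a runtime check, assuming GCC or Clang on x86 (both provide `__builtin_cpu_supports` and the `target` function attribute). The function names `sum_u32`, `sum_u32_scalar` and `sum_u32_avx2` are hypothetical and chosen only for illustration; on other compilers or architectures the sketch simply falls back to the plain C path.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: the GNU-style builtins below exist only on GCC/Clang for x86. */
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#define HAVE_X86_RUNTIME_CHECK 1
#include <immintrin.h>
#endif

/* Baseline path: runs on any CPU the program was compiled for. */
static uint32_t sum_u32_scalar(const uint32_t *v, size_t n)
{
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];          /* unsigned arithmetic wraps, by definition */
    return total;
}

#ifdef HAVE_X86_RUNTIME_CHECK
/* AVX2 path: only this function is compiled with AVX2 enabled, so the
 * rest of the program stays safe to run on CPUs without AVX2. */
__attribute__((target("avx2")))
static uint32_t sum_u32_avx2(const uint32_t *v, size_t n)
{
    __m256i acc = _mm256_setzero_si256();
    size_t i = 0;
    for (; i + 8 <= n; i += 8)
        acc = _mm256_add_epi32(acc,
                               _mm256_loadu_si256((const __m256i *)(v + i)));

    uint32_t lanes[8];
    _mm256_storeu_si256((__m256i *)lanes, acc);

    uint32_t total = 0;
    for (int k = 0; k < 8; k++)
        total += lanes[k];      /* combine the eight partial sums */
    for (; i < n; i++)
        total += v[i];          /* leftover elements */
    return total;
}
#endif

/* Entry point: checks the CPU at run time and picks a code path. */
uint32_t sum_u32(const uint32_t *v, size_t n)
{
#ifdef HAVE_X86_RUNTIME_CHECK
    if (__builtin_cpu_supports("avx2"))
        return sum_u32_avx2(v, n);
#endif
    return sum_u32_scalar(v, n);
}
```

Callers only ever use `sum_u32`. Because the AVX2 instructions are confined to one function that is reached only after the check succeeds, the same binary should still start and run on a CPU without AVX2.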

Generally "backwards compatible" refers to a new version of something running stuff that runs on the older, existing things, and not old things running with new stuff.

score 2 · Accepted Answer · answered Dec 21 '17 at 06:20

Historically, most x86 instruction sets have been (practically) strict supersets of previous sets. However, the AVX-512 extension comes in several mutually-incompatible variants, so particular care will need to be taken.

Fortunately, compilers are also getting smarter. GCC has `__attribute__((simd))` and `__attribute__((target_clones(...)))` to automatically create multiple implementations of the given function, and choose the best one at load time based on what the actual CPU supports. (For older GCC versions, you had to use IFUNC manually ... and in ancient days, `ld.so` would load libraries from a completely separate directory depending on things like `cmov`).
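
As a rough illustration of `target_clones`, here is a sketch that assumes a toolchain with IFUNC support (GCC on x86-64 Linux with glibc is the assumed setup). The macro `MULTIVERSION` and the function `scale_add` are made up for this example; where the attribute is not available, the macro expands to nothing and a single ordinary version is built.

```c
#include <stddef.h>

/* Assumption: target_clones needs IFUNC support from the linker and libc,
 * so it is enabled here only on x86-64 Linux when the compiler reports
 * that it knows the attribute. */
#if defined(__has_attribute)
#  if __has_attribute(target_clones) && defined(__x86_64__) && defined(__linux__)
#    define MULTIVERSION \
         __attribute__((target_clones("avx2", "sse4.2", "default")))
#  endif
#endif
#ifndef MULTIVERSION
#  define MULTIVERSION /* fallback: build one ordinary version */
#endif

/* The compiler emits one clone of this function per listed target plus a
 * resolver; the dynamic loader picks a clone once, when the program is
 * loaded, based on what the running CPU supports. */
MULTIVERSION
void scale_add(float *dst, const float *a, const float *b, float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] * k + b[i];
}
```

The source contains a single plain loop; any speedup depends on the compiler auto-vectorizing each clone, which typically needs optimization enabled (for example `-O2` or `-O3`).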

o11c

AVX-512 doesn't have "mutually incompatible" variants, unless you're thinking of the 512-bit SIMD for first-gen Xeon Phi (Knights Corner, KNC), which is not official AVX-512 at all. It's its own thing, and no other CPU supports it. (It's *very close* to AVX-512, but it's not.) What AVX-512 has is several *compatible* extensions beyond AVX-512F (foundation). See [Wikipedia's CPUs with AVX-512](https://en.wikipedia.org/wiki/AVX-512#CPUs_with_AVX-512). Xeon Phi provides one set of extensions, mainstream CPUs provide a different set. – Peter Cordes Dec 21 '17 at 06:42
The common subset is only AVX512F + AVX512CD, but they're all compatible. It's just that no current CPU provides them all. But if you're going to run something on a Xeon Phi, you should compile it specifically for that target with `-march=knl` to tune for it as well, instead of running generically-optimized code on a KNL, so for a lot of things you shouldn't need to worry too much about perfectly handling KNL with dynamic dispatching, just `#ifdef`s. – Peter Cordes Dec 21 '17 at 06:45
But for everything else, yes dynamic dispatching in one form or another (dynamic linker tricks, `ifunc`, or pure manual dispatching writing your own `if(cpuid)` or function-pointer setup stuff yourself like the x264 video encoder does) is the right answer to running well across multiple generations of CPUs. – Peter Cordes Dec 21 '17 at 06:47
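
The function-pointer flavour of manual dispatching could look roughly like the sketch below. It is not taken from x264 or any other project; the names `count_byte`, `count_byte_c`, `count_byte_avx2` and `dispatch_init` are hypothetical, and GCC or Clang on x86 is assumed for the AVX2 path.

```c
#include <stddef.h>
#include <stdint.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#define X86_DISPATCH 1
#include <immintrin.h>
#endif

typedef size_t (*count_fn)(const uint8_t *buf, size_t n, uint8_t byte);

/* Portable C version, used as the default. */
static size_t count_byte_c(const uint8_t *buf, size_t n, uint8_t byte)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        hits += (buf[i] == byte);
    return hits;
}

#ifdef X86_DISPATCH
/* AVX2 version: compares 32 bytes per iteration. */
__attribute__((target("avx2")))
static size_t count_byte_avx2(const uint8_t *buf, size_t n, uint8_t byte)
{
    const __m256i needle = _mm256_set1_epi8((char)byte);
    size_t i = 0, hits = 0;
    for (; i + 32 <= n; i += 32) {
        __m256i chunk = _mm256_loadu_si256((const __m256i *)(buf + i));
        unsigned mask =
            (unsigned)_mm256_movemask_epi8(_mm256_cmpeq_epi8(chunk, needle));
        hits += (size_t)__builtin_popcount(mask); /* one bit per match */
    }
    for (; i < n; i++)
        hits += (buf[i] == byte);                 /* leftover bytes */
    return hits;
}
#endif

/* The pointer every caller goes through; filled in once by dispatch_init. */
static count_fn count_byte_impl = NULL;

void dispatch_init(void)
{
    count_byte_impl = count_byte_c;
#ifdef X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        count_byte_impl = count_byte_avx2;
#endif
}

size_t count_byte(const uint8_t *buf, size_t n, uint8_t byte)
{
    if (!count_byte_impl)   /* lazy setup; not thread-safe in this sketch */
        dispatch_init();
    return count_byte_impl(buf, n, byte);
}
```

The CPU check runs once instead of on every call, and more implementations can be added by extending `dispatch_init`. A real program would probably call `dispatch_init` explicitly at startup rather than rely on the lazy check.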
