I have the following configuration: Intel(R) Core(TM) i7-4702MQ CPU (Haswell architecture), Windows 8, Intel C++ Compiler XE 13.0. I want to run my program with AVX2 optimization, so I set the compilation flags:

/QaxCORE-AVX2, /QxCORE-AVX2

But when I run the program I get this error:

Fatal Error: This program was not built to run in your system. Please verify that both the operating system and the processor support Intel(R) AVX2, BMI, LZCNT, HLE, RTM, and FMA instructions.

I ran the AVX2 CPU support test given in the article "How to detect new instruction support in the 4th generation Intel Core processor family". Result:

This CPU supports ISA extensions introduced in Haswell.

How can I check that my operating system supports the AVX2 extensions, and what could be the cause of the error? To use the AVX2 extensions, do I need to set both the /QaxCORE-AVX2 and /QxCORE-AVX2 flags?

Update: if I set the flag

/QxAVX

the program launches successfully.

daniel.heydebreck
Konstantin Isupov
  • Perhaps XSAVE is disabled. I'm not sure how to enable it, but it's probably a boot configuration. – Mysticial Sep 13 '14 at 06:41
  • @Mysticial I created a simple Win32 project that calls IsProcessorFeaturePresent(PF_XSAVE_ENABLED). The function returns TRUE. – Konstantin Isupov Sep 13 '14 at 07:20
  • Checked when? At compile time? At runtime? Anyway, this is more of a market-related issue: for example, the 2955U is a Celeron formally based on the Haswell architecture, but it doesn't even offer the first generation of AVX. – user2485710 Sep 13 '14 at 17:52
  • @user2485710 At runtime. http://ark.intel.com/products/75119 says that AVX2 is supported. http://ark.intel.com/products/75608/Intel-Celeron-Processor-2955U-2M-Cache-1_40-GHz says that the Celeron supports only SSE. – Konstantin Isupov Sep 14 '14 at 11:15
  • Probably you're running in a virtual machine that didn't expose AVX2 to the guest. i7-4702MQ definitely supports AVX and AVX2 (and the other extensions the error message complained about). – Peter Cordes Sep 16 '17 at 05:35
  • Also see [How to detect New Instruction support in the 4th generation Intel® Core™ processor family](https://software.intel.com/en-us/articles/how-to-detect-new-instruction-support-in-the-4th-generation-intel-core-processor-family) on the Intel blogs. – jww Jun 18 '18 at 20:26
  • I'd be worried about HLE and RTM: Those are disabled on Haswell by microcode. (And even on Skylake-family, HLE is I think permanently disabled by microcode updates as of late 2020 :/). BMI and LZCNT don't need OS support; they don't have new architectural state so there's no bit the OS has to set for them to decode instead of fault. – Peter Cordes Apr 16 '21 at 02:38

1 Answer

If you want to check for support of a particular instruction set extension, you basically have 2 options:

  • assembly using the CPUID instruction
  • builtin functions (if any) provided by your compiler

Writing assembly that detects which instruction sets are supported is a tedious, long and potentially error-prone task. Assembly is also not portable across different OSs, SoCs and ABIs, and CPUID is not implemented with the same pattern on all CPUs: the same bit of information may have to be reached in different ways on different vendors, or even on different CPU families from the same vendor. It has one big advantage, though: it is not limited by anything. If you really need to know everything about your CPU/SoC, assembly plus CPUID is the way to go.
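As a rough sketch of what the manual route looks like for the "does my OS support AVX2" part of the question: the usual sequence is to check the OSXSAVE and AVX bits in CPUID leaf 1, read XCR0 with XGETBV to see whether the OS saves the XMM and YMM state, and only then look at the AVX2 bit in leaf 7. The version below assumes GCC or Clang on x86 (it relies on their <cpuid.h> header and inline assembly); the function names read_xcr0 and avx2_usable are made up for this illustration.

```c
#include <cpuid.h>   /* GCC/Clang wrappers around the CPUID instruction */
#include <stdint.h>

/* Read XCR0, the register in which the OS records which register
   states (SSE, AVX, ...) it saves and restores on context switches.
   Only call this after CPUID has reported OSXSAVE. */
uint64_t read_xcr0(void)
{
    uint32_t lo, hi;
    __asm__ volatile ("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
    return ((uint64_t)hi << 32) | lo;
}

/* Returns 1 if the CPU reports AVX2 and the OS has enabled YMM state,
   0 otherwise. */
int avx2_usable(void)
{
    unsigned eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    if (!(ecx & (1u << 27)))            /* OSXSAVE: OS uses XSAVE, XGETBV is legal */
        return 0;
    if (!(ecx & (1u << 28)))            /* AVX reported by the CPU */
        return 0;
    if ((read_xcr0() & 0x6) != 0x6)     /* bit 1 = XMM state, bit 2 = YMM state */
        return 0;
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return 0;
    return (ebx >> 5) & 1;              /* AVX2 bit in leaf 7, subleaf 0 */
}
```

If avx2_usable() returns 0 on a CPU that is known to have AVX2, the likely suspects are the OS (or a hypervisor) not enabling the YMM state or hiding the feature bit, which matches the virtual machine suggestion in the comments.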

GCC and other compilers cover the basic needs of investigating your CPU's capabilities with builtin functions: these special functions generate the equivalent assembly code and give you the answer you want.

Using gcc, the check for AVX2 is as easy as writing

...
if(__builtin_cpu_supports("avx2"))
{
  ...
}
...

docs : http://gcc.gnu.org/onlinedocs/gcc/X86-Built-in-Functions.html
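To show where such a check typically ends up, here is a minimal dispatch sketch, assuming GCC or Clang on x86. The functions sum_scalar, sum_avx2 and sum_u32 are hypothetical names invented for this example; the only parts taken from the compiler are __builtin_cpu_supports, the target attribute and the AVX2 intrinsics from <immintrin.h>.

```c
#include <immintrin.h>  /* AVX2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Portable fallback: plain scalar loop. */
uint32_t sum_scalar(const uint32_t *x, size_t n)
{
    uint32_t s = 0;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* AVX2 path. The target attribute lets GCC/Clang compile this one
   function with AVX2 enabled even without -mavx2 on the command line. */
__attribute__((target("avx2")))
uint32_t sum_avx2(const uint32_t *x, size_t n)
{
    __m256i acc = _mm256_setzero_si256();
    size_t i = 0;

    /* Add 8 unsigned 32-bit values per iteration. */
    for (; i + 8 <= n; i += 8)
        acc = _mm256_add_epi32(acc, _mm256_loadu_si256((const __m256i *)(x + i)));

    /* Reduce the 8 lanes, then handle the leftover elements. */
    uint32_t lanes[8];
    _mm256_storeu_si256((__m256i *)lanes, acc);
    uint32_t s = 0;
    for (int k = 0; k < 8; k++)
        s += lanes[k];
    for (; i < n; i++)
        s += x[i];
    return s;
}

/* Pick the implementation at run time instead of failing at startup. */
uint32_t sum_u32(const uint32_t *x, size_t n)
{
    if (__builtin_cpu_supports("avx2"))
        return sum_avx2(x, n);
    return sum_scalar(x, n);
}
```

Callers only ever use sum_u32, so the same binary runs on machines with and without AVX2. This is roughly the idea behind the Intel compiler's /Qax options, which add an optimized code path next to a baseline one, whereas /Qx makes the instruction set mandatory.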

For Visual Studio / MSVC there are intrinsics such as __cpuid and __cpuidex that you can use to retrieve the same information; the documentation below includes a complete, working example.

docs : http://msdn.microsoft.com/en-us/library/hskdteyh.aspx
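For the specific features named in the error message (AVX2, BMI, LZCNT, HLE, RTM, FMA), a small sketch along these lines may help to see which bit is actually missing. It uses __cpuid and __cpuidex under MSVC and falls back to <cpuid.h> under GCC/Clang so that it also builds elsewhere. query_cpuid, struct haswell_bits and read_haswell_bits are hypothetical names for this illustration, and the result is only what the CPU reports: it says nothing about whether the OS has enabled the AVX state.

```c
#if defined(_MSC_VER)
#include <intrin.h>      /* __cpuid, __cpuidex */
#else
#include <cpuid.h>       /* __get_cpuid_count (GCC/Clang) */
#endif

/* Hypothetical helper: run CPUID for (leaf, subleaf) and store
   EAX, EBX, ECX, EDX in regs[0..3]. Unsupported leaves read as zero. */
void query_cpuid(unsigned leaf, unsigned subleaf, unsigned regs[4])
{
    regs[0] = regs[1] = regs[2] = regs[3] = 0;
#if defined(_MSC_VER)
    int r[4];
    __cpuid(r, (int)(leaf & 0x80000000u));   /* highest leaf in this range */
    if ((unsigned)r[0] < leaf)
        return;
    __cpuidex(r, (int)leaf, (int)subleaf);
    for (int i = 0; i < 4; i++)
        regs[i] = (unsigned)r[i];
#else
    if (!__get_cpuid_count(leaf, subleaf, &regs[0], &regs[1], &regs[2], &regs[3]))
        regs[0] = regs[1] = regs[2] = regs[3] = 0;
#endif
}

/* One flag (0 or 1) per feature listed in the error message. */
struct haswell_bits { int avx2, bmi1, bmi2, lzcnt, hle, rtm, fma; };

struct haswell_bits read_haswell_bits(void)
{
    unsigned l1[4], l7[4], ext[4];
    struct haswell_bits b;

    query_cpuid(1, 0, l1);
    query_cpuid(7, 0, l7);
    query_cpuid(0x80000001u, 0, ext);

    b.fma   = (l1[2]  >> 12) & 1;   /* leaf 1, ECX bit 12 */
    b.bmi1  = (l7[1]  >> 3)  & 1;   /* leaf 7, EBX bit 3 */
    b.hle   = (l7[1]  >> 4)  & 1;   /* leaf 7, EBX bit 4 */
    b.avx2  = (l7[1]  >> 5)  & 1;   /* leaf 7, EBX bit 5 */
    b.bmi2  = (l7[1]  >> 8)  & 1;   /* leaf 7, EBX bit 8 */
    b.rtm   = (l7[1]  >> 11) & 1;   /* leaf 7, EBX bit 11 */
    b.lzcnt = (ext[2] >> 5)  & 1;   /* leaf 0x80000001, ECX bit 5 */
    return b;
}
```

Printing the seven fields on the affected machine shows whether the startup check fails because of AVX2 itself or because of one of the other features; the comments above point at HLE and RTM as candidates.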

user2485710