
Is it possible to control which CPU instruction sets are used by the MS C Runtime Library (Visual Studio 2013, 2015)? If I step into the disassembly for, say, cos(), the code compares against a precalculated set of CPU capabilities and then executes the function using the 'best' capabilities available on the CPU. The problem is that different instruction sets yield different results, so the results differ depending on the CPU architecture.

As an example, building a 64-bit executable of:

```cpp
std::cout << std::setprecision(20) << cos(-0.61385470201194381) << std::endl;
```

On Haswell/Broadwell and later, this returns 0.81743370050726594 (the same as x86). On older CPUs it returns 0.81743370050726583.

The Runtime Library uses the FMA instruction set if available, executing a different implementation and yielding different results. Note that this is not affected by the compiler options selected in the application, because the Runtime Libraries are provided pre-compiled. Also note that the floating-point precision control function _controlfp() cannot control the precision of the 64-bit runtime.

Is it possible to control which instruction sets the Runtime Library uses so that the results can be more deterministic?

TrickiDicki
  • Have you tried `/fp:strict`? See [MSDN](https://msdn.microsoft.com/en-us/library/e7s85ffb.aspx) for all the possible options and pragmas (to control behavior on a per-function basis) – Ben Voigt Feb 01 '16 at 05:59
  • If you are getting *wildly* different results from 64 vs 80 bit floating point then your code is not working correctly in any case. You'll need to go through your calculations and estimate the error of each one, then arrange their order to minimize error propagation. Your error should only be in the 62 or 63 bit range, and you should be able to round to that at the end and get the same results. – Zan Lynx Feb 01 '16 at 06:02
  • @Zan: That doesn't look wildly different, it looks like -1 ULP – Ben Voigt Feb 01 '16 at 06:03
  • http://floating-point-gui.de/errors/propagation/ – Zan Lynx Feb 01 '16 at 06:03
  • http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html – Zan Lynx Feb 01 '16 at 06:04
  • @BenVoigt: I suppose, depending on the equations, it might be as low as 30 bit accuracy out of 64, if square roots or such-like are involved. – Zan Lynx Feb 01 '16 at 06:10
  • Point is, the user should do the analysis and know that only 30 bits are good (if they are) and only use that much, the extra 34 bits would only be noise. – Zan Lynx Feb 01 '16 at 06:11
  • Even the same instructions can yield different answers on different machines with floating point. – David Heffernan Feb 01 '16 at 06:38
  • @ZanLynx I think that the question here is about reproducibility of floating-point, which is quite orthogonal to accuracy of floating-point. As such Goldberg's document, which does not concern itself with implementation details, would be a long unhelpful read, and advice to estimate or reduce the error doesn't help either. Unless the error can be shown to be limited to 0.5 ULP, evaluating or improving accuracy does not give reproducibility. – Pascal Cuoq Feb 01 '16 at 09:42
  • @PascalCuoq: If only the accurate bits are used then the result should be reproducible. – Zan Lynx Feb 01 '16 at 15:00
  • @ZanLynx Unfortunately, this isn't true. This is called the table-maker's dilemma. http://perso.ens-lyon.fr/jean-michel.muller/Intro-to-TMD.htm – Pascal Cuoq Feb 01 '16 at 15:36
  • As @PascalCuoq says, the issue is about reproducibility, not accuracy. I want to get the same result from one machine to another. I can do so using, for instance, Boost::Multiprecision however performance will suffer (being purely software) and changes to the source code would be required. I am investigating using the AMD LibM library and so far I get consistent results on both hardware sets. – TrickiDicki Feb 02 '16 at 01:02

1 Answer


> Is it possible to control which instruction sets the Runtime Library uses so that the results can be more deterministic?

No.

If you only use basic arithmetic (`+`, `-`, `*`, `/`, `sqrt`) and force your compiler to use strict IEEE 754 arithmetic, the results should be perfectly reproducible. For other functions, such as `cos`, you're at the mercy of the libm implementation, which is not required to provide any accuracy guarantees. You will also see similar problems with BLAS libraries.

If you need perfect reproducibility, you have two options:

  1. Use a correctly-rounded math library, such as CRlibm (though I don't think the two-argument functions such as `pow` have been proven correct).
  2. Roll your own math functions, limiting yourself to the arithmetic operations above (in that case, fdlibm might be a good start).
Simon Byrne
  • In the end we opted for option 2: implement our own library providing the particular functions at issue (the trig functions), which directly called the x87 instructions via inline assembler. We then linked our project against the new set of math functions, which were used in preference to the MS runtime variants. – TrickiDicki Jun 20 '17 at 06:39