I've found an issue affecting several unit tests at work that only appears when the tests are run under valgrind: std::cos and std::sin return different values for identical inputs depending on whether the test is run normally or under valgrind.

This only seems to happen for certain inputs, since many other unit tests that exercise the same code pass.

Here's a minimal reproducible example (written slightly less cleanly than it could be, so that my compiler wouldn't optimize away any of the logic):

#include <complex>
#include <iomanip>
#include <iostream>

int main()
{
    std::complex<long double> input(0,0), output(0,0);

    input = std::complex<long double>(39.21460183660255L, -40);
    std::cout << "input: " << std::setprecision(20) << input << std::endl;

    output = std::cos(input);
    std::cout << "output: " << std::setprecision(20) << output << std::endl;

    if (std::abs(output) < 5.0)
    {
        std::cout << "TEST FAIL" << std::endl;
        return 1;
    }

    std::cout << "TEST PASS" << std::endl;
    return 0;
}

Output when run normally:

input: (39.21460183660254728,-40)
output: (6505830161375283.1118,117512680740825220.91)
TEST PASS

Output when run under valgrind:

input: (39.21460183660254728,-40)
output: (0.18053126362312540976,3.2608771240037195405)
TEST FAIL

Notes:

  • OS: Red Hat Enterprise Linux 7
  • Compiler: Intel oneAPI 2022 next-generation DPC++/C++ Compiler
  • Valgrind: 3.20 (built with same compiler), also occurred on official distribution of 3.17
  • Issue did not manifest when unit tests were built with GCC-7 (cannot go back to that compiler) or GCC-11 (another larger bug with boost prevents us from using this with valgrind)
  • -O0/1/2/3 make no difference on this issue
  • The only compiler flag I have set is "-fp-speculation=safe"; leaving it unset causes numerical precision issues in other unit tests

Is there a better way to figure out what's going on and resolve this, or should I submit a bug report to valgrind? I hope the issue is benign, but I want to be able to trust my valgrind output.

phuclv
Yattabyte
  • 1
    does the compiler (without valgrind) give the same output with `-O0` `-O2` and `-O3`? – bolov Jan 11 '23 at 23:00
  • 2
    Is `-ffast-math`, `-Ofast` or any other "make floating point math fast" option enabled in one build but not the other? Please show _all_ the compiler options you use in both cases. – Ted Lyngmo Jan 11 '23 at 23:04
  • @TedLyngmo precise math is enabled. Too many issues with fast math with this compiler. -O3 is used, but the compiler still enables a bunch of its own optimizations by default. – Yattabyte Jan 11 '23 at 23:09
  • @bolov I'll repeat the test with different optimization levels and report back – Yattabyte Jan 11 '23 at 23:09
  • 1
    If you show the exact set of options used we could perhaps help rule out that it's any of those that makes it behave strangely when valgrind is running the program. Far fetched perhaps, but, you never know. Edit: I just tested it myself with `g++ -o valtry valtry.cpp` and it does indeed give two different outputs. – Ted Lyngmo Jan 11 '23 at 23:12
  • @bolov I tried with -O0/1/2/3 yielding identical output between the normal runs, and identical output between the valgrind runs (that is to say the behaviour is unchanged and still incorrect) – Yattabyte Jan 11 '23 at 23:29
  • 1
    I also gave `clang++` a go. Same results. `-fsanitize=address,undefined` does not affect the result like `valgrind` though. Interesting article: [Herbgrind Part 7: What About Square Root?](http://alex.uwplse.org/2017/08/05/herbgrind-7-what-about-square-root.html) – Ted Lyngmo Jan 11 '23 at 23:32
  • 1
    @TedLyngmo regarding additional compiler flags, I only have 1 thing set which is required as the intel compilers are overly aggressive with floating point optimizations: "-fp-speculation=safe". Edit: The intel compiler I'm using is based off of clang these days, so I suppose your results make sense. – Yattabyte Jan 11 '23 at 23:36
  • 2
    Do the normal and valgrind outputs differ if you use `double` instead of `long double`? Do they differ if you set the imaginary part of `input` to 0? Do they differ if you set the imaginary part to 0 and the real part to 0? To π/2? To π? To 2π? To 8? 16? 32? By the way, why are you using a `double` constant with a `long double`, `39.21460183660255` instead of `39.21460183660255L`? – Eric Postpischil Jan 11 '23 at 23:47
  • @LayneBernardo: It is in the complex domain. cos(z) is (e^(iz)+e^(−iz))/2. – Eric Postpischil Jan 11 '23 at 23:47
  • @EricPostpischil Yup, that was it lol – Layne Bernardo Jan 11 '23 at 23:51
  • @LayneBernardo this is a contrived example reduced from a working algorithm. The cosine result was the first part of the algorithm that yielded a different result when run with valgrind given a very specific set of inputs – Yattabyte Jan 11 '23 at 23:52
  • @EricPostpischil good catch on the missing long, that was an oversight on my part. Doesn't change the behaviour of the issue sadly, but I'll update the example in my post. I can try changing the components of the input, however i know that many different inputs /do work/, just this (and a few other) inputs cause this issue. To clarify, the value chosen in this example is the result of a computation in an algorithm I cannot share, which is correct too with valgrind, but when consumed by cos/sin give the wrong results with valgrind. – Yattabyte Jan 11 '23 at 23:59
  • @Yattabyte That was Eric :) – Layne Bernardo Jan 12 '23 at 00:02
  • It is absolutely a bug in valgrind -- it's not supposed to change the behavior of your program in such a way. As long as you run the same binary, that is. – Yakov Galka Jan 12 '23 at 03:57
  • 7
    This sounds like a duplicate of this https://bugs.kde.org/show_bug.cgi?id=333625 issue. It needs debugging down to the opcode level to see where the difference is. The most likely thing is that ICC is using an opcode that the Valgrind VM doesn't correctly emulate. – Paul Floyd Jan 12 '23 at 08:53
  • 1
    And most likely a) only working at 64 bit precision and b) not correctly converting to whatever precision ICC is using – Paul Floyd Jan 12 '23 at 09:21

0 Answers