Here is a short program that counts the divisors of an integer. It works correctly under normal circumstances. However, when it is compiled with the -O3 optimization flag using the current trunk of the Clang C++ compiler (version 3.3, trunk 180686), its behavior changes and the result is no longer correct.

Code

Here's the code:

#include <iostream>

constexpr unsigned long divisors(unsigned long n, unsigned long c)
{
    // This is supposed to sum 1 anytime a divisor shows up
    // in the recursion
    return !c ? 0 : !(n % c) + divisors(n, c - 1);
}

int main()
{
    // Here I print the number of divisors of 9 numbers! (from 1 to 9)
    for (unsigned long i = 1; i < 10; ++i)
        std::cout << i << " has " << divisors(i, i) << " divisors" << std::endl;
}

Correct Behavior

Here's the compile command used, and the correct and expected output, which the program exhibits under normal circumstances:

clang++ -O2 -std=c++11 -stdlib=libc++ -lcxxrt -ldl sample.cpp -o sample
./sample 
1 has 1 divisors
2 has 2 divisors
3 has 2 divisors
4 has 3 divisors
5 has 2 divisors
6 has 4 divisors
7 has 2 divisors
8 has 4 divisors
9 has 3 divisors

Incorrect Behavior

This command line produces a binary that gives incorrect output. Notice that the only change is the optimization flag (-O2 to -O3).

clang++ -O3 -std=c++11 -stdlib=libc++ -lcxxrt -ldl sample.cpp -o sample
./sample 
1 has 1 divisors
2 has 2 divisors
3 has 2 divisors
4 has 1 divisors
5 has 2 divisors
6 has 3 divisors
7 has 2 divisors
8 has 2 divisors
9 has 2 divisors

EDIT

I've updated to the tip of trunk, clang version 3.4 (trunk 183073). The behavior no longer reproduces, so it seems to have been fixed already. If anyone knows which issue this was, and whether it was actually verified and fixed, please feel free to provide an answer. If it was never verified, a regression may happen.

oblitum

1 Answer

Looks like you got bitten by this bug in llvm. You can work around it by disabling the loop vectorizer (for example, by passing -fno-vectorize to clang), or (as you've already found) by updating to an llvm built at a revision newer than r181286.

If you check out the diffs, you'll see a test case has been added as part of the fix. That should keep this problem from cropping up again in the future.

Carl Norum
  • Did you bisect? Was it cumbersome? I couldn't manage to bisect it on my machine in a timely manner. – oblitum Jun 02 '13 at 05:05
  • I did, and it wasn't too bad. Are you using git or svn? – Carl Norum Jun 02 '13 at 05:06
  • Currently I'm using the svn repo for llvm, but I would certainly switch to git if I did this. Even with git, though, I was more concerned about the build times, bisection after bisection. – oblitum Jun 02 '13 at 05:08
  • llvm+clang (which is all I tested) takes less than ten minutes even to build from scratch on my machine (17" macbook pro). Maybe a bit longer? But in general it's not too bad. The bisection steps don't always require a complete rebuild, either, so normally it's faster than that. Overall it wasn't too bad - I just kind of did it in the background while doing other stuff around the house today. – Carl Norum Jun 02 '13 at 05:13
  • OK, I've never done this with heavy stuff. I think I will try it just for the experience. – oblitum Jun 02 '13 at 05:16
  • The gotcha with llvm/clang is that they're separate projects. So what I really did was bisect the llvm tree, and manually whacked the clang project to the most recent matching commit. I had to do both because I didn't know if the bug was in llvm or in clang - since you have that information already, you might not have to move the clang tree's HEAD around so much. – Carl Norum Jun 02 '13 at 05:17
  • Better. A 13" macbook pro, I think, won't handle it that easily. – oblitum Jun 02 '13 at 05:19
  • There are some projects out there that have a unified tree. If you clone one of those, you might have an easier job bisecting. In fact, you could even totally automate it by using `git bisect run`. – Carl Norum Jun 02 '13 at 05:22