I have discovered that the Intel compiler does not perform return value optimization for std::array objects. The following code, which happens to be in the inner loop of my program, is not optimized as well as it could be.

std::array<double, 45> f(const std::array<double, 45>& y) {
    auto dy_dt = std::array<double, 45>( );
    ...

    return dy_dt;
}

I have figured out that this behaviour comes from the fact that my standard library implementation does not explicitly define a copy constructor for std::array. The following code demonstrates that:

class Test {
public:
    Test() = default;
    Test(const Test& x);
};

Test f() {
    auto x = Test( );

    return x;
}

When you compile it with

icpc -c -std=c++11 -qopt-report=2 test.cpp -o test.o

the report file shows

INLINE REPORT: (f(Test *)) [1] main.cpp(7,10)

which proves that the compiler performs RVO (the signature of f is changed so that the newly created object can be constructed in the caller's stack frame). But if you comment out the line that declares Test(const Test& x);, the report file shows

INLINE REPORT: (f()) [1] main.cpp(7,10)

which proves that RVO is not generated.

In section 12.8/31 of the C++11 standard, which defines RVO, the example given has a user-declared copy constructor. So, is this a "bug" in the Intel compiler or a conforming implementation of the standard?
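One way to see what changes when the copy constructor line is commented out is to query the type traits. This is only a sketch: `ImplicitCopy` and `UserCopy` are hypothetical stand-ins for the two versions of `Test`, and the result for `std::array<double, 45>` is what the common standard library implementations give, not something the question confirms for the Intel toolchain. Whether this difference is what alters the signature shown in the report is an assumption.

```cpp
#include <array>
#include <type_traits>

// Hypothetical types for illustration; neither appears in the question.
struct ImplicitCopy {
    ImplicitCopy() = default;      // copy constructor left to the compiler
};

struct UserCopy {
    UserCopy() = default;
    UserCopy(const UserCopy&) {}   // user-provided, so the type is not trivially copyable
};

// An implicitly generated copy constructor keeps the type trivially copyable.
constexpr bool implicit_copy_is_trivial =
    std::is_trivially_copyable<ImplicitCopy>::value;

// A user-provided copy constructor does not.
constexpr bool user_copy_is_trivial =
    std::is_trivially_copyable<UserCopy>::value;

// std::array declares no copy constructor of its own; with a trivially
// copyable element type this is expected to be true (assumption about the
// standard library in use).
constexpr bool array_is_trivial =
    std::is_trivially_copyable<std::array<double, 45>>::value;
```

If the three constants come out as true, false, true, then `std::array<double, 45>` behaves like the version of `Test` without the declared copy constructor.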

InsideLoop
  • @Cyber No, if RVO was possible it would elide the copy. – juanchopanza Mar 04 '15 at 13:00
  • Cyber: No, it is named return value optimization that is involved here. It has nothing to do with move semantics. – InsideLoop Mar 04 '15 at 13:01
  • RVO doesn't need a copy constructor per se, but code for which RVO would apply needs a valid copy or move constructor to be available. – juanchopanza Mar 04 '15 at 13:03
  • @juanchopanza: Is the "compiler generated" copy constructor considered valid? – InsideLoop Mar 04 '15 at 13:05
  • Yes, of course. Otherwise your code wouldn't have compiled after you commented out your copy constructor. – juanchopanza Mar 04 '15 at 13:05
  • juanchopanza: In which situations would you get an invalid copy constructor? Do you think it is a bug in the Intel compiler? – InsideLoop Mar 04 '15 at 13:06
  • If you don't explicitly have a copy constructor, the compiler should generate one automatically. Unless there's some sneaky clause in the standard, I think RVO may happen regardless of whether you explicitly have a copy constructor or not. However, it is not a bug. The standard doesn't say that RVO must happen, just that it may happen. – thang Mar 04 '15 at 13:07
  • If you say `Test(const Test& x) = delete;` then you no longer have an available, valid copy constructor. – juanchopanza Mar 04 '15 at 13:08
  • @thang: Let's call it "an optimization bug" even though it conforms to the standard. The Intel compiler is heavily used in HPC, and such an omission should be considered a bug. – InsideLoop Mar 04 '15 at 13:08
  • juanchopanza: Thanks. – InsideLoop Mar 04 '15 at 13:09
  • What I was saying is almost a tautology: RVO cannot change whether some code is valid, and to return an object by value it has to be from a type that is copyable/movable. So you need copy/move constructors regardless of whether RVO happens. – juanchopanza Mar 04 '15 at 13:12
  • juanchopanza: I have never thought about deleting a copy constructor ;-) Anyway, I've filed a bug with the Intel compiler. – InsideLoop Mar 04 '15 at 13:18
  • The example you put in your comment on my now-deleted answer was a good one, please put that in your question. For what you actually have in your question, an ABI enabling RVO doesn't make sense. For your real code, Intel does use an ABI enabling RVO, but doesn't actually perform RVO. –  Mar 05 '15 at 08:43
  • @hvd: I have updated the question so it is more clear. Thanks. – InsideLoop Mar 05 '15 at 08:50
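The point made in the comments, that returning by value needs an accessible copy or move constructor whether or not the copy is elided, can be sketched as follows. `Copyable` and `NoCopy` are hypothetical types introduced only for this illustration.

```cpp
#include <type_traits>

// Hypothetical types for illustration; neither appears in the thread.
struct Copyable {
    Copyable() = default;              // copy constructor generated implicitly
};

struct NoCopy {
    NoCopy() = default;
    NoCopy(const NoCopy&) = delete;    // no copy constructor, and no implicit move either
};

// The implicitly generated copy constructor counts as available.
constexpr bool copyable_has_copy = std::is_copy_constructible<Copyable>::value;

// A deleted copy constructor does not.
constexpr bool nocopy_has_copy = std::is_copy_constructible<NoCopy>::value;

// Well-formed: the copy may be elided, but it must be possible.
Copyable make_copyable() {
    Copyable c;
    return c;
}

// Ill-formed even if a compiler would elide the copy:
// NoCopy make_nocopy() { NoCopy n; return n; }
```

Uncommenting `make_nocopy` should produce a compile error, which is why copy elision can never turn invalid code into valid code.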

1 Answer

This program causes undefined behaviour with no diagnostic required, due to violation of the One Definition Rule.

A copy-constructor is odr-used when returning by value -- even if copy elision takes place.

A non-inline function being odr-used means that exactly one definition of the function must appear in the program. However you provided none, and your declaration of the copy-constructor suppresses the compiler-generated definition.
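A minimal sketch of a version that avoids the ODR problem is given here: the copy constructor is both declared and defined, and a counter records how often it actually runs. The names `Counted`, `make_counted` and `copies_made_by_return` are hypothetical, not from the question, and the counter is only there to make elision observable.

```cpp
// Hypothetical instrumented type; the copy constructor is declared AND defined.
struct Counted {
    static int copies;                     // number of copy-constructor calls
    Counted() = default;
    Counted(const Counted&) { ++copies; }  // definition present, so no ODR violation
};

int Counted::copies = 0;

// Same shape as f() in the question: a named local returned by value.
Counted make_counted() {
    Counted x;
    return x;
}

// Returns how many copies one call-and-initialize sequence performed.
// 0 means every copy was elided; a compiler that elides nothing may report
// up to 2 (local to temporary, temporary to result), depending on the
// language version and compiler flags.
int copies_made_by_return() {
    Counted::copies = 0;
    Counted result = make_counted();
    (void)result;
    return Counted::copies;
}
```

Calling `copies_made_by_return()` from `main` and printing the result gives a complete program in which the presence or absence of elision can be checked directly, rather than inferred from the optimization report.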

M.M
  • I think it's probably because not all of the code is shown. He hasn't provided a main either. Hypothetically, there could be another file test2.cpp, not shown or mentioned, that has the definition of that copy constructor. Or it could be in main.cpp. – thang Mar 04 '15 at 13:41
  • @Matt: I don't understand your point. If you don't define a copy constructor, the standard defines one for you. – InsideLoop Mar 04 '15 at 15:37
  • @InsideLoop no; if you don't *declare* a copy-constructor, then the compiler declares and defines one for you. Also, as @thang points out, it'd be useful if you posted a valid program (or an attempt at one). You can't file an optimization bug report based on a program that doesn't compile. – M.M Mar 04 '15 at 19:11
  • @Matt: It does compile as a compilation unit and Intel is investigating the problem: https://software.intel.com/en-us/forums/topic/542477 – InsideLoop Mar 05 '15 at 08:04
  • @InsideLoop not as a complete program however; it'd be more meaningful if you could observe the same symptoms in a complete program – M.M Mar 05 '15 at 09:39
  • @Matt: I don't understand your point. I do observe this in a complete program. I don't understand why you want to put a complete program here. – InsideLoop Mar 05 '15 at 09:56
  • Matt's point is to provide a complete sample of code that exhibits your problem, rather than expecting other people to guess what you've left out. It might be true that you observe the problem in a complete program, but you have not posted a complete program. Generally speaking, posting a small but complete sample of code, that actually exhibits your problem, increases chances that folks can help you. Leaving bits out does not help people help you. – Rob Mar 05 '15 at 10:08
  • @InsideLoop What you've posted so far has undefined behaviour so it would not be a compiler bug for the compiler to do what you are seeing or anything else. – M.M Mar 05 '15 at 11:02
  • Matt: Where do you see undefined behaviour? Is it because there is no main? – InsideLoop Mar 05 '15 at 14:35
  • @InsideLoop As explained in my answer, there is no definition for the copy constructor (as well as there being no main). I'd suggest fixing both of those problems and seeing if the problem still occurs. – M.M Mar 05 '15 at 22:45