3

According to Herb Sutter's article http://herbsutter.com/2008/01/01/gotw-88-a-candidate-for-the-most-important-const/, the following code is correct:

#include <iostream>
#include <vector>
using namespace std;

vector<vector<int>> f() { return {{1},{2},{3},{4},{5}}; }

int main()
{
    const auto& v = f();
    cout << v[3][0] << endl;
}

i.e. the lifetime of v is extended to the lifetime of the v const reference. And indeed this compiles fine with gcc and clang and runs without leaks according to valgrind.

However, when I change the main function thusly:

int main()
{
    const auto& v = f()[3];
    cout << v[0] << endl;
}

it still compiles but valgrind warns me of invalid reads in the second line of the function due to the fact that the memory was free'd in the first line.

Is this standard compliant behaviour or could this be a bug in both g++ (4.7.2) and clang (3.5.0-1~exp1)?

If it is standard compliant, it seems pretty weird to me... oh well.

mwnx
  • 97
  • 4
  • its obviously a bug in valgrind - this memory is not freed until the process exits. – specializt Jan 10 '15 at 16:14
  • 3
    It's standard compliant. The temporary returned by `f()` is not bound to a reference, so its lifetime is not extended. – T.C. Jan 10 '15 at 16:18
  • @specializt: Not true at all. The memory allocated dynamically by the vector is freed when the vector went out of scope at the end of the reference declaration line. – Lightness Races in Orbit Jan 10 '15 at 16:39
  • 1
    The real question is: when would one ever want to do something like this? – James Kanze Jan 10 '15 at 17:12
  • @JamesKanze: When you don't care whether the function returns a value or a reference (e.g. when using/writing template code) but you just want to access the result in a read-only manner. – mwnx Jan 10 '15 at 17:42
  • @RagnarL Is there ever a case where you wouldn't know the actual return type of the function? If the function returns a reference, you need to know it, and know the guaranteed lifetime of the referred to object. – James Kanze Jan 10 '15 at 17:57
  • @JamesKanze: Please check the article I linked to in my question; it explains a special rule regarding binding to const references. Using this special rule, one can write more generic code. – mwnx Jan 10 '15 at 19:12
  • 1
    @RagnarL I'm familiar with the article. It doesn't addresse the issue of when this idiom is appropriate: if the called function returns a value, it's a pessimisation (and a bit of obfuscation). If the called function returns a reference, then unless you know this, _and_ know the lifetime of the referenced object, you're skating on thin ice. If you don't know this, then you're risking a dangling reference. – James Kanze Jan 10 '15 at 19:21
  • @JamesKanze: I put together a concrete(ish) example for you: http://coliru.stacked-crooked.com/a/b132695b8706716a There are no leaks. – mwnx Jan 11 '15 at 15:14
  • @RagnarL That's a perfect example of what I was saying. Why is the local variable a reference? Using a reference for a local variable in this case is an anti-pattern. – James Kanze Jan 13 '15 at 13:36
  • @JamesKanze: You'll have to elaborate. An anti-pattern according to who or what? I really don't see what's so (objectively) wrong in my example. – mwnx Jan 14 '15 at 20:22
  • 1
    @RagnarL You're using a reference when you need a value, for starters. (Using `auto` in this case is also an anti-pattern, and a heavy dose of obfuscation.) – James Kanze Jan 15 '15 at 11:10
  • @JamesKanze: Please... stop justifying your opinions by just blurting out the word anti-pattern. Concerning your so-proclaimed overuse-of-auto anti-pattern, let me refer you to this well argued article by Herb Sutter: http://herbsutter.com/2013/08/12/gotw-94-solution-aaa-style-almost-always-auto/. As you can see, your opinion is not the only one. (and this thread is getting much too long...) – mwnx Jan 15 '15 at 18:50
  • 1
    @RagnarL The justification for not overusing `auto` is that it is obfuscation. Herb is clearly wrong on this one, and his arguments don't hold water. One vital reason for using C++ on large projects (rather than, say Python) is that it has static type checking. The justification for not using the reference here is that it is just excess baggage: if the function returns a value, it's only effect is to slow the code down, and if the function returns a reference, it means that you need to do additional analysis concerning lifetime. – James Kanze Jan 16 '15 at 12:33
  • @JamesKanze: You're much too sure of yourself and you're wrong. Concerning auto, go check out other statically (strongly) typed languages with **type inference** like OCaml or Haskell. In these languages, the programmer almost never specifies the types of local variables, because there's no need to. The compiler deduces the types and checks for type-correctness itself. In other words you have an incomplete and outdated understanding of the advantages of a type system; a C programmer's point of view. As for the "most important const" thing, I explained already. This is my last comment here. – mwnx Jan 16 '15 at 12:44
  • 1
    @RagnarL Obfuscation is obfuscation. Neither OCaml nor Haskell are widely used in large projects, perhaps because they result in obfuscated code. (I don't know.) Neither is C++, however, so neither is relevant here. C++ works in a certain way, and explicit typing is part of that way. Throw out explicit typing (in general), and you've weakened the language, and made the code more fragile and harder to maintain, because it is significantly less readable. – James Kanze Jan 17 '15 at 11:09

1 Answers1

11

There's no bug here except in your code.

The first example works because, when you bind the result of f() to v, you extend the lifetime of that result.

In the second example you don't bind the result of f() to anything, so its lifetime is not extended. Binding to a subobject of it would count:

[C++11: 12.2/5]: The second context is when a reference is bound to a temporary. The temporary to which the reference is bound or the temporary that is the complete object of a subobject to which the reference is bound persists for the lifetime of the reference except: [..]

…but you're not doing that: you're binding to the result of calling a member function (e.g. operator[]) on the object, and that result is not a data member of the vector!

(Notably, if you had an std::array rather than an std::vector, then the code would be absolutely fine as array data is stored locally, so elements are subobjects.)

So, you have a dangling reference to a logical element of the original result of f() which has long gone out of scope.

Sorry for the horrid initializers but, well, blame C++.

Lightness Races in Orbit
  • 378,754
  • 76
  • 643
  • 1,055
  • Thanks. This clears things up. Still, forgetting about the standard, I wonder if the compiler couldn't be able to move the result of the member function call or something, or would there be some nasty edge cases? – mwnx Jan 10 '15 at 16:39
  • @RagnarL: Your compiler would have to somehow magically discover which object was responsible for obtaining some given resources, and then keep that object alive so the resources aren't freed. I don't see how it can guarantee to achieve that in the general case and, even if it did, it would become very confusing as to when certain objects actually get destroyed. This way it's simple. – Lightness Races in Orbit Jan 10 '15 at 16:41
  • 1
    *[quick and dirty brace elision](http://coliru.stacked-crooked.com/a/7005990e1b0822a8)* - blame c++ anyways :) More exact: [with some grouping, some elided](http://coliru.stacked-crooked.com/a/a7c38c25a769c7e0) and **[-Wall -pedantic proof](http://coliru.stacked-crooked.com/a/c400250f7719e951)** Pick your poison. But, at least it's not all _horrid_ – sehe Jan 10 '15 at 16:45