
Can you guess the output of this trivial program?

#include <vector>
#include <string>
#include <exception>
#include <iostream>

int main()
{
    try { 
        struct X {
            explicit X(int) {}
            X(std::string) {} // Just to confuse you more...
        };
        std::vector<X>{"a", "b"};
    } catch (std::exception& x) {
        std::cerr << x.what();
    }
}

Well, I couldn't, and it cost me a day of "research" before I landed here, having finally distilled it from some complex real-life code (type aliases everywhere, anonymous unions with non-POD members, hand-orchestrated ctors/dtors etc., just for the vibe).

And... I still can't see what's going on! Can someone please give a gentle hint? (Hopefully just a blind spot. I no longer do C++ professionally.)

Note: clean compile* with both (latest) MSVC /W4 and GCC -Wall; same output in both (semantically).

* Even without the "confuse-the-reader" line. Which I think I'm gonna have nightmares from.


(Please bear with me for trying not to spoiler it by spelling everything out even more — after all, this truly is self-explanatory as-is, right? Except, the exact opposite for me...)

user3840170
Sz.
  • From the note at the bottom, I suspect Sz. intends to answer their own question shortly? – Mooing Duck Aug 30 '23 at 21:51
  • @TedLyngmo, thanks, but I'm still puzzled why this compiles at all, especially without the string ctor. That's basically what confuses me, I think. Which implicit conversion is plotting to kill me here? (Or what else, if not that?) – Sz. Aug 30 '23 at 21:51
  • 4
    Change it to `vector<X>{"a", "b", "c"};` to understand the problem. Change it to `vector<X>{X{"a"}, X{"b"}};` to fix the problem. – Eljay Aug 30 '23 at 21:51
  • @MooingDuck -- that was my original intention, indeed! :) But then I realized I still didn't understand it properly. :) – Sz. Aug 30 '23 at 21:52
  • 1
    http://coliru.stacked-crooked.com/a/830465cc50cf110a I'm confused. What problem? This appears to work exactly how I'd expect it to. It constructs a vector with two Xs, one from `"a"` and one from `"b"`. I can't figure out what else you would be expecting to happen? – Mooing Duck Aug 30 '23 at 21:53
  • @Eljay, I have a whole suite of variations I tried, including those you mentioned, indeed. But that blind spot I mentioned still seems to prevent me from seeing the solution. – Sz. Aug 30 '23 at 21:53
  • @MooingDuck the problem is, obviously then, my wrong expectation. I just can't see how those string literals make any sense, in that vector's init list, _especially with X's string ctor removed_. As I said: it must be a blind spot, and I'll probably facepalm myself afterwards... – Sz. Aug 30 '23 at 21:56
  • `X`s string constructor isn't removed. Why would you think it's removed? – Mooing Duck Aug 30 '23 at 21:57
  • @MooingDuck: _"Even without the "confuse-the-reader" line. Which I think I'm gonna have nightmares from."_ Try it without the string ctor, for the extra fun, as I hinted! – Sz. Aug 30 '23 at 21:58
  • 4
    Debuggers are really cool. Here you could have stepped in and seen the program had landed in the wrong `vector` constructor. – user4581301 Aug 30 '23 at 21:58
  • 2
    http://coliru.stacked-crooked.com/a/6c5042098aaab902 aha, Fascinating. I was also totally wrong about what was going on. – Mooing Duck Aug 30 '23 at 22:00
  • @user4581301, yes, I suspected that it's the wrong ctor, but CPPReference didn't help me out, and I still don't get why. (Besides, my VStudio install is broken, can't start the debugger at the moment.) And I've been in total disbelief all along, and thought I'd understand it in a minute anyway. :) – Sz. Aug 30 '23 at 22:00
  • 3
    [You'll love this one...](https://godbolt.org/z/f5nYfz8q6) – user4581301 Aug 30 '23 at 22:05
  • @Jarod42, ahh, thanks, added! – Sz. Aug 30 '23 at 22:09
  • @user4581301, ahh, a classic, indeed! Believe it or not, I did lose another day for that one too, earlier this year! :) – Sz. Aug 30 '23 at 22:11
  • 1
    @user4581301: Meh, that one's easy. Can't bind a non-const lvalue reference to a temporary. – Ben Voigt Aug 30 '23 at 22:13
  • 2
    @BenVoigt: works also with `std::string` by value [Demo](https://godbolt.org/z/q1d1x6nY4) though ;-) – Jarod42 Aug 30 '23 at 22:21
  • Fascinating to see that there's already a Close vote on this... – Sz. Aug 30 '23 at 22:30
  • 1
    There have been many corner cases to think of when evolving the language when it comes to initialization / initializer lists etc. I don't think the question deserved a downvote and am a bit surprised it only got one up. – Ted Lyngmo Aug 30 '23 at 22:31
  • Whups. That reference wasn't supposed to be there. that was a typo on my part. – user4581301 Aug 30 '23 at 22:33
  • 11
    One of many trip-wires introduced by "uniform initialization" . IMHO it was a mistake to re-use curly braces for this, and only added to the problems ; especially when it comes to aggregate initialization which is semantically different to non-aggregate, but now cannot be distinguished just from the syntax. – M.M Aug 30 '23 at 22:48
  • Similar (not duplicate) - https://stackoverflow.com/questions/46665914/vector-initialization-with-double-curly-braces-stdstring-vs-int – M.M Aug 30 '23 at 22:58
  • @M.M (re braces etc.) Indeed. And this is a wonderful set of landmines even beyond that (the new range ctors masquerading as aggreg. initializers): the implicit conversions to horrible effect, the red herring from the string ctor, or the lack of even a single warning whatsoever, and the `explicit` keyword mumbling "not my job"... – Sz. Aug 30 '23 at 23:03
  • @M.M The `stl` tag was there for a reason actually, because this isn't just a simple `vector` issue: it comes from the interop. of iterators, ranges and containers in general. I just happened to use a `vector`. The extra tags are welcome, thx. – Sz. Aug 30 '23 at 23:15
  • 2
    Note that the `stl` tag refers to the Standard Template Library, a very influential library that formed the ideological basis (and no small amount of the initial implementations) of the C++ Standard Library back in 1998. Odds are that you are not using the STL today, so removing the tag is justified. `std` would be more appropriate, but probably still too broad since this is about std library containers. Not sure if there's a tag for that, though. – user4581301 Aug 30 '23 at 23:45
  • 2
    Note that `vector<X> v = {"a", "b"};` works correctly. It only breaks without `=`. – HolyBlackCat Aug 31 '23 at 11:11
  • @M.M: Even more similar (one of the examples has the same basic problem): [Initializing vector with double curly braces](https://stackoverflow.com/q/46664728/364696) – ShadowRanger Aug 31 '23 at 13:54
  • @user3840170, is there a way to revert your changes (I can see no `rollback` for your edit)? Because adding `std::` for such a short, simple example, where clarity and brevity is _paramount_, is mere OCD, and lowers the readability of the question for no benefits. For the title, I'm open to some changes (albeit I'm planning to add a summary answer, where Google will be able to easily find everything), but that must be a separate change. – Sz. Aug 31 '23 at 15:09
  • 1
    @Sz.: It's more readable when the line of code you care about tells you it's `std::vector` causing the issue. The changes improved the clarity. Previously there was "action at a distance" from `using namespace std;` (FWIW, removing the `using namespace` also made the code shorter) – Ben Voigt Aug 31 '23 at 15:13
  • @user4581301 I did not realise that the STL was an independent library before being incorporated into the standard. Any chance you know of a resource to take a look at the last major revision before incorporation? The term being used interchangeably with the standard library collections has made it very hard to google. – Chuu Aug 31 '23 at 15:46
  • 1
    @Chuu: Try "SGI Template Library". This may be the latest revision: http://www.stlport.org/doc/sgi_stl.html – Ben Voigt Aug 31 '23 at 16:02
  • @HolyBlackCat `vector<X> v = {"a", "b"};` would still call the same constructor. Try `vector<X> v = {"a", "b", "c"};` and it won't compile – Ted Lyngmo Aug 31 '23 at 16:30
  • @Sz. Here's a funny version: `std::vector<X> v{"ab", "b"};` and compile with optimization. It still has undefined behavior, but will most likely create one `X` (using the `int` constructor) from `'a'`. The reason is that the compiler most likely will just store `"ab\0"` in memory so the string literal `"b"` is just a pointer to `"ab" + 1`. – Ted Lyngmo Aug 31 '23 at 16:32

1 Answer

std::vector<X>{"a", "b"};

This creates a vector from two iterators of type const char* using the constructor that takes two iterators:

template< class InputIt >
constexpr vector( InputIt first, InputIt last,
                  const Allocator& alloc = Allocator() );

Constructs the container with the contents of the range [first, last). This overload participates in overload resolution only if InputIt satisfies LegacyInputIterator, to avoid ambiguity with overload (3), shown below:

constexpr vector( size_type count,
                  const T& value,
                  const Allocator& alloc = Allocator() );

It's just bad luck that the two const char[]s decay into const char* pointers, which are perfectly good iterators that fulfill the LegacyInputIterator requirement.

The two iterators do not point into the same array, so [first, last) is not a valid range and the program has undefined behavior.

What most likely happens under the hood is that it tries to walk from the first const char* to the second and runs out of bounds as soon as it passes the null terminator after the 'a'.

A similar construction that would actually work:

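#include <iostream>
#include <vector>

struct X {
    explicit X(int i) {
        std::cout << static_cast<char>(i); // each char of the range arrives here as an int
    }
};

int main() {
    const char* arr = "working";

    const char* first = arr;     // begin iterator
    const char* last = arr + 7;  // end iterator: one past the final 'g'

    std::vector<X>{first, last}; // prints "working"
}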
Ted Lyngmo
  • 1
    But why? Why does it even go ahead and nonchalantly create an `X` vector from those (supposedly unrelated) `const char*` iterators? (With zero warnings, too.) – Sz. Aug 30 '23 at 22:05
  • @Sz.: would you expect a warning from `std::vector<X>{first, last};`? Which one? – Jarod42 Aug 30 '23 at 22:08
  • 3
    @Sz.: From outside `std::vector::vector(begin, end)` the compiler doesn't know the two pointers need to be related. Inside, the compiler doesn't know that they aren't related. – Ben Voigt Aug 30 '23 at 22:12
  • Under the hood, there is probably const data `"a\0b"` so `"b"` happens to be `"a" + 2`. – Jarod42 Aug 30 '23 at 22:12
  • 1
    @Sz. I updated it a bit with an explanation – Ted Lyngmo Aug 30 '23 at 22:14
  • 2
    @Jarod42 there's nothing to prevent the compiler from putting `b` **before** `a`. – Mark Ransom Aug 30 '23 at 22:15
  • Ted, thanks! Could you please add a note about the source of my main confusion: i.e. why the iterator types (which are pretty strictly matched everywhere else in the STL, traditionally) don't seem to help here at all, so I can accept your answer? – Sz. Aug 30 '23 at 22:17
  • 1
    @MarkRansom: links in comment got the surprising result of size of the resulting vector is 2, as OP expects (but not using `std::string` constructor). but we reason about UB anyway here ;-) – Jarod42 Aug 30 '23 at 22:18
  • 2
    @Sz. if those strings are being interpreted as iterators, they're iterators to `char` which probably triggers your `int` constructor. They aren't pointers to `char*` at that point. – Mark Ransom Aug 30 '23 at 22:23
  • 4
    @Sz. It's just unfortunate that the decay of a `const char[]` becomes a perfect iterator that fulfills the _LegacyInputIterator_ requirement. – Ted Lyngmo Aug 30 '23 at 22:23
  • Upvoted for sure, but I'd really like that in the answer, to save some others from this pitfall, possibly, if you don't mind. – Sz. Aug 30 '23 at 22:25
  • 1
    @Sz. Added that to the answer too. – Ted Lyngmo Aug 30 '23 at 22:26
  • 1
    Excellent, thanks! – Sz. Aug 30 '23 at 22:28
  • 2
    Another footnote: I think that the `explicit` for the `int` ctor in the original example is another nicely reinforcing red herring here: it may well be surprising that _even that_ can't stop the conspiracy of those implicit conversions... – Sz. Aug 30 '23 at 22:51
  • 1
    @Sz. Just add `X(char) = delete;` so no one falls into the same pitfall again. – Aykhan Hagverdili Aug 31 '23 at 11:56