116

I've just lost three days of my life tracking down a very strange bug where unordered_map::insert() destroys the variable you insert. This highly non-obvious behaviour occurs in very recent compilers only: I found that clang 3.2-3.4 and GCC 4.8 are the only compilers to demonstrate this "feature".

Here's some reduced code from my main code base which demonstrates the issue:

#include <memory>
#include <unordered_map>
#include <iostream>

int main(void)
{
  std::unordered_map<int, std::shared_ptr<int>> map;
  auto a(std::make_pair(5, std::make_shared<int>(5)));
  std::cout << "a.second is " << a.second.get() << std::endl;
  map.insert(a); // Note we are NOT doing insert(std::move(a))
  std::cout << "a.second is now " << a.second.get() << std::endl;
  return 0;
}

I, like probably most C++ programmers, would expect output to look something like this:

a.second is 0x8c14048
a.second is now 0x8c14048

But with clang 3.2-3.4 and GCC 4.8 I get this instead:

a.second is 0xe03088
a.second is now 0

Which might make no sense, until you examine closely the docs for unordered_map::insert() at http://www.cplusplus.com/reference/unordered_map/unordered_map/insert/ where overload no 2 is:

template <class P> pair<iterator,bool> insert ( P&& val );

Which is a greedy universal reference move overload, consuming anything not matching any of the other overloads, and move constructing it into a value_type. So why did our code above choose this overload, and not the unordered_map::value_type overload as probably most would expect?

The answer stares you in the face: unordered_map::value_type is a pair<const int, std::shared_ptr> and the compiler would correctly think that a pair<int, std::shared_ptr> isn't convertible. Therefore the compiler chooses the move universal reference overload, and that destroys the original, despite the programmer not using std::move() which is the typical convention for indicating you are okay with a variable getting destroyed. Therefore the insert destroying behaviour is in fact correct as per the C++11 standard, and older compilers were incorrect.

You can probably see now why I took three days to diagnose this bug. It was not at all obvious in a large code base where the type being inserted into unordered_map was a typedef defined far away in source code terms, and it never occurred to anyone to check if the typedef was identical to value_type.

So my questions to Stack Overflow:

  1. Why do older compilers not destroy variables inserted like newer compilers? I mean, even GCC 4.7 doesn't do this, and it's pretty standards conforming.

  2. Is this problem widely known, because surely upgrading compilers will cause code which used to work to suddenly stop working?

  3. Did the C++ standards committee intend this behaviour?

  4. How would you suggest that unordered_map::insert() be modified to give better behaviour? I ask this because if there is support here, I intend to submit this behaviour as a N note to WG21 and ask them to implement a better behaviour.

Nicol Bolas
  • 449,505
  • 63
  • 781
  • 982
Niall Douglas
  • 9,212
  • 2
  • 44
  • 54
  • 10
    Just because it uses a universal ref doesn't mean the inserted value is always moved - it _should_ only ever do so for rvalues, which plain `a` is not. It should make a copy. Also, this behaviour totally depends on the stdlib, not the compiler. – Xeo Feb 03 '14 at 05:36
  • Cannot reproduce with VS2012. – Borgleader Feb 03 '14 at 05:36
  • 10
    That seems like a bug in the implementation of the library – David Rodríguez - dribeas Feb 03 '14 at 05:41
  • 5
    "Therefore the insert destroying behaviour is in fact correct as per the C++11 standard, and older compilers were incorrect." Sorry, but you're wrong. What part of the C++ Standard did you get that idea from? BTW cplusplus.com is not official. – Ben Voigt Feb 03 '14 at 06:06
  • 1
    I cannot reproduce this on my system, and I'm using gcc 4.8.2 and `4.9.0 20131223 (experimental)` respectively. Output is `a.second is now 0x2074088 ` (or similar) for me. –  Feb 03 '14 at 06:34
  • 48
    [This was GCC bug 57619](http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57619), a regression in the 4.8 series that was fixed for 4.8.2 in 2013-06. – Casey Feb 03 '14 at 08:07
  • 1
    For future: when checking how Clang works, *please don't use libstdc++* - or at least don't use libstdc++ *exclusively*. Also avoid saying "clang 3.2-3.4" without mentioning that those clangs used libstdc++. It is extremely important when talking about things that seem to be bugs. – Griwes Feb 03 '14 at 16:01
  • "Moved from" would be better terminology than "destroyed" throughout this question. Neither `a` nor `a.second` is in question of having a destructor call, having its lifetime ended, etc. – aschepler Feb 04 '20 at 02:11

2 Answers2

85

As others have pointed out in the comments, the "universal" constructor is not, in fact, supposed to always move from its argument. It's supposed to move if the argument is really an rvalue, and copy if it's an lvalue.

The behaviour, you observe, which always moves, is a bug in libstdc++, which is now fixed according to a comment on the question. For those curious, I took a look at the g++-4.8 headers.

bits/stl_map.h, lines 598-603

  template<typename _Pair, typename = typename
           std::enable_if<std::is_constructible<value_type,
                                                _Pair&&>::value>::type>
    std::pair<iterator, bool>
    insert(_Pair&& __x)
    { return _M_t._M_insert_unique(std::forward<_Pair>(__x)); }

bits/unordered_map.h, lines 365-370

  template<typename _Pair, typename = typename
           std::enable_if<std::is_constructible<value_type,
                                                _Pair&&>::value>::type>
    std::pair<iterator, bool>
    insert(_Pair&& __x)
    { return _M_h.insert(std::move(__x)); }

The latter is incorrectly using std::move where it should be using std::forward.

Brian Bi
  • 111,498
  • 10
  • 176
  • 312
  • 12
    Clang uses libstdc++ by default, GCC's stdlib. – Xeo Feb 03 '14 at 06:00
  • I'm using gcc 4.9 and I'm looking in `libstdc++-v3/include/bits/`. I don't see the same thing. I see `{ return _M_h.insert(std::forward<_Pair>(__x)); }`. It could be different for 4.8, but I haven't checked yet. –  Feb 03 '14 at 06:29
  • Yeah, so I guess they fixed the bug. – Brian Bi Feb 03 '14 at 06:31
  • @Brian Nope, I just checked my system headers. Same thing. 4.8.2 btw. –  Feb 03 '14 at 06:31
  • Mine's 4.8.1, so I guess it was fixed between the two. – Brian Bi Feb 03 '14 at 06:40
  • @Brian Well, ideone uses 4.8.1 as well and doesn't reproduce the bug. coliru does however. Rextester doesn't reproduce the bug either (not sure of the compiler version.) Edit: it uses 4.7.3, making it a good candidate for an "older" compiler. –  Feb 03 '14 at 06:42
  • @BrianBi Is a const ref overload for std::unordered_map::insert() available in the same file? – dhruvbird Feb 03 '14 at 07:13
  • If the C++ standard committee considers *this* code readable, they should revisit their sub-BSc course... – peterh Feb 25 '18 at 22:08
20
template <class P> pair<iterator,bool> insert ( P&& val );

Which is a greedy universal reference move overload, consuming anything not matching any of the other overloads, and move constructing it into a value_type.

That is what some people call universal reference, but really is reference collapsing. In your case, where the argument is an lvalue of type pair<int,shared_ptr<int>> it will not result in the argument being an rvalue reference and it should not move from it.

So why did our code above choose this overload, and not the unordered_map::value_type overload as probably most would expect?

Because you, as many other people before, misinterpreted the value_type in the container. The value_type of *map (whether ordered or a unordered) is pair<const K, T>, which in your case is pair<const int, shared_ptr<int>>. The type not matching eliminates the overload that you might be expecting:

iterator       insert(const_iterator hint, const value_type& obj);
David Rodríguez - dribeas
  • 204,818
  • 23
  • 294
  • 489
  • I still don't get why this new lemma exists, "universal reference", doesn't really mean anything specific, nor it does carry a good name for a thing that is totally not "universal" in practice. It's much better to remember some old collapsing rules that are part of the C++ meta-language behaviour as it was since templates were introduced in the language, plus a new signature from the C++11 standard . What's the benefit of talking about this universal references anyway ? There are already names and things that are weird enough, like `std::move` that doesn't move anything at all. – user2485710 Feb 03 '14 at 05:49
  • 2
    @user2485710 Sometimes, ignorance is bliss, and having the umbrella term "universal reference" on all the reference collapsing and template type deduction tweak that's IMHO pretty unintuitive, plus the requirement to use `std::forward` to make that tweak do the real work... Scott Meyers has done a good job setting pretty straightforward rules for forwarding (the use of universal references). – Mark Garcia Feb 03 '14 at 06:05
  • 1
    In my thinking, "universal reference" is a pattern for declaring function template parameters that can bind to both lvalues and rvalues. "Reference collapsing" is what happens when substituting (deduced or specified) template parameters into definitions, in "universal reference" situations and other contexts. – aschepler Feb 03 '14 at 15:52
  • 2
    @aschepler: *universal reference* is just a fancy name for a subset of reference collapsing. I agree with the *ignorance is bliss*, and also the fact that having a fancy name makes talking about it easier and trendier and that might help propagate the behavior. That being said I am not a huge fan of the name, as it pushes the actual rules to a corner where they don't need to be *known*... that is, until you hit a case outside of what Scott Meyers described. – David Rodríguez - dribeas Feb 03 '14 at 17:01
  • Pointless semantics now, but the distinction I was trying to suggest: Universal reference happens when I am designing and declare a function parameter as deducible-template-parameter `&&`; reference collapsing happens when a compiler instantiates a template. Reference collapsing is the reason universal references work, but my brain doesn't like putting the two terms in the same domain. – aschepler Feb 03 '14 at 21:48
  • @aschepler: I can see that line of reasoning, although you probably don't have any issues with `const`-collapsing, right? `template void f(const T&);` called as `f(1)`, and you would not call that *universal-const* either... But I do understand that this one seems different due to the novelty of `&&`, and that until we are more used, splitting in smaller nibbles help with *chewing* the concepts. – David Rodríguez - dribeas Feb 03 '14 at 21:59