6

I was in the process of selecting one of two methods of putting things into an unordered_map:

std::unordered_map<Key, Value> map;
map.emplace(
  std::piecewise_construct,
  std::forward_as_tuple(a),
  std::forward_as_tuple(b, c, d));

vs

std::unordered_map<Key, DifferentValue> map;
auto& value = map[a];
if (value.isDefaultInitialized())
  value = DifferentValue(b, c, d);

I ran some experiments to see which one would perform better and found that, when inserting unique elements, the behaviour (as in efficiency) was basically equivalent.

However, when inserting duplicate items, and given that the construction of Value or DifferentValue is not trivial, I was surprised to find that emplace constructs the object regardless of whether it will insert it or not.

So, the second method seems to win by far in that case since the default constructor just has isDefaultInitialized_(true) in there and not much more.

For emplace, the code seems to be:

... _M_emplace(std::true_type, _Args&&... __args) {
  __node_type* __node = _M_allocate_node(std::forward<_Args>(__args)...);
  const key_type& __k = this->_M_extract()(__node->_M_v);
  ...
  if (__node_type* __p = _M_find_node(__bkt, __k, __code)) {
     _M_deallocate_node(__node);
     return std::make_pair(iterator(__p), false);
  }
  return std::make_pair(_M_insert_unique_node(__bkt, __code, __node), true);
}

So, although I'm going to go with the second method (even if it requires move assignment, move constructors and extra fields), I was wondering: is there a good rationale for why emplace creates an object that it later discards? That is, shouldn't it first check whether it needs to create the object, and return early if an element with that key already exists?

(note that for my particular case default initialized items are not considered valid, so the question is really just about emplace)

For the record, I found something under 23.2.4 table 102:

Effects: Inserts a value_type object t constructed with std::forward<Args>(args)...
if and only if there is no element in the container with key equivalent to the
key of t.

which I think would allow for not creating the object.

vmpstr
  • 2
    At least the _key_ has to be created in order to use the _hash_ and _comparison_ functions. This problem has been addressed in C++14 for `std::map`: there, it is possible to look up a _key_ without constructing a corresponding object. See http://en.cppreference.com/w/cpp/container/map/find. Unfortunately, that's not possible with `std::unordered_map`. – nosid May 16 '14 at 21:25
  • Oh interesting, I completely glossed over that. So in order to get the key it has to construct the object. It's not so much a choice of order of operations; it has to create the object in order to even find out whether an element with that key already exists? – vmpstr May 16 '14 at 21:28
  • Yes, it has to create the _key_ in order to find out whether the object already exists. – nosid May 16 '14 at 21:34

2 Answers

4

In my opinion, the quoted part of the standard is misleading, because it suggests that the object is only constructed if there is no matching element in the container. I guess what they are trying to state is:

Effects: Constructs a value_type object t with std::forward<Args>(args).... Inserts the constructed object t if and only if there is no such element in the container with key equivalent to the key of t.

The reason is that the implementation of emplace has to construct t in order to find out whether an element with an equivalent key exists, because it has to invoke the hash function and the equality predicate. In general, these can only be invoked with a fully constructed key (taken from an object of type value_type), not with the tuples used to construct such objects.

In theory, it would be possible to specify an emplace function that doesn't construct t if there is already an element with an equivalent key. Interestingly, something similar will be added with C++14 for std::map::find. See the following documentation: http://en.cppreference.com/w/cpp/container/map/find

There are two overloads that can be used with arbitrary types, as long as the comparison function fulfills an additional requirement (it has to be transparent, i.e. declare a nested type is_transparent). Interestingly enough, there is no such overload for std::unordered_map.

nosid
  • 1
    The key is hashed, not the value. So how is it necessary to construct a value type object for computing the hash? – haelix Jun 26 '14 at 07:35
  • @haelix: The _value_ consists of the _key_ **and** the _mapped value_. `std::map::emplace` is a variadic member function template, and there is no direct mapping between parameters and the _key_. So, there is no simple way for the implementation to access the _key_ without constructing the _value_. – nosid Jun 26 '14 at 10:47
  • 2
    I thought that `std::piecewise_construct` is precisely for telling which parameters are for the key. Anyway, I am disappointed by this. It seems that the unordered_map has no knowledge that it is in fact a key-to-value mapping - so it is unable to manipulate keys only? – haelix Jun 26 '14 at 14:11
  • I also find it quite annoying that emplace behaves in this way. For me it's a problem because the second insertion triggers a call to the destructor, which has side effects. I think this could have been easily solved by using piecewise_construct, as you say: simply create an overload that takes piecewise_construct (like pair's constructor), use the first tuple to create just a key, and then do whatever hashing and comparisons are needed. The only downside of this approach is that you'll be calling both the key's constructor and its move constructor if the insertion is successful. – dcmm88 Jul 22 '15 at 23:55
  • @dcmm88: A while ago I actually implemented such an `emplace` function (as well as a corresponding `put` function aka `insert_or_update`) as a proof of concept (for myself). See http://pastebin.com/8nKZLMaC – nosid Jul 23 '15 at 22:32
1

Yes, the first thing that std::unordered_map::emplace() does (at least in the implementation quoted in the question) is to create the to-be-emplaced KEY-VALUE pair in memory, before searching for an existing element with the just-constructed KEY. If such an element is found, emplace() immediately destroys the newly created element again. This is usually NOT what people expect when they use emplace(), as it is meant to avoid unnecessary object creation!

The reasoning behind the (IMHO) broken design of std::(unordered_)map::emplace() was probably that an implementation which creates the KEY first and then checks for the KEY's existence needs to be able to MOVE or COPY that KEY to its final destination in the KEY-VALUE pair if the KEY is not found. As emplace() was added to the STL containers specifically to cater for non-copyable, non-movable objects, an implementation of emplace that depended on a movable or copyable KEY would have been incomplete.

However, 99% of all reasonable KEYs are either copy-constructible or move-constructible or both, so they should be treated separately from the VALUEs, whose construction might be much more complicated. And with C++17 aka C++1z, the Gods of the language meant well by us and added the try_emplace() method: its arguments are a reference to an already constructed KEY and the parameters required to construct only the corresponding VALUE in place. try_emplace() searches for the KEY first. Only if the KEY is new is a new KEY-VALUE pair constructed, by copying or moving the KEY and constructing the VALUE in place. Hurray!

Kai Petzke