Let's assume I have the most straightforward std::set of all:

std::set<std::string> my_set;

Now, I have a function which accepts a const char* and needs to tell me whether this string exists in the set, implemented in the most straightforward way:

bool exists(const char* str) {
    return my_set.count(str) > 0;
}

Now, this is an obvious efficiency loss. A (potential) dynamic memory allocation followed by a deallocation happens right here for no reason, because a temporary std::string is constructed from str just to perform the lookup.

How can we eliminate that? I'd like to compare the std::string key type directly with a char*. One way would be to use unique_ptr<char> as the key type instead, with a custom comparator, but that'd be super awkward.

The problem can actually be generalized to a wider case: effectively, "how to invoke a comparison with the type provided, without conversion to the key type?"

P.S. I have seen "std::string as map key and efficiency of map.find()", but I am not satisfied with the answer, which effectively reiterates that this optimization is not needed, which is clearly wrong.

SergeyA
  • Where's the allocation? `set::count` is templated on the argument type, and `string` has comparisons with `char*`. – Sneftel Jun 27 '19 at 18:45
  • @Sneftel nowhere. I am embarrassed. – SergeyA Jun 27 '19 at 18:46
  • @SergeyA Not embarrassing at all. This is a legitimate concern and the answer is not obvious. The only problem is that it's based on assumptions that recent changes to the language invalidate, but that happens all the time and just means that the question will probably be helpful to other readers. At the very least, I also learned something. A fine question in my opinion. – François Andrieux Jun 27 '19 at 19:06
  • @FrançoisAndrieux the reason why I am embarrassed is because I didn't do my homework before asking the question. But I am glad the question was helpful. – SergeyA Jun 27 '19 at 19:18

1 Answer

You are correct that by default count is going to convert str to a std::string, potentially causing a dynamic memory allocation and at least doing an unnecessary copy. Luckily, C++14 added an overload of count in the form of

template< class K > 
size_type count( const K& x ) const;

That will take any type. To get this overload, though, the comparator must define a member type named is_transparent (which type it is doesn't matter, only that it exists). Instead of writing one ourselves, we can use the std::less<void> specialization that was also introduced in C++14. It acts as a transparent comparator by providing a templated operator(). That means you just need to change

std::set<std::string> my_set;

to

std::set<std::string, std::less<>> my_set;
// or
std::set<std::string, std::less<void>> my_set;

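and then the set will compare its keys against the argument using std::string's comparison operators for const char* (for example bool operator<(const std::string&, const char*)), so no temporary is created and no copying needs to happen.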
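To make the two routes above concrete, here is a minimal sketch, assuming a C++14 or later compiler. The first half uses std::less<> as described. The second half shows what a hand-written transparent comparator could look like. The names StringLess, my_other_set and exists_custom are made up for illustration and are not from the question.

```cpp
#include <functional>
#include <set>
#include <string>

// Option 1: std::less<> (i.e. std::less<void>) is transparent, so the
// C++14 template overloads of count/find accept a const char* directly.
std::set<std::string, std::less<>> my_set{"apple", "banana"};

bool exists(const char* str) {
    // No std::string temporary is built here; the keys are compared with
    // str through std::string's comparison operators for const char*.
    return my_set.count(str) > 0;
}

// Option 2 (hypothetical name): a hand-written transparent comparator.
// The is_transparent member type is what enables the template overloads.
struct StringLess {
    using is_transparent = void;

    bool operator()(const std::string& a, const std::string& b) const {
        return a < b;
    }
    bool operator()(const std::string& a, const char* b) const {
        return a.compare(b) < 0;   // a < b
    }
    bool operator()(const char* a, const std::string& b) const {
        return b.compare(a) > 0;   // b > a, i.e. a < b
    }
};

std::set<std::string, StringLess> my_other_set{"apple", "banana"};

bool exists_custom(const char* str) {
    return my_other_set.find(str) != my_other_set.end();
}
```

With either set, a call such as exists("apple") looks the key up without first converting the argument to a std::string. Writing the comparator by hand is mostly useful when the lookup type has no built-in comparison with the key type; for std::string and const char*, std::less<> is enough.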

NathanOliver
  • It's bizarre to me that they didn't change the default comparator to `less<>`. How would that even be a compatibility thing? – Sneftel Jun 27 '19 at 19:13
  • I'm not sure if you can specialize `std::less` for your own types, but if you can, changing the default comparator to `std::less<>` might impact the behavior when those are provided. – François Andrieux Jun 27 '19 at 19:28
  • @Sneftel See: https://stackoverflow.com/questions/54135237/why-c-associative-containers-predicate-not-transparent-by-default/54135395#54135395 – NathanOliver Jun 27 '19 at 19:49