
With the template functions from <algorithm> you can do things like this:

#include <algorithm>
#include <iostream>
#include <iterator>

struct foo
{
    int bar, baz;
};

struct bar_less
{
    // compare foo with foo
    bool operator()(const foo& lh, const foo& rh) const
    {
        return lh.bar < rh.bar;
    }
    template<typename T>  // compare some T with foo
    bool operator()(T lh, const foo& rh) const
    {
        return lh < rh.bar;
    }
    template<typename T>  // compare foo with some T
    bool operator()(const foo& lh, T rh) const
    {
        return lh.bar < rh;
    }
};

int main()
{
    foo foos[] = { {1, 2}, {2, 3}, {4, 5} };
    bar_less cmp;
    int bar_value = 2;
    // find element {2, 3} using an int
    auto it = std::lower_bound(std::begin(foos), std::end(foos), bar_value, cmp);
    std::cout << it->baz;
}

In std::set methods like find you have to pass an object of type set::key_type which often forces you to create a dummy object.

std::set<foo> foos;
foo search_dummy = {2, 3};  // shouldn't need a full foo object
auto it = foos.find(search_dummy);

It would be so helpful if one could just call foos.find(2). Is there any reason why find can't be a template that accepts everything that can be passed to the less predicate? And if it is just missing, why isn't it in C++11 (I think it isn't)?

Edit

The main question is WHY it isn't possible and, if it were possible, WHY the standard decided not to provide it. As a second question, you can propose workarounds :-) (boost::multi_index_container crosses my mind just now, which provides key extraction from value types)

Another example, with a value type that is more expensive to construct. The key, name, is part of the type and should not be duplicated as a copy in a map's key:

struct Person
{
    std::string name;
    std::string address;
    std::string phone, email, fax, stackoverflowNickname;
    int age;
    std::vector<Person*> friends;
    std::vector<Relation> relations;
};

struct PersonOrder
{
    // assume that the full name is a unique identifier
    bool operator()(const Person& lh, const Person& rh) const
    {
        return lh.name < rh.name;
    }
};

class PersonRepository
{
public:

    const Person& FindPerson(const std::string& name) const
    {
        Person searchDummy;  // ouch
        searchDummy.name = name;
        return FindPerson(searchDummy);
    }

    const Person& FindPerson(const Person& person) const;

private:
    std::set<Person, PersonOrder> persons_;
    // what I want to avoid
    // std::map<std::string, Person> persons_;
    // Person searchDummyForReuseButNotThreadSafe;

};
hansmaad
  • Would it make sense to find an apple amongst a collection of pears? – juanchopanza Jun 16 '12 at 20:48
  • No, but it would make sense to find an apple in a collection of apples just by describing its color, without describing the whole thing. – hansmaad Jun 16 '12 at 21:00
  • I guess one could argue that if such fine-grained lookup were needed then one could use `std::find` with a suitable predicate. – juanchopanza Jun 16 '12 at 22:31
  • If you needed to search by something other than `foo`, why are you using a `set` and not a `map`? That's the whole point of them being different types: you search based on the key, not the value. You only use `set` if the search key *is* the value (or if you just want a sorted collection of items). – Nicol Bolas Jun 17 '12 at 01:46
  • @NicolBolas Because it is part of `foo`. If I store `Person`s uniquely indexed by their full name and `Name` is a property of `Person`, I'd like to store them in a `set` (or to define `map::value_type` as `Person` and something like `map::key_extractor` as `Person::Name`) – hansmaad Jun 17 '12 at 05:49
  • @hansmaad: As I pointed out, this is backwards thinking. You pick the container based on how you want to *search it*, not based on who owns the key. If you want to search based on the value, then you use a `set`. If you want to search it based on some arbitrary key, whether the value happens to know about it or not, then you use a `map`. – Nicol Bolas Jun 17 '12 at 06:20
  • @NicolBolas So what would be your answer? It would be possible, but it appears to be a C++ design decision: use map instead. (I would accept this as an answer, even if I don't like it :)) As a solution for my problem I remembered `multi_index_container` from boost, which is available in all my projects. – hansmaad Jun 17 '12 at 06:50
  • @hansmaad: yes, `multi_index_container` really is awesome :) – Matthieu M. Jun 17 '12 at 09:50

5 Answers


std::find_if works on an unsorted range. So you can pass any predicate you want.

std::set<T> always uses the Comparator template argument (std::less<T> by default) to maintain the order of the collection, as well as find elements again.

So if std::set::find were templated, it would have to require that you only pass a predicate that observes the comparator's total ordering.

Then again, std::lower_bound and all the other algorithms that work on sorted ranges already require exactly that, so that would not be a new or surprising requirement.

So, I guess it's just an oversight that there's no find_if() (say) on std::set. Propose it for C++17 :) (EDIT: EASTL already has this, and they used a far better name than I did: find_as).
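For readers on newer compilers, it may help to know that C++14 added this kind of lookup: when the comparator declares a nested `is_transparent` type, `std::set::find` (and `lower_bound`, `count`, and so on) become templates that accept any type the comparator can handle. The following is a minimal sketch of the question's `foo` example under that assumption; `baz_for` is a hypothetical helper name introduced only for illustration.

```cpp
#include <set>

struct foo
{
    int bar, baz;
};

// Orders foo by bar and can also compare a foo with a bare int.
struct bar_less
{
    // Opts in to the templated lookup overloads (requires C++14 or later).
    using is_transparent = void;

    bool operator()(const foo& lh, const foo& rh) const { return lh.bar < rh.bar; }
    bool operator()(int lh, const foo& rh) const { return lh < rh.bar; }
    bool operator()(const foo& lh, int rh) const { return lh.bar < rh; }
};

// Hypothetical helper: returns baz of the element whose bar equals bar_value,
// or fallback if there is no such element. No dummy foo is constructed.
int baz_for(const std::set<foo, bar_less>& foos, int bar_value, int fallback)
{
    auto it = foos.find(bar_value);  // heterogeneous lookup, logarithmic
    return it != foos.end() ? it->baz : fallback;
}
```

The same caveat discussed in this answer applies: the mixed comparisons must be consistent with the ordering used to build the set, otherwise the search result is meaningless.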

That said, you know that you shouldn't use std::set, don't you? A sorted vector will be faster in most cases and allows you the flexibility you find lacking in std::set.

EDIT: As Nicol pointed out, there're implementations of this concept in Boost and Loki (as well as elsewhere, I'm sure), but seeing as you can't use their main advantage (the built-in find() method), you would not lose much by using a naked std::vector.
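As a rough sketch of the sorted-vector approach, assuming the vector is kept sorted by name and using a cut-down `Person` (the `find_by_name` helper is a hypothetical name, not part of any library), a lookup by key alone could look like this:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Simplified stand-in for the Person type from the question.
struct Person
{
    std::string name;
    int age;
};

// Hypothetical helper: binary search in a vector that is already sorted by
// name. Returns nullptr if nobody has that name.
const Person* find_by_name(const std::vector<Person>& sorted, const std::string& name)
{
    // The comparator takes (element, key), so no dummy Person is needed.
    auto it = std::lower_bound(sorted.begin(), sorted.end(), name,
        [](const Person& p, const std::string& n) { return p.name < n; });

    // lower_bound only finds the first element not less than the key,
    // so equality still has to be checked.
    if (it != sorted.end() && it->name == name)
        return &*it;
    return nullptr;
}
```

The caller is responsible for keeping the vector sorted by the same criterion, for example by inserting at the position `std::lower_bound` returns.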

Marc Mutz - mmutz
  • If you're going to suggest using a sorted `vector`, the least you could do is point the user in the direction of [Boost.Container's flat sets/maps](http://www.boost.org/doc/libs/1_49_0/doc/html/container/non_standard_containers.html#container.non_standard_containers.flat_xxx), so they don't have to write the code themselves. – Nicol Bolas Jun 17 '12 at 01:48
  • Also, `find_if` would make no sense in a `set`. It would be linear in execution time because the predicate would not be guaranteed to be based on the sorting algorithm. The log(n) time of `find` only works because the search is based on the sort order. It's not an oversight; if you want to do a linear-time search for something in a `set`, you could just use `std::find_if` on the `set`'s iterators. The member functions are for operations that you can't do *without* being intrusive. Like a log(n) search for an item based on the sort order. – Nicol Bolas Jun 17 '12 at 01:51
  • @NicolBolas: re `boost::container::flat_set`: thanks for the suggestion. That indeed can make the work a bit easier, but it's not like using a sorted vector is rocket-science, particularly when OP can't use `flat_set::find` (for the same reason he can't use `set::find`). – Marc Mutz - mmutz Jun 17 '12 at 07:07
  • @NicolBolas: re `find_if`: As you've correctly pointed out, `set::find` isn't like `std::find` in that it executes in logarithmic time instead of linearly. A `set::find_if`, then, would relate to `set::find` as `std::find_if` does to `std::find`: As an "overload" that accepts a predicate instead of a value (the `_if` suffix necessitated by the lack of support for overloading on concepts), and that's what OP was asking for. In particular, `set::find_if` would execute in logarithmic time, too, because, like `set::find`, it uses binary search, regardless of the `find` name and its connotations. – Marc Mutz - mmutz Jun 17 '12 at 07:18
  • Thx for the link to `find_as`! Seems that this was one of 14 [crazy idea for stdlib c++11](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1870.html#add-new-set-map-lookup-functions). Too bad that it did not get into the standard. The flat set / Loki advice is not relevant to my question, I know about how to choose between binary_search, trees or hash based containers. – hansmaad Jun 17 '12 at 10:50
  • @MarcMutz-mmutz: Giving it a different name doesn't make it any less fragile. If you're not testing by the exact same algorithm used by the set, (and `eastl::set::find_as` makes no guarantees about this) then your search will be broken. It's too fragile. – Nicol Bolas Jun 17 '12 at 17:34
  • @NicolBolas: You also have to make sure you use a consistent order when using `std::lower_bound` on a sorted range. I fail to see how `find_as` is qualitatively any different. – Marc Mutz - mmutz Jun 17 '12 at 19:41

The standard states that std::set::find has logarithmic time complexity. In practice this is accomplished by implementing std::set as a binary search tree, with a strict weak ordering comparison used as sorting criteria. Any look-up that didn't satisfy the same strict weak ordering criteria wouldn't satisfy logarithmic complexity. So it is a design decision: if you want logarithmic complexity, use std::set::find. If you can trade complexity for flexibility, use std::find_if on the set.
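To illustrate the flexible but linear option mentioned above, here is a small sketch that searches a set by a field that is not part of the ordering. The `foo_less` comparator and the `contains_baz` helper are names assumed for this example.

```cpp
#include <algorithm>
#include <set>

struct foo
{
    int bar, baz;
};

// The set is ordered by bar only.
struct foo_less
{
    bool operator()(const foo& lh, const foo& rh) const { return lh.bar < rh.bar; }
};

// Hypothetical helper: linear scan with an arbitrary predicate. This ignores
// the tree structure, so it costs O(n) rather than O(log n).
bool contains_baz(const std::set<foo, foo_less>& foos, int baz_value)
{
    return std::find_if(foos.begin(), foos.end(),
               [baz_value](const foo& f) { return f.baz == baz_value; })
           != foos.end();
}
```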

juanchopanza
  • I don't need a `find_if`. I want to provide a sorting criteria that can compare foo with int and int with foo. – hansmaad Jun 17 '12 at 06:23
  • @hansmaad The problem is that you need a binary predicate to build the set. This is what is used for lookup, and there's no way around it. If `foo` had an implicit conversion from int (via an implicit single parameter constructor) then you could search using an int, but it isn't clear whether this works for your specific needs. – juanchopanza Jun 17 '12 at 06:30

They've provided for what you want, but in a rather different way than you're considering.

Actually, there are two different ways. One is to build a constructor for the contained class that 1) can be used implicitly, and 2) requires only the subset of elements that you really need for comparison. With this in place, you can do a search with foos.find(2);. You will end up creating a temporary object from 2, then finding that temporary object, but it will be a true temporary. Your code won't have to deal with it explicitly (anywhere).

[Edit: What I'm talking about here would be creating an instance of the same type as you're storing in the set, but (possibly) leaving any field you're not using as a "key" uninitialized (or initialized to something saying "not present"). For example:

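#include <set>
#include <string>
#include <vector>

struct X {
   int a; // will be treated as the key
   std::string data;
   std::vector<int> more_data;

   X(int a) : a(a) {} // the "key-only" ctor
   X(int a, std::string const &d, std::vector<int> const &m); // the normal ctor
};

// order by the key only, so that std::set<X> can compare elements
bool operator<(X const &lhs, X const &rhs) { return lhs.a < rhs.a; }

std::set<X> s;

if (s.find(2) != s.end()) { // will use X::X(int) to construct an `X`
    // we've found what we were looking for
}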

Yes, when you construct your X (or what I've called X, anyway) with the single-argument constructor, chances are that what you construct won't be usable for anything except searching.

end edit]

The second, for which the library provides more direct support, is often a bit simpler: if you're really only using some subset of elements (perhaps only one) for searching, then you can create a std::map instead of a std::set. With std::map, searching for an instance of whatever you've specified as the key type is supported explicitly/directly.
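A minimal sketch of the map-based variant, using a reduced `Person` and a hypothetical `Add` member that the question does not show, might look like the following. Note that the name is stored twice (once as the key, once inside the value), which is the duplication the question wanted to avoid.

```cpp
#include <map>
#include <string>

// Reduced stand-in for the Person type from the question.
struct Person
{
    std::string name;
    std::string address;
    int age;
};

class PersonRepository
{
public:
    // Hypothetical insertion member; the name is copied into the map key.
    void Add(const Person& person) { persons_.emplace(person.name, person); }

    // Lookup needs only the key; returns nullptr when the name is unknown.
    const Person* FindPerson(const std::string& name) const
    {
        auto it = persons_.find(name);
        return it != persons_.end() ? &it->second : nullptr;
    }

private:
    std::map<std::string, Person> persons_;
};
```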

Jerry Coffin
  • Can you please explain the 1st way in more detail? You mean an implicit ctor `foo(int i):bar(i){}`? This lets me write `find(2)`, but doesn't it construct a full `foo` (which may be very expensive in other cases)? – hansmaad Jun 17 '12 at 06:01

key_type is a member type defined in set containers as an alias of Key, which is the first template parameter and the type of the elements stored in the container.

See documentation.

For user-defined types there is no way for the library to know what the key type is. It so happens that for your particular use case the key type is int. If you use a set< int > s; you can call s.find( 2 );. However, you will need to help the compiler out if you want to search a set< foo > and want to pass in only an integer (think how will the set's ordering work between foo and an int).

dirkgently

Because if you want to call foos.find(2), you'll have to define how an int compares with a foo, in addition to the comparison between two foos. But since int and foo are different types, you will actually need two additional functions:

bool operator<(int, foo);
bool operator<(foo, int);

(or some other logical equivalent pair).
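As an illustration of why both directions are needed, here is a sketch that defines the pair of mixed operators and uses them with `std::binary_search` on a sorted vector. The algorithm evaluates both `element < value` and `value < element`, so it would not compile with only one of them. The `has_bar` helper is a name made up for this example.

```cpp
#include <algorithm>
#include <vector>

struct foo
{
    int bar, baz;
};

// foo with foo: the ordering used to sort the range.
bool operator<(const foo& lh, const foo& rh) { return lh.bar < rh.bar; }
// The two mixed comparisons, consistent with the ordering above.
bool operator<(int lh, const foo& rh) { return lh < rh.bar; }
bool operator<(const foo& lh, int rh) { return lh.bar < rh; }

// Hypothetical helper: true if some element of the sorted range has the
// given bar value. Uses operator< in both directions.
bool has_bar(const std::vector<foo>& sorted, int bar_value)
{
    return std::binary_search(sorted.begin(), sorted.end(), bar_value);
}
```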

Now, if you do that, you are actually defining a bijective function between int and foo and you could as well simply use a std::map<int, foo> and be done.

If you still don't want the std::map but you want the benefits of a sorted search, then the dummy object is actually the easiest way to go.

Granted, the standard std::set could provide a member function to which you pass a function that receives a foo and returns -1, 0, or 1 if it is less than, equal to, or greater than the searched one... but that's not the C++ way. And note that even bsearch() takes two arguments of the same type!

rodrigo