I know that a good way to prevent duplicates is to use an unordered_set. However, this method does not seem to work when I want to have an unordered_set<vector<string>>. How can I go about doing this? For example, I want to prevent <"a", "b", "c"> from being duplicated in my unordered_set<vector<string>>.

Can this unordered_set<vector<string>> be used outside the defined class as well?

Code:

unordered_set<vector<string>> abc({"apple", "ball", "carrot"});
abc.insert({"apple", "ball", "carrot"});

cout << abc.size() << endl;     //abc.size() should be 1
JJJ
  • I think I must define a hash myself? No idea how to do it though – JJJ Mar 20 '17 at 15:07
  • Can you post a very minimal example which adds {"a", "b", "c"} twice, and checks the set's size()? – Kenny Ostrom Mar 20 '17 at 15:08
  • It does not compile because no hash is defined for `unordered_set<vector<string>>` – JJJ Mar 20 '17 at 15:09
  • How about simply using std::set? –  Mar 20 '17 at 15:10
  • `unordered_set<vector<string>, my_hash_class>` Tried this? – DeiDei Mar 20 '17 at 15:10
  • Might be a duplicate of this: http://stackoverflow.com/q/29855908/10077 – Fred Larson Mar 20 '17 at 15:11
  • @Erik Alapäa he doesn't even need to define his own comparison operator (unless lexicographical comparison is insufficient for his use case) – Pandatyr Mar 20 '17 at 15:11
  • @Erik vector already has the required comparison operator –  Mar 20 '17 at 15:13
  • @James Tan, maybe you could get an answer instead of a lot of comments if you paste your current non-working code... – Roberto Mar 20 '17 at 15:16
  • @JamesTan -- You didn't mention exactly what is the error. Compilation, runtime, etc.? – PaulMcKenzie Mar 20 '17 at 15:18
  • I got a similar error saying "The C++ Standard doesn't provide a hash for this type." Seems pretty clear. Look up the template parameters to see how to add a hasher. I'd just concatenate all the strings with a rarely used character as the connector. It won't know the difference between {"a+b+c"} and {"a", "b", "c"} if the connector is '+' but maybe you know that '\0' is not used in any of the strings? (although xyz seems likely) – Kenny Ostrom Mar 20 '17 at 15:26
  • @NeilButterworth true, I remove my comment. – Erik Alapää Mar 20 '17 at 15:27
  • This may help http://stackoverflow.com/questions/17016175/c-unordered-map-using-a-custom-class-type-as-the-key – Kenny Ostrom Mar 20 '17 at 15:31
  • @KennyOstrom I thought about concatenating as well, but I need them to be separated eventually, so there would be additional overhead to split the concatenated string back to the vector – JJJ Mar 20 '17 at 15:39
  • No, that was just in the hasher, not the data itself, but nevermind anything I said ... go to the link that has the actual working implementation. – Kenny Ostrom Mar 20 '17 at 15:48
  • Another way would be using std::sort and then std::unique – Petar Petrovic Apr 19 '17 at 08:42

1 Answer

There are a number of ways to get rid of duplicates; building a set out of your objects is one of them. Whether it is going to be std::set or std::unordered_set is up to you, and the decision usually depends on how good a hash function you can come up with.

This in turn requires knowledge of the domain, e.g. what your vectors of strings represent and what values they can have. If you do come up with a good hash, you can implement it like this:

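#include <cstddef>
#include <string>
#include <vector>

struct MyHash
{
    std::size_t operator()(std::vector<std::string> const& v) const 
    {
        // your hash code here
        // (a constant is a valid hash, but every element then shares one bucket)
        return 0; // return your hash value instead of 0
    }
};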

Then you just declare your unordered_set with that hash:

std::unordered_set<std::vector<std::string>, MyHash> abc;

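I would say it's a safe bet to just go with a std::set at first though, unless you already have a good hash function in mind. std::vector already provides the lexicographical operator< that std::set needs, so no extra code is required.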

Ap31