9

Why does the hash function in the following code (which returns a constant 0) seem to have no effect?

Since the hash function returns a constant, I expected every output value to be 3. However, the map still associates each std::vector key with its own value, even though my hash function is constant.

#include <iostream>
#include <map>
#include <unordered_map>
#include <vector>


// Hash that always returns zero.
class TVectorHash {
public:
    std::size_t operator()(const std::vector<int> &p) const {
        return 0;
    }
};

int main ()
{
    std::unordered_map<std::vector<int> ,int, TVectorHash> table;

    std::vector<int> value1({0,1});
    std::vector<int> value2({1,0});
    std::vector<int> value3({1,1});

    table[value1]=1;
    table[value2]=2;
    table[value3]=3;

    std::cout << "\n1=" << table[value1];
    std::cout << "\n2=" << table[value2];
    std::cout << "\n3=" << table[value3];

    return 0;
}

Obtained output:

1=1
2=2
3=3

Expected output:

1=3
2=3
3=3

What am I missing about how the hash is used?

rph
  • Do you expect your data to vanish when the hash happens to be the same for different data? – MikeCAT Mar 04 '16 at 08:15
  • I don't expect it to vanish. But I expect the data to be overwritten if the hash function maps different Keys to the same position. – rph Mar 04 '16 at 08:18
  • How about using `table[your_hash_function(your_data)] = your_data;` where `table` is `std::unordered_map`? – MikeCAT Mar 04 '16 at 08:29
  • In fact, I wanted to do this, but apparently there is no simple way to map a sequence of integers into a unique index. Therefore I was trying to see how the unordered_map (and also map) was doing this. – rph Mar 04 '16 at 08:31
  • @rkioji Who says unordered_map (and also map) was doing this? – user253751 Mar 04 '16 at 11:04
  • @rkioji: _"I expect the data to be overwritten if the hash function maps different Keys to the same position."_ That's simply not what a hash map does. – Lightness Races in Orbit Mar 04 '16 at 15:10

3 Answers

16

You misunderstood the use of the hash function: it is not used to compare elements. Internally, the map organizes the elements into buckets, and the hash function only determines the bucket in which an element resides. Whether two keys are the same key is decided by a separate template parameter. Look at the full declaration of the unordered_map template:

template<
    class Key,
    class T,
    class Hash = std::hash<Key>,
    class KeyEqual = std::equal_to<Key>,
    class Allocator = std::allocator< std::pair<const Key, T> >
> class unordered_map;

The next template parameter after the hasher is the key comparator. To get the behavior you expect, you would have to do something like this:

class TVectorEquals {
public:
    bool operator()(const std::vector<int>& lhs, const std::vector<int>& rhs) const {
        return true;
    }
};

std::unordered_map<std::vector<int> ,int, TVectorHash, TVectorEquals> table;

Now your map will have a single element and all your results will be 3.

Lightness Races in Orbit
Ionut
8

A sane hash table implementation should not lose information, even in the presence of hash collisions. There are several techniques for resolving collisions (usually trading runtime performance for data integrity), and std::unordered_map resolves collisions as well: keys that hash to the same value are still stored as separate elements as long as they do not compare equal.

See: Hash Collision Resolution

Lightness Races in Orbit
Ivan Aksamentov - Drop
  • So, does it mean that having a constant hash function will actually collapse my hash data structure into a simple linked list (or whatever other data structure is used to resolve the collisions)? Is there any way to disable this in C++? – rph Mar 04 '16 at 08:25
  • @rkioji Same hash != same element. That is how *all* hash maps work *by definition.* If you want a data structure which only stores one of each set of elements with the same hash, use a different equality comparator to convince the map that elements with the same hash are identical. – Angew is no longer proud of SO Mar 04 '16 at 08:58
  • @rkioji More like a simple dynamic array. Still, the look-up complexity is O(N) because it uses `operator==` or the equality comparator to search the array sequentially. – juanchopanza Mar 04 '16 at 09:46
3

Add a predicate key comparer class.

class TComparer {
public:
    bool operator()(const std::vector<int> &a, const std::vector<int> &b) const {
        return true; // this means that all keys are considered equal
    }
};

Use it like this:

std::unordered_map<std::vector<int> ,int, TVectorHash, TComparer> table;

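With every key treated as equal, the map holds a single element, so the rest of your code prints 3 for all three lookups, which is the output you expected.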

Ivan Gritsenko