I just want to use an unordered_map with my struct as the key, since I don't need any ordering, but I just can't find my way through all that hash stuff.

As a related side question: when people compare unordered and ordered maps they never talk about the hash function. How can that be? Can't a bad hash function make unordered_map slower than map (solely due to the hash function)?

struct exemple{

  unsigned char a,b,c;
  unsigned int n;

  bool operator == ( const exemple & other) const {..}
};

namespace std {
template <>
struct hash<exemple> : public std::unary_function<const exemple &, std::size_t>
{
    inline std::size_t operator()(const exemple & exemple_p ) const
    {
        return 0;// what do I do
    }
};

}

-edit- a,b,c can have only the values 'a', 'b', 'c' or 'd', and n varies ~ 3 to 60.

Icebone1000
  • Are you required to write the hash function yourself? – evanmcdonnal Nov 15 '12 at 00:16
  • These are two distinct questions. Please post one of them at a time. – Fred Foo Nov 15 '12 at 00:16
  • @evanmcdonnal what do you mean? unordered_map doesn't compile if I don't provide one. – Icebone1000 Nov 15 '12 at 00:17
  • Yes, but you could use the logic from a hashing library that already exists. As an example, say I have some string hashing function `hash(string)` I could make a function that converts the `int` to a string then concatenates it with the three `chars`, then end the function with `return hash(StringIjustMade);` which is different from actually writing the low level hashing logic yourself. – evanmcdonnal Nov 15 '12 at 00:20
  • @evanmcdonnal would you post an answer showing it? I never used a hash function directly – Icebone1000 Nov 15 '12 at 00:23
  • First you drop `unary_function`. This stuff is officially useless since like forever. – pmr Nov 15 '12 at 00:25
  • @Icebone1000 I'm not going to because there are already two good solutions posted. – evanmcdonnal Nov 15 '12 at 00:58

4 Answers

What you do in your hash function depends on the values you have, not so much on their types. If all four data members take each of their values with an even distribution, I would combine the three characters into an unsigned long and return the result of XORing it with n:

typedef unsigned long ulong;
return n ^ ((ulong(a) << 16) | (ulong(b) << 8) | ulong(c));

It is certainly a hash function. Whether it is one which works well is a different question. You might also combine the result with std::hash<unsigned long>.

Dietmar Kühl
  • Apropos combining hashes: why has C++11 omitted `hash_combine`? It's a nice feature of `boost::hash`. – pmr Nov 15 '12 at 00:27
  • Oh I see... I should mention then that my unsigned chars can only assume the values a, b, c or d, and n varies from ~3 to 60 – Icebone1000 Nov 15 '12 at 00:28
  • Are you saying that you have 4^3 * 60 ~= 2^12 == 4096 values? In that case don't bother using a hash map but use an array... – Dietmar Kühl Nov 15 '12 at 00:32
  • @pmr: The only trace of `hash_combine` I can find is in [n3333](http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2012/n3333.html). Put differently: you didn't propose it! Nor did anybody else. – Dietmar Kühl Nov 15 '12 at 00:36
  • std::hash is giving me a conversion error, error C2440: '<function-style-cast>' : cannot convert from 'unsigned long' to 'std::hash<_Kty>' – Icebone1000 Nov 15 '12 at 01:06
  • I don't know what you are trying but it should work: `std::hash<unsigned long> hasher; unsigned long hash = hasher(value);`. From the looks of it, you try to pass the `unsigned long` as a constructor argument to `std::hash`, i.e., `std::hash<unsigned long>(value)`. This, of course, doesn't work. – Dietmar Kühl Nov 15 '12 at 01:15
  • yep, that was exactly what I was doing – Icebone1000 Nov 15 '12 at 01:19
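
The array suggestion from the comments above can be sketched roughly like this. It assumes, as stated in the question's edit, that a, b and c are each one of 'a'..'d' and that n stays below 64. The names `slot_of`, `kSlots` and `table` are made up for illustration, and the mapped type is a placeholder.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

struct exemple {
    unsigned char a, b, c;
    unsigned int n;
};

// 2 bits for each of a, b, c plus 6 bits for n: 4 * 4 * 4 * 64 = 4096 slots.
constexpr std::size_t kSlots = 4 * 4 * 4 * 64;

// Hypothetical helper: maps every valid key to its own slot in [0, kSlots).
// Assumption: a, b, c are in 'a'..'d' and n < 64 (checked by the asserts).
inline std::size_t slot_of(const exemple& e)
{
    assert(e.a >= 'a' && e.a <= 'd');
    assert(e.b >= 'a' && e.b <= 'd');
    assert(e.c >= 'a' && e.c <= 'd');
    assert(e.n < 64);
    return (std::size_t(e.a - 'a') << 10) |
           (std::size_t(e.b - 'a') << 8)  |
           (std::size_t(e.c - 'a') << 6)  |
           std::size_t(e.n);
}

// Direct-indexed table: one entry per possible key, no hashing needed.
// int is only a placeholder for whatever the map would have stored.
using table = std::array<int, kSlots>;
```

Since no two valid keys share a slot, `slot_of` would also work as a collision-free hash if you still prefer an unordered_map.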

Here's a baseline hash function:

unsigned long long h = (static_cast<unsigned long long>(n) << 24) | (a << 16) | (b << 8) | c;
return std::hash<unsigned long long>()(h);

I.e., just pack the members into an unsigned long long, then offload the work to std::hash. In the common case that int is 32 bits wide and long long is 64 bits, and assuming your chars are not negative, this uses all the information in your objects for the hash.
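
Put together with the struct from the question, the packing idea might look like the sketch below. The body of operator== is elided in the question, so the member-wise comparison here is an assumption, and `make_demo_map` is a made-up helper that only shows the map in use. The `unary_function` base is left out, since it was deprecated in C++11 and removed in C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

struct exemple {
    unsigned char a, b, c;
    unsigned int n;

    // Assumption: equality is member-wise (the question elides the body).
    bool operator==(const exemple& other) const
    {
        return a == other.a && b == other.b && c == other.c && n == other.n;
    }
};

namespace std {
template <>
struct hash<exemple> {
    std::size_t operator()(const exemple& e) const
    {
        // Widen every member before shifting so no bits are lost.
        unsigned long long packed =
            (static_cast<unsigned long long>(e.n) << 24) |
            (static_cast<unsigned long long>(e.a) << 16) |
            (static_cast<unsigned long long>(e.b) << 8)  |
            static_cast<unsigned long long>(e.c);
        // std::hash is a class template: build a functor, then call it.
        return std::hash<unsigned long long>()(packed);
    }
};
}  // namespace std

// Hypothetical helper: the map needs no explicit hasher argument any more.
inline std::unordered_map<exemple, std::string> make_demo_map()
{
    std::unordered_map<exemple, std::string> m;
    m[exemple{'a', 'b', 'c', 3}] = "first";
    m[exemple{'d', 'a', 'a', 60}] = "second";
    assert(m.size() == 2);
    return m;
}
```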

Fred Foo

Consider your struct as a whole to be a string of bytes (7, to be precise). You may use any acceptably general string hash function upon those 7 bytes. Here is the FNV (Fowler/Noll/Vo) general bit-string hash function applied to your example (within the given hash functor class):

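inline std::size_t operator()(const exemple& obj ) const
{
  // View the object as raw bytes (this includes any padding bytes).
  const unsigned char* p = reinterpret_cast<const unsigned char*>( &obj );
  std::size_t h = 2166136261u;  // FNV offset basis (32-bit)

  for (std::size_t i = 0; i < sizeof(obj); ++i)
    h = (h * 16777619u) ^ p[i];  // multiply by the FNV prime, then XOR in the next byte

  return h;
}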

Note how I converted the reference to the exemple structure (obj) to a pointer to const unsigned char so that I could access the bytes of the structure one by one, treating it as an opaque binary object. Note that sizeof(obj) may actually be 8 rather than 7 depending upon the compiler's padding (which would mean there's a garbage padding byte somewhere in the structure, probably between c and n). Because padding bytes can hold arbitrary values, two objects that compare equal could then hash differently, so it is safer to rewrite the hash function to iterate over a, b, and c and then the bytes of n in order (or any order), which eliminates the influence of any padding bytes (which may or may not exist) upon the hash of your struct.
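
A member-wise variant along those lines could look like this sketch. It keeps the same constants and mixing step as the code above but never touches padding. The functor name `exemple_fnv_hash` and the `fnv_demo` check are made up for illustration, and n is split with shifts so the result does not depend on byte order either.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

struct exemple {
    unsigned char a, b, c;
    unsigned int n;
};

// Hypothetical functor: FNV-style hash over the members, not the raw memory.
struct exemple_fnv_hash {
    std::size_t operator()(const exemple& obj) const
    {
        std::size_t h = 2166136261u;  // same offset basis as above
        // Same step as above: multiply by the prime, then XOR in one byte.
        auto mix = [&h](unsigned char byte) { h = (h * 16777619u) ^ byte; };

        mix(obj.a);
        mix(obj.b);
        mix(obj.c);
        // Feed n eight bits at a time, lowest bits first.
        for (int shift = 0; shift < std::numeric_limits<unsigned int>::digits; shift += 8)
            mix(static_cast<unsigned char>((obj.n >> shift) & 0xFFu));
        return h;
    }
};

// Small sanity check: equal members always give equal hashes.
inline void fnv_demo()
{
    exemple x{'a', 'b', 'c', 3};
    exemple y{'a', 'b', 'c', 3};
    assert(exemple_fnv_hash()(x) == exemple_fnv_hash()(y));
}
```

It would be passed as the third template argument, e.g. `std::unordered_map<exemple, int, exemple_fnv_hash>`, provided the struct also has an operator==.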

Yes, a bad hash function can make unordered_map slower than std::map. This isn't always discussed, because people using unordered_map are assumed to use a generalized, fast algorithm like the FNV hash given above, and in those cases an unordered_map is generally faster than a std::map, at the expense of the ability to iterate over the container's elements in order. However, yes, you must be using a good hash function for your data, and usually it's good enough to use one of these well-known hashes. Ultimately, however, every hash function has its weaknesses depending upon the distribution of the input data (here, the contents of the exemple structure).
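
To make the "bad hash" case concrete, here is a small sketch using the worst possible hasher, the `return 0;` from the question. All names (`constant_hash`, `crowd_in_bucket_of_zero`, `bad_hash_demo`) are made up for illustration. It only counts how many elements end up sharing one bucket; it is not a benchmark.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>

// Deliberately terrible hasher: every key gets the same hash value.
struct constant_hash {
    std::size_t operator()(int) const { return 0; }
};

// Inserts the keys 0..count-1 and reports how many elements share a bucket
// with key 0.
template <class Hash>
std::size_t crowd_in_bucket_of_zero(int count)
{
    std::unordered_map<int, int, Hash> m;
    for (int i = 0; i < count; ++i)
        m[i] = i;
    return m.bucket_size(m.bucket(0));
}

inline void bad_hash_demo()
{
    // With a constant hash every element lands in the same bucket, so each
    // lookup degrades to a linear scan over all elements.
    assert(crowd_in_bucket_of_zero<constant_hash>(1000) == 1000);
}
```

With all elements in one bucket, lookups are O(n), which is worse than the O(log n) of std::map. Calling the same function with `std::hash<int>` should report a far smaller number.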

A good discussion of generalized hashing and example hashing functions can be found at Eternally Confuzzled, including a C-style FNV hash similar to the one which I've given you.

Matthew Hall

boost::hash_combine is designed for this purpose:

std::size_t hash = 0;
for (const auto& value : {a, b, c}) {
    boost::hash_combine(hash, value);
}
boost::hash_combine(hash, n);
return hash;
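
If Boost isn't available, a hand-rolled stand-in is easy to sketch. The mixing line below is modelled on the formula long shown in Boost's documentation for hash_combine; newer Boost releases may mix differently, so treat it as an approximation and not as Boost's actual implementation. The names `hash_combine_sketch` and `hash_exemple` are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

struct exemple {
    unsigned char a, b, c;
    unsigned int n;
};

// Hypothetical stand-in for boost::hash_combine, built on std::hash.
template <class T>
void hash_combine_sketch(std::size_t& seed, const T& value)
{
    seed ^= std::hash<T>()(value) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Folds all four members into one hash value, in a fixed order.
inline std::size_t hash_exemple(const exemple& e)
{
    std::size_t seed = 0;
    hash_combine_sketch(seed, e.a);
    hash_combine_sketch(seed, e.b);
    hash_combine_sketch(seed, e.c);
    hash_combine_sketch(seed, e.n);
    assert(seed != 0);  // the additive constant makes an all-zero result very unlikely
    return seed;
}
```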
Luke