0

Why provide a hash function without specifying a reference implementation or a reference algorithm (MD5, SHA-256, etc.)?

There are also similar features for data structures, such as the standard C++ std::unordered_map/unordered_set/unordered_multimap/unordered_multiset::hash_function.

So what I don't get is:

  • Why provide such undocumented methods?
  • Implementation details are fundamental to using hash functions correctly. From a programmer's standpoint, what is the purpose of these functions?
  • Can this function be linked to a specific algorithm?
Marc Mutz - mmutz
user1824407
  • 3
    std::hash and SHA/MD5 are different things. std::hash is for generating hash codes for things like data structure lookups. It's not meant to be a cryptographic hash function. – mfanto Dec 24 '12 at 06:46

1 Answer

3

Why provide such undocumented methods?

They are not undocumented.

Implementation details are fundamental to using hash functions correctly. From a programmer's standpoint, what is the purpose of these functions?

The implementation is unspecified; these functions are only meant to be used together with unordered containers. They should be as good a hash function as possible, so that elements are distributed effectively into buckets. Anything else is unspecified.

Note that users are expected to provide these themselves when using unordered containers with user-defined types.
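As a minimal sketch of what providing a hash for a user-defined type can look like: the Employee type, its fields, and the simple XOR-and-shift combine below are assumptions made up for illustration, not something the standard prescribes. The only requirements are that equal objects produce equal hash values and that the type is equality comparable.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>

// Hypothetical user-defined type, used only for illustration.
struct Employee {
    std::string name;
    int id;

    // Unordered containers also need equality to resolve collisions.
    bool operator==(const Employee& other) const {
        return name == other.name && id == other.id;
    }
};

// Specializing std::hash for a program-defined type is allowed.
namespace std {
template <>
struct hash<Employee> {
    std::size_t operator()(const Employee& e) const noexcept {
        // Reuse the library-provided hashes for the members; the way they
        // are combined here is an illustrative choice, not a standard one.
        std::size_t h1 = std::hash<std::string>{}(e.name);
        std::size_t h2 = std::hash<int>{}(e.id);
        return h1 ^ (h2 << 1);
    }
};
}  // namespace std

// Inserts three employees, one of them a duplicate, and returns the set size.
inline std::size_t count_distinct_employees() {
    std::unordered_set<Employee> staff;
    staff.insert(Employee{"Ada", 1});
    staff.insert(Employee{"Grace", 2});
    staff.insert(Employee{"Ada", 1});  // duplicate, rejected by the set
    assert(staff.count(Employee{"Grace", 2}) == 1);
    return staff.size();
}
```

The resulting hash values are still implementation-specific, because they are built on std::hash<std::string> and std::hash<int>; they are only meaningful inside the running program.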

This function can be linked to a specific algorithm ?

Why not?
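One way to tie a container to a specific algorithm is to pass your own hasher as the Hash template argument. The sketch below assumes 64-bit FNV-1a as the algorithm, chosen only because it is short and publicly specified; Fnv1aHasher and WordCounts are hypothetical names. FNV-1a is not a cryptographic hash.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// 64-bit FNV-1a over the bytes of a string. The result depends only on the
// input bytes, not on the compiler or standard library in use.
inline std::uint64_t fnv1a_64(const std::string& data) {
    std::uint64_t hash = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis
    for (unsigned char byte : data) {
        hash ^= byte;
        hash *= 1099511628211ULL;  // FNV 64-bit prime
    }
    return hash;
}

// Hypothetical hasher type that plugs the algorithm into a container.
struct Fnv1aHasher {
    std::size_t operator()(const std::string& key) const noexcept {
        // Truncates on platforms where std::size_t is narrower than 64 bits.
        return static_cast<std::size_t>(fnv1a_64(key));
    }
};

// An unordered_map whose bucket placement is driven by FNV-1a.
using WordCounts = std::unordered_map<std::string, int, Fnv1aHasher>;

// Small usage example: counts one word twice and reads the count back.
inline int count_word_twice() {
    WordCounts counts;
    ++counts["hash"];
    ++counts["hash"];
    assert(counts.hash_function()("hash") ==
           static_cast<std::size_t>(fnv1a_64("hash")));
    return counts.at("hash");
}
```

The container's hash_function() member then returns a copy of Fnv1aHasher, so the algorithm in use is known and under your control.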

K-ballo
  • "They should be as good a hash function as possible" They "should"? Really? By the way, when I talk about being undocumented I am referring to the implementation details, which are usually the only thing that matters when dealing with hash functions. – user1824407 Dec 24 '12 at 06:11
  • @user1824407: Not really, unordered containers will work just fine even with the worst possible hash function ever... The quality of the hash function does not affect behavior, just performance. – K-ballo Dec 24 '12 at 06:13
  • OK, for data structures you can avoid this problem with good logic, but what about std::hash? If I need to rely on a hash function for 10 years or more, and its implementation is basically compiler-specific and likely to change over time, what is the point of that? – user1824407 Dec 24 '12 at 06:17
  • @user1824407: They are only supposed to be used to distribute elements into buckets, I fail to see the problem with a compiler-specific implementation... Why do you think you need to _rely_ on them? – K-ballo Dec 24 '12 at 06:18
  • I am building a database that stores stats about files, where each file is represented by its own hash. With this approach from the STL I just can't do that reliably. Each platform and each compiler will compute a different hash, and over the years this can only get worse because the compilers will update their own implementations. – user1824407 Dec 24 '12 at 06:23
  • 1
    @user1824407: Of course you can't! They are **only** supposed to distribute elements into buckets. If you need a hash for something else then create your own implementation over which you have control... – K-ballo Dec 24 '12 at 06:25
  • 1
    I think there's confusion about what he's asking. std::hash and boost::hash are not cryptographic hash functions, or wrappers around SHA or MD5. If you look at the documentation, you'll see they are not meant for crypto applications (http://en.cppreference.com/w/cpp/utility/hash). You need to look at something like OpenSSL. – mfanto Dec 24 '12 at 06:43
  • @mfanto OK, now I get that, but what about Boost? – user1824407 Dec 24 '12 at 06:53
  • @user1824407: Same thing with _Boost_... You may have luck with the unofficial _Boost.Crypto_ that was lying around somewhere in the _Boost Vault_ – K-ballo Dec 24 '12 at 06:54
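The claim in the comments that hash quality affects performance but not behavior can be checked with a deliberately degenerate hasher. This is only a sketch: ConstantHasher is a hypothetical name, and a hasher like this should never be used in real code, since every element lands in the same bucket and lookups degrade to a linear scan.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Deliberately terrible hasher: every key hashes to the same value.
struct ConstantHasher {
    std::size_t operator()(int) const noexcept { return 0; }
};

// Fills a set that uses the constant hasher and checks that insertion,
// duplicate rejection, and lookup still behave correctly.
inline bool degenerate_hash_still_correct() {
    std::unordered_set<int, ConstantHasher> values;
    for (int i = 0; i < 100; ++i) {
        values.insert(i);
    }
    values.insert(42);  // duplicate, must not change the size

    bool all_found = true;
    for (int i = 0; i < 100; ++i) {
        if (values.count(i) == 0) {
            all_found = false;
        }
    }

    // All 100 elements share one bucket, which is why lookups become slow.
    bool single_bucket = values.bucket_size(values.bucket(0)) == 100;

    return values.size() == 100 && all_found && values.count(1000) == 0 &&
           single_bucket;
}
```

The container stays correct because equality comparison, not the hash value, decides whether two keys are the same; the hash only chooses the bucket.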