0

I would like to compute a hash value of an unordered_map as a whole. This would make it easy to check whether two maps contain exactly the same key-value pairs.

Obviously, one can iterate over the contained pairs, build a long string, and hash it afterwards, but I imagine there are better ways to do this.

For the moment, the actual hash function is not that crucial. I think MD5 would be OK, and SHA too, of course.

Are there any suggestions?

lukasl1991
  • I suggest combining the hash values of all the keys and all the values. Ideally, do it in a way that lets you update the map's hash as entries are added, removed, or updated. Note that equality of hashes doesn't guarantee equality of the maps' contents, but it still lets you quickly find out whether the contents differ. – Daniel Langr May 16 '19 at 06:03
  • Yes, you're right of course. I did not mention hash collisions. – lukasl1991 May 16 '19 at 06:10
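
The incremental update suggested in the comment above could be sketched roughly as follows. This is only an illustration under assumptions: `HashedMap`, `entry_hash`, and `example_usage` are hypothetical names, the map is fixed to `std::string` keys and `int` values, and the mixing step is merely modelled on the classic `hash_combine` formula rather than taken from any library. It relies on XOR being its own inverse, so an entry can be "removed" from the running hash by XORing its hash in a second time.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical helper: hash one key-value pair, mixing key and value
// asymmetrically so that the value cannot simply cancel the key.
inline std::size_t entry_hash(const std::string& key, int value)
{
    std::size_t hk = std::hash<std::string>{}(key);
    std::size_t hv = std::hash<int>{}(value);
    return hk ^ (hv + 0x9e3779b9u + (hk << 6) + (hk >> 2));
}

// Hypothetical wrapper that keeps a running XOR of all entry hashes.
class HashedMap
{
public:
    // Insert or overwrite: XOR out the old entry (if any), XOR in the new one.
    void set(const std::string& key, int value)
    {
        auto it = data_.find(key);
        if (it != data_.end()) {
            hash_ ^= entry_hash(it->first, it->second);
            it->second = value;
        } else {
            data_.emplace(key, value);
        }
        hash_ ^= entry_hash(key, value);
    }

    // Remove: XORing the entry's hash a second time cancels it.
    void erase(const std::string& key)
    {
        auto it = data_.find(key);
        if (it == data_.end()) return;
        hash_ ^= entry_hash(it->first, it->second);
        data_.erase(it);
    }

    std::size_t hash() const { return hash_; }

private:
    std::unordered_map<std::string, int> data_;
    std::size_t hash_ = 0;
};

// Two maps built in different orders end up with the same running hash.
inline void example_usage()
{
    HashedMap a, b;
    a.set("x", 1); a.set("y", 2);
    b.set("y", 2); b.set("x", 1);
    assert(a.hash() == b.hash());

    a.set("z", 3);   // temporary entry
    a.erase("z");    // hash returns to its previous value
    assert(a.hash() == b.hash());
}
```

With this design the hash is available in constant time, at the cost of routing every modification through the wrapper.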

2 Answers

3

You are going to need a commutative combine function, so that equal unordered_maps with differing internal order produce equal hashes. boost::hash_combine is deliberately not commutative. I therefore suggest simply XORing the hashes of the elements.

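#include <cstddef>
#include <numeric>
#include <boost/functional/hash.hpp>

// Order-independent hash: XOR is commutative, so iteration order does not matter.
template<typename UnorderedMap>
std::size_t hash(const UnorderedMap & um)
{
    boost::hash<typename UnorderedMap::value_type> elem_hash;  // hashes the whole pair (key and value)
    auto combine = [&](std::size_t acc, typename UnorderedMap::const_reference elem){ return acc ^ elem_hash(elem); };
    // The initial value must be a std::size_t; a plain 0 would make accumulate work in int and truncate.
    return std::accumulate(um.begin(), um.end(), std::size_t{0}, combine);
}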
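
To see the order independence in action without a Boost dependency, here is a minimal sketch of the same XOR idea. It is not the function from this answer: `unordered_hash` and `example_usage` are hypothetical names, `std::hash` on key and value replaces `boost::hash` on the pair (so both key and mapped type are assumed to have a `std::hash` specialisation), and the per-entry mixing step is just one plausible choice.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <string>
#include <unordered_map>

// Hypothetical Boost-free variant: hash each pair with std::hash on key and
// value, then XOR the per-entry hashes so iteration order is irrelevant.
template <typename UnorderedMap>
std::size_t unordered_hash(const UnorderedMap& um)
{
    using Key = typename UnorderedMap::key_type;
    using Mapped = typename UnorderedMap::mapped_type;
    auto combine = [](std::size_t acc, typename UnorderedMap::const_reference elem) {
        std::size_t hk = std::hash<Key>{}(elem.first);
        std::size_t hv = std::hash<Mapped>{}(elem.second);
        // Mix key and value asymmetrically within one entry, then XOR
        // the entry hash into the accumulator (commutative across entries).
        return acc ^ (hk ^ (hv + 0x9e3779b9u + (hk << 6) + (hk >> 2)));
    };
    return std::accumulate(um.begin(), um.end(), std::size_t{0}, combine);
}

// Same entries, inserted in a different order, give the same hash.
inline void example_usage()
{
    std::unordered_map<std::string, int> a{{"one", 1}, {"two", 2}, {"three", 3}};
    std::unordered_map<std::string, int> b{{"three", 3}, {"one", 1}, {"two", 2}};
    assert(unordered_hash(a) == unordered_hash(b));
}
```

Calling it is just `unordered_hash(my_map)`; the template parameter is deduced from the argument.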
Caleth
  • Sorry, but I am not so familiar with templates and lambdas. First of all, the argument of the combine function probably should be named elem? How do I call this function afterwards? Where to locate the template? In the header or source file? And: Does it take both `map->first` and `map->second` into account? – lukasl1991 May 17 '19 at 07:49
  • @lukasl1991 Oops. `combine` is called by `std::accumulate`, which is in `<numeric>`. I checked, and `std::hash` isn't specialised for pairs, but `boost::hash` is. – Caleth May 17 '19 at 08:25
  • You cannot use modern C++ efficiently without being familiar with templates and lambdas. – n. m. could be an AI May 17 '19 at 08:46
  • @Caleth This seems to be what I need. Thank you very much! A final question: My unordered map is wrapped in a class `A` which is split into an `A.cpp` and an `A.hpp` file. Where should I put the template definition? What is best practice? Even outside both files? – lukasl1991 May 17 '19 at 09:48
  • The template needs to be visible in every translation unit that wants to hash UnorderedMaps, so putting it in its own header is a common solution. It could go in `A.cpp` right now, but later you may want to reference it elsewhere. – Caleth May 17 '19 at 10:00
1

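You can use Boost's hash_combine() to combine the hashes of the individual entries. Note that hash_combine() is order-dependent, so the entries would have to be visited in a fixed order (for example, sorted by key) for equal maps to yield equal hashes. If you don't want to use Boost, you can XOR the individual hash values to obtain a combined hash value.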
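
Because an order-dependent combine gives different results for different iteration orders, one way to still use it on an unordered_map is to visit the entries in sorted key order. The following is a rough sketch under assumptions: `combine_into` is a hypothetical stand-in modelled on the classic `hash_combine` formula (not Boost's actual implementation), `sorted_hash` and `example_usage` are made-up names, and the map is fixed to `std::string` keys and `int` values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for an order-dependent combine step.
inline void combine_into(std::size_t& seed, std::size_t h)
{
    seed ^= h + 0x9e3779b9u + (seed << 6) + (seed >> 2);
}

// Hash a map by visiting its entries in sorted key order, so that the
// order-dependent combine yields one result per set of entries.
inline std::size_t sorted_hash(const std::unordered_map<std::string, int>& um)
{
    // Collect pointers to the entries; the map itself is not modified.
    std::vector<const std::pair<const std::string, int>*> entries;
    entries.reserve(um.size());
    for (const auto& kv : um) entries.push_back(&kv);

    // Keys are unique, so sorting by key gives a single canonical order.
    std::sort(entries.begin(), entries.end(),
              [](const auto* l, const auto* r) { return l->first < r->first; });

    std::size_t seed = 0;
    for (const auto* kv : entries) {
        combine_into(seed, std::hash<std::string>{}(kv->first));
        combine_into(seed, std::hash<int>{}(kv->second));
    }
    return seed;
}

// Same entries, inserted in a different order, give the same hash.
inline void example_usage()
{
    std::unordered_map<std::string, int> a{{"one", 1}, {"two", 2}, {"three", 3}};
    std::unordered_map<std::string, int> b{{"three", 3}, {"two", 2}, {"one", 1}};
    assert(sorted_hash(a) == sorted_hash(b));
}
```

The sort makes this O(n log n) per hash and requires keys that can be ordered, which the XOR approach avoids.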

Additionally, see the answer linked below, which describes how to compare two maps without combining hashes:

Link to answer
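
The linked answer is not reproduced here. As one standard way to compare two maps without any hashing of the whole container, `std::unordered_map` provides `operator==`, which compares contents rather than iteration order. A minimal sketch, with `same_contents` and `example_usage` as hypothetical names and the key and value types chosen only for illustration:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Direct comparison: operator== on std::unordered_map checks that both maps
// hold the same key-value pairs, regardless of insertion or bucket order.
inline bool same_contents(const std::unordered_map<std::string, int>& a,
                          const std::unordered_map<std::string, int>& b)
{
    return a == b;
}

inline void example_usage()
{
    std::unordered_map<std::string, int> a{{"one", 1}, {"two", 2}};
    std::unordered_map<std::string, int> b{{"two", 2}, {"one", 1}};
    std::unordered_map<std::string, int> c{{"one", 1}, {"two", 3}};
    assert(same_contents(a, b));   // same pairs, different insertion order
    assert(!same_contents(a, c));  // value for "two" differs
}
```

Unlike a hash comparison, this gives an exact answer, but it needs both maps at hand and takes time proportional to their size.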

Saket Sharad