As described in cppreference.com:
The probability of h(a)==h(b) for a!=b should approach 1.0/std::numeric_limits<std::size_t>::max().
I want to create a hash table of pairs (a, b), where (a, b) == (b, a) (unordered pair), so my hash function is:
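    #include <cstddef>
    #include <functional>
    #include <utility>

    struct hash_pair {
        template<class T>
        std::size_t operator()(std::pair<T, T> const& p) const
        {
            std::hash<T> h;
            // Construct the std::hash functor first, then call it on the sum.
            return std::hash<std::size_t>{}(h(p.first) + h(p.second));
        }
    };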
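A minimal sketch of how such a functor might be plugged into std::unordered_set, to make the "unordered pair" requirement concrete. The names equal_unordered_pair, unordered_pair_set and size_after_inserting_both_orders are hypothetical and not part of the question. The sketch assumes the container also needs an equality predicate that treats (a, b) and (b, a) as the same key, since the hash alone does not make them compare equal. Note that the std::hash<std::size_t> functor is constructed with {} before being called.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <utility>

// Order-insensitive hash: the sum of the element hashes is commutative.
struct hash_pair {
    template <class T>
    std::size_t operator()(std::pair<T, T> const& p) const
    {
        std::hash<T> h;
        return std::hash<std::size_t>{}(h(p.first) + h(p.second));
    }
};

// Hypothetical helper: treats (a, b) and (b, a) as the same key.
struct equal_unordered_pair {
    template <class T>
    bool operator()(std::pair<T, T> const& x, std::pair<T, T> const& y) const
    {
        return (x.first == y.first && x.second == y.second)
            || (x.first == y.second && x.second == y.first);
    }
};

using unordered_pair_set =
    std::unordered_set<std::pair<std::size_t, std::size_t>,
                       hash_pair, equal_unordered_pair>;

// Inserts (a, b) and (b, a) and returns the resulting number of keys.
inline std::size_t size_after_inserting_both_orders(std::size_t a, std::size_t b)
{
    // Both orderings must hash to the same value for the predicate to be usable.
    assert(hash_pair{}(std::make_pair(a, b)) == hash_pair{}(std::make_pair(b, a)));

    unordered_pair_set s;
    s.emplace(a, b);
    s.emplace(b, a);  // rejected as a duplicate of (a, b)
    return s.size();
}
```

With both functors in place, inserting a pair in either order should leave a single key in the set.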
Assuming that h(ti) and std::hash<std::size_t> fulfill the requirement, will hash_pair fulfill it as well?
After further thinking (some extra details):
- p.first != p.second by precondition of my use case.
- T will be std::size_t in the majority of the cases, whose hash value is itself, so h(n) == n and thus P(n1 == n2) when n1 != n2 is 0.
- Since the sum is commutative, hash(pair(n1, n2)) == hash(pair(n2, n1)), which is intended.
So there are only two cases in which two pairs can differ: they have exactly one element in common, or they have none:
P1 = P(n1 + n2 == n1 + n3) = P(n2 == n3) = 0 // Because n2 != n3
P2 = P(n1 + n2 == n3 + n4) = ? // n1 != n3 and n2 != n4
So my problem is reduced to calculating P(none_in_common) * P(n1 + n2 == n3 + n4). P(none_in_common) is use case specific (it will probably be high in my case), but what about P2? Any help here?
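To make the quantity above concrete, here is a rough brute-force sketch that counts sum collisions for a small input range. It assumes h(n) == n, that the outer std::hash<std::size_t> adds no collisions of its own, and that the elements are drawn from 0..n-1; the names collision_stats and count_sum_collisions are hypothetical. Two distinct pairs with equal sums cannot share an element, so the ratio colliding / comparisons corresponds to P(none_in_common) * P2 for this particular input distribution only.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

struct collision_stats {
    std::size_t pairs = 0;        // unordered pairs {a, b} with a < b < n
    std::size_t colliding = 0;    // distinct pairs of pairs with equal sums
    std::size_t comparisons = 0;  // all distinct pairs of pairs
};

// Exhaustively counts how many distinct unordered pairs drawn from
// 0..n-1 end up with the same value of a + b (identity hash assumed).
inline collision_stats count_sum_collisions(std::size_t n)
{
    std::map<std::size_t, std::size_t> pairs_per_sum;
    collision_stats r;

    for (std::size_t a = 0; a < n; ++a) {
        for (std::size_t b = a + 1; b < n; ++b) {
            ++pairs_per_sum[a + b];
            ++r.pairs;
        }
    }
    // Sanity check: there are n choose 2 unordered pairs.
    assert(r.pairs == n * (n - 1) / 2);

    // k pairs sharing one sum contribute k choose 2 collisions.
    for (auto const& kv : pairs_per_sum) {
        r.colliding += kv.second * (kv.second - 1) / 2;
    }
    r.comparisons = r.pairs * (r.pairs - 1) / 2;
    return r;
}
```

For example, with n = 4 the six pairs produce the sums 1, 2, 3, 3, 4, 5, so exactly one of the 15 possible comparisons collides: {0, 3} against {1, 2}. A dense range of small integers is close to a worst case for a plain sum, so the result says little about inputs spread over the whole std::size_t range.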
NOTE: My question is not a duplicate of other similar questions around here, because I'm asking about the statistical properties of my proposed hash function, not about how to do it.