Given a positive integer m, a hash function defined on T is a map T -> {0, 1, 2, ..., m - 1}. If k is an element of T, we denote its hashed value by hash(k, m).

For simplicity, most hash functions are of the form hash(k, m) = f(k) % m where f is a map from T to the set of integers.

In the case where m = 2^p (often chosen so that the modulo m operation is cheap) and T is a set of integers, I have seen many people use f(k) = c * k with c being a prime number.

I understand that if you choose a function of the form f(k) = c * k, you need gcd(c, m) = 1 for every hash table size m, so that distinct values of k modulo m stay distinct after multiplication. Any odd prime fits the bill when m = 2^p, but so does c = 1.
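To make the gcd condition concrete, here is a small brute-force check (a sketch only; the helper name `is_permutation` and the sample values of c are mine, chosen for illustration). It tests whether k -> (c * k) % m hits every slot of a table of size m exactly once.

```python
from math import gcd


def is_permutation(c: int, m: int) -> bool:
    """Return True if k -> (c * k) % m maps 0..m-1 onto 0..m-1 without collisions."""
    return len({(c * k) % m for k in range(m)}) == m


m = 16  # table size 2^4
for c in (1, 3, 6, 7, 8):
    # The map is collision-free on 0..m-1 exactly when c and m are coprime.
    assert is_permutation(c, m) == (gcd(c, m) == 1)
```

Both c = 1 and c = 7 pass this check for m = 16, while c = 6 and c = 8 do not, which is why the coprimality condition alone does not explain the preference for a prime.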

So my question is the following: why do people still use f(k) = prime * k as their hash function? What kind of nice property does it have?

InsideLoop
  • http://cs.stackexchange.com/questions/11029/why-is-it-best-to-use-a-prime-number-as-a-mod-in-a-hashing-function – Joe Jan 10 '17 at 09:12
  • http://stackoverflow.com/questions/1145217/why-should-hash-functions-use-a-prime-number-modulus – nos Jan 10 '17 at 09:12
  • None of these links answer my question. For instance, in the link given by Joe, they explain why it is a good idea to use a prime for `m`. By the way, I agree with their point of view. But my question is different. – InsideLoop Jan 10 '17 at 10:20
  • Because gcd=1, `(c*k) % (1< – wildplasser Jan 10 '17 at 11:48
  • @wildplasser: But k % (1 << n) spreads uniformly as well. – InsideLoop Jan 10 '17 at 14:04

1 Answer

You don't need it to be prime. One of the most efficient hash functions with provable collision resistance just multiplies by a random number: https://en.wikipedia.org/wiki/Universal_hashing#Avoiding_modular_arithmetic. You do, however, need the multiplier to be odd.
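A minimal sketch of that multiply-shift scheme, as I understand the linked section: pick a random odd multiplier, multiply modulo 2^W, and keep the top p bits. The word size `W = 64` and the names `make_multiply_shift`, `a`, and `h` are assumptions made for this example, not anything prescribed by the article.

```python
import random

W = 64  # assumed machine word size in bits


def make_multiply_shift(p: int, rng=random):
    """Return a hash function from W-bit integers to p-bit indices (0 < p <= W)."""
    a = rng.getrandbits(W) | 1  # random multiplier, forced to be odd
    mask = (1 << W) - 1         # emulates wrap-around multiplication modulo 2^W

    def h(k: int) -> int:
        # Keep the HIGH p bits of the product instead of reducing with % m.
        return ((a * k) & mask) >> (W - p)

    return h


h = make_multiply_shift(4)             # table of size 2^4 = 16
indices = [h(k) for k in range(1000)]  # every index lies in 0..15
```

Note that this takes the upper bits of the product rather than `f(k) % m`; with m = 2^p, the modulo would keep only the low bits, which depend solely on the low bits of k.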

Thomas Ahle
  • Here `odd` means: relatively prime wrt the table size. – wildplasser Jan 10 '17 at 15:42
  • @wildplasser I suppose, but only because the hash function outputs `2^k` bits for some `k>0`. We can then take that number modulo some prime if we need to make it smaller, but usually hash tables have a size that's a power of two anyway. Using primes for the modulo is usually important, just not for the multiplication. – Thomas Ahle Jan 10 '17 at 15:46