I have started learning Collections. When we generate `hashCode()` using Eclipse, the generated method contains the following formula:

        final int prime = 31;
        int result = 1;
        result = prime * result + ((id == null) ? 0 : id.hashCode());
        result = prime * result + ((pin == null) ? 0 : pin.hashCode());

I have searched and found that 31 is used because it is an odd prime, and that multiplying by a prime gives a good distribution of hash codes. However, I haven't come across a concrete, layman's explanation of why this formula is used and why exactly 31 is chosen. Can someone please explain how multiplying by 31 gives a better distribution of hash codes?

ghostrider
  • The idea is to pick a ***different*** odd prime every time you override `hashCode`. As for why an odd prime: it leads to a better distribution of the keys across the hash buckets. Basically, it's an important implementation detail. – Elliott Frisch Feb 26 '20 at 03:09
  • It's not only a prime, it's a [Mersenne prime](https://en.wikipedia.org/wiki/Mersenne_prime), which means it's 2^5 - 1 = 11111b, which means that the repeated operations will retain more bits of information. It's simply a convenient number that has reasonable average performance. – chrylis -cautiouslyoptimistic- Feb 26 '20 at 03:12
  • @ElliottFrisch thanks, I have edited the question – ghostrider Feb 26 '20 at 03:15
  • @chrylis-onstrike-, awesome!!. could be posted as an answer. – Brooklyn99 Feb 26 '20 at 05:34
  • Does this answer your question? [Why does Java's hashCode() in String use 31 as a multiplier?](https://stackoverflow.com/questions/299304/why-does-javas-hashcode-in-string-use-31-as-a-multiplier) Simply typing "java hash 31" gives you that one as the first result. – Nicktar Feb 26 '20 at 06:37

1 Answer

From Joshua Bloch, Effective Java, Chapter 3, Item 9:

The value 31 was chosen because it is an odd prime. If it were even and the multiplication overflowed, information would be lost, as multiplication by 2 is equivalent to shifting. The advantage of using a prime is less clear, but it is traditional. A nice property of 31 is that the multiplication can be replaced by a shift and a subtraction for better performance: 31 * i == (i << 5) - i. Modern VMs do this sort of optimization automatically.
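To make the quote concrete, here is a minimal sketch (not from the book) that checks the `31 * i == (i << 5) - i` identity on a few values, including ones that overflow, and illustrates the "information would be lost" point. The class name `Multiplier31Demo` and the helper `combine` are hypothetical names made up for this illustration, and 32 is used as the even multiplier because a power of two is the most extreme case.

```java
public class Multiplier31Demo {

    // Plain multiplication by 31 (wraps around on int overflow).
    static int times31(int i) {
        return 31 * i;
    }

    // The shift-and-subtract form from the quote: 32 * i - i.
    static int shiftSubtract(int i) {
        return (i << 5) - i;
    }

    // Hypothetical helper: the same "result = m * result + fieldHash" loop
    // that the generated hashCode() uses, with a configurable multiplier.
    static int combine(int multiplier, int[] fieldHashes) {
        int result = 1;
        for (int h : fieldHashes) {
            result = multiplier * result + h;
        }
        return result;
    }

    public static void main(String[] args) {
        // The identity holds for every int, including values that overflow.
        int[] samples = {0, 1, -1, 12345, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int i : samples) {
            System.out.println(i + ": " + (times31(i) == shiftSubtract(i)));
        }

        // Two inputs that differ only in the first of eight field hashes.
        int[] a = {1, 0, 0, 0, 0, 0, 0, 0};
        int[] b = {2, 0, 0, 0, 0, 0, 0, 0};

        // With 32, every step shifts left by 5 bits, so after seven more
        // steps the first field has been pushed out of the 32-bit int.
        System.out.println("multiplier 32 collides: "
                + (combine(32, a) == combine(32, b)));

        // With 31 (odd), the first field still influences the result.
        System.out.println("multiplier 31 collides: "
                + (combine(31, a) == combine(31, b)));
    }
}
```

With an odd multiplier, each multiplication is reversible modulo 2^32, so early fields are never shifted out entirely. That covers the "odd" part of the quote; as Bloch says, the benefit of the multiplier also being prime is less clear.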

A few words on how multiplication can be replaced by a shift. Multiplying by 2 is a simple operation in binary arithmetic: shift the number one position to the left and fill the lowest bit with 0. For example, 4*2 = b100 << 1 = b1000 = 8. If the factor is a power of 2, shift the binary number left by the exponent: 4*8 = 4 * 2^3 = b100 << 3 = b100000 = 32. Since 31 = 2^5 - 1, it follows that 31 * i = 32 * i - i = (i << 5) - i.

The same logic works for division by a power of 2: 8/4 = 8/2^2 = b1000 >> 2 = b10 = 2.
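The shift arithmetic above can be tried directly in Java. This is only a small sketch; the class name `ShiftDemo` and its two helper methods are hypothetical names chosen for the illustration.

```java
public class ShiftDemo {

    // Multiply x by 2^exponent using a left shift.
    static int timesPowerOfTwo(int x, int exponent) {
        return x << exponent;
    }

    // Divide x by 2^exponent using an arithmetic right shift.
    static int divideByPowerOfTwo(int x, int exponent) {
        return x >> exponent;
    }

    public static void main(String[] args) {
        // 4 * 2 and 4 * 8, printed in binary and in decimal.
        int doubled = timesPowerOfTwo(4, 1);
        int timesEight = timesPowerOfTwo(4, 3);
        System.out.println(Integer.toBinaryString(doubled) + " = " + doubled);
        System.out.println(Integer.toBinaryString(timesEight) + " = " + timesEight);

        // 8 / 4, printed the same way.
        int quotient = divideByPowerOfTwo(8, 2);
        System.out.println(Integer.toBinaryString(quotient) + " = " + quotient);
    }
}
```

One caveat: for negative numbers `>>` rounds toward negative infinity while `/` rounds toward zero, so `-7 >> 1` is -4 but `-7 / 2` is -3. The two agree for non-negative values like the ones in this answer.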

Maxim Popov