
A hashtable applies a hash function to an object in order to store it.

This hash function essentially calculates the position of the object in the table.

If we use a Hashtable or HashMap and the table cannot fit more elements, the collection is resized to accommodate them.
This means that each stored element must be rehashed to calculate the new position in the new bigger table.

My question is the following (assuming the above is correct):
I read that String calculates its hashcode from the characters it stores, and additionally that the hash value is stored internally (cached) for performance, so that it does not have to be recalculated.

This is the part I don't get. If the hashcode is based on the characters the String stores, then how is the position in the hashtable calculated?

Is there some extra logic using the hashcode of String? So the String's hashcode is not actually the hashvalue?

Cratylus

1 Answer


The hashcode is not changed. Only the position in the internal table is. Open HashMap and see:

static int indexFor(int h, int length) {
    return h & (length-1);
}

The index within the table (array, actually) is determined based on the hash and the size of the array.

So, when "rehashing" happens, it uses the same hashcode, but a different length, which means the element is put in a different bucket.
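To make this concrete, here is a minimal sketch that reuses the `indexFor` masking shown above. The class name `BucketIndexDemo` and the sample key are made up for illustration, and the sketch skips the extra bit-mixing step that the real `HashMap` applies to the hashcode before indexing.

```java
public class BucketIndexDemo {
    // Same masking as the indexFor method quoted above.
    // Only valid when length is a power of two.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        String key = "hello";   // hypothetical sample key
        int h = key.hashCode(); // derived from the characters; the String caches it

        // The hashcode is the same for both calls; only the table length differs.
        System.out.println("hashCode         = " + h);
        System.out.println("index, length 16 = " + indexFor(h, 16));
        System.out.println("index, length 32 = " + indexFor(h, 32));
    }
}
```

With the larger table, one more bit of the same hashcode takes part in the mask, which is why the element can land in a different bucket without the String ever recomputing its hash.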

Bozho
  • And note that, since length is always a power of two, `h & (length - 1)` is equivalent to `h % length` (which is probably how a lot of students learned to address a hash bucket in their data structures course). – erickson Jan 26 '12 at 21:48
  • @erickson: I am sorry. Why is the `&` here the same as `%`? I didn't get that. – Cratylus Jan 26 '12 at 21:51
  • (Actually, `%` would give the wrong answer for half of the possible values of `h`: the negative ones.) @user384706 It's just bit twiddling. Look at a power of two minus one in binary. – Tom Hawtin - tackline Jan 26 '12 at 22:07
  • @user384706 Division by a power of two is the same as shifting bits to the right. So say the `length` is 256, or 2^8. Dividing a hash code by 256 is equivalent to shifting the hash code 8 bits to the right; the 8 low-order bits that are shifted off during this process are the remainder of the division operation. Masking the hash code with 255 (`length - 1`), using a bitwise AND, gives us that remainder without actually doing a general-purpose division, which can be more expensive than a bitwise `&`. – erickson Jan 26 '12 at 22:08
  • If you really want to know about bit twiddling, get a copy of Hacker's Delight: http://www.amazon.com/Hackers-Delight-Henry-S-Warren/dp/0201914654 – Tom Hawtin - tackline Jan 26 '12 at 22:15
  • @erickson: I think I got it. I am only not clear on why the remainder is `length - 1`. I lost that part :( – Cratylus Jan 26 '12 at 22:20
  • @user384706 Let's work a concrete example, with a hash code of 383785 and a hash table length of 256 (in hexadecimal, 0x5DB29 and 0x100). 383785 % 256 = 41, or 0x29. Additionally, observe that 256 - 1 = 255 (0xFF in hex). 0x5DB29 & 0xFF = 0x29 or 41 decimal. This is no coincidence. Because the divisor is our base raised to a power, the remainder is the lower order digits. – erickson Jan 26 '12 at 23:09
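The arithmetic in the comments above can be checked with a short sketch. The class and method names (`MaskVersusModulo`, `maskIndex`, `remainderIndex`) are made up for illustration. The sketch also tries a negative hash code, where Java's `%` follows the sign of the dividend and so no longer agrees with the mask.

```java
public class MaskVersusModulo {
    // Bucket index by masking; assumes length is a power of two.
    static int maskIndex(int h, int length) {
        return h & (length - 1);
    }

    // Bucket index by remainder; in Java the result takes the sign of h.
    static int remainderIndex(int h, int length) {
        return h % length;
    }

    public static void main(String[] args) {
        int length = 256;    // 0x100
        int hash = 383785;   // 0x5DB29, the value used in the comment above

        // Non-negative hash: both approaches keep the low-order 8 bits.
        System.out.println(remainderIndex(hash, length));
        System.out.println(maskIndex(hash, length));

        // Negative hash: % yields a negative number, which is not a valid
        // array index, while the mask still yields a value in [0, length).
        System.out.println(remainderIndex(-hash, length));
        System.out.println(maskIndex(-hash, length));
        System.out.println(Math.floorMod(-hash, length)); // agrees with the mask
    }
}
```

For non-negative hash codes the two methods agree, which is the equivalence erickson describes. For negative ones only the mask (or `Math.floorMod`) gives a usable bucket index.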