It's not difficult to write a program to demonstrate rehashing, but you have to understand a lot about HashMap's internal organization, how objects' hashcodes are generated, how hashcodes are related to HashMap's internal structures, and how this affects iteration order.
Briefly, HashMap consists of an array of buckets (the "table"). Each bucket is a linked list of key-value pairs. When you add a pair whose key hashes to a bucket that's already occupied, the pair is appended to the end of that bucket's linked list. The bucket is determined by calling the key's hashCode() method, XORing the result with its own high-order 16 bits (that is, the hash code unsigned-right-shifted by 16; see source), and then taking the result modulo the table size. Since the table size is always a power of two, this amounts to ANDing with a mask of (tablesize-1). The hash code of an Integer object is simply its integer value (source). Finally, iterating a HashMap steps through the buckets sequentially, and sequentially through the linked list of pairs within each bucket.
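The bucket calculation described above can be sketched in a few lines. This is a minimal illustration that assumes non-null keys. The names BucketIndex, spread and indexFor are made up for this example and are not part of HashMap's API.

```java
final class BucketIndex {
    // Spreading step: XOR the hash code with its own high-order 16 bits.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Bucket index for a non-null key, assuming tableSize is a power of two.
    static int indexFor(Object key, int tableSize) {
        return spread(key.hashCode()) & (tableSize - 1);
    }

    public static void main(String[] args) {
        int h = spread("rehash".hashCode());
        // With a power-of-two table size, masking agrees with a non-negative modulus.
        System.out.println((h & 15) == Math.floorMod(h, 16)); // true
        System.out.println(indexFor(8, 8));  // 0
        System.out.println(indexFor(8, 16)); // 8
    }
}
```

Math.floorMod is used for the comparison rather than the % operator because % can return a negative result for a negative hash, whereas the mask never does.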
After all that, you can see that small integer values end up in the correspondingly numbered buckets. For example, Integer.valueOf(0).hashCode() is 0. It remains 0 after the shift-and-XOR, and 0 modulo any table size is still 0. Thus, Integer 0 ends up in bucket 0, Integer 1 in bucket 1, and so forth. But don't forget that the bucket index is taken modulo the table size, so if the table size is 8, Integer 8 ends up in bucket 0.
With this information, we can populate a HashMap with Integer keys that will end up in predictable buckets. Let's create a HashMap with a table size of 8 and the default load factor of 0.75. The resize threshold is then 8 × 0.75 = 6, meaning that we can add six mappings before rehashing occurs; the seventh triggers it.
Map<Integer, Integer> map = new HashMap<>(8);
map.put(0, 0);
map.put(8, 8);
map.put(1, 1);
map.put(9, 9);
map.put(2, 2);
map.put(10, 10);
System.out.println(map);
{0=0, 8=8, 1=1, 9=9, 2=2, 10=10}
Printing out the map (essentially, using its toString() method) iterates the map sequentially as described above. We can see that 0 and 8 end up in the first bucket, 1 and 9 in the second, and 2 and 10 in the third. Now let's add another entry:
map.put(3, 3);
System.out.println(map);
{0=0, 1=1, 2=2, 3=3, 8=8, 9=9, 10=10}
The iteration order changed! Adding the seventh mapping exceeded the threshold for rehashing, so the table size was doubled to 16. Every entry was then redistributed, this time with a modulus of 16 instead of 8. Whereas 0 and 8 were both in bucket 0 before, now they're in separate buckets, since there are twice as many buckets available. Same with 1/9 and 2/10. The second entry in each bucket with the old table size of 8 now hashes to its own bucket when the table size is 16. You can see this in the output: iteration proceeds sequentially through the buckets, and each occupied bucket now holds exactly one entry, so the keys come out in ascending order.
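The before-and-after orders can be reproduced without touching HashMap at all. This rough sketch models the table as a list of lists, following the description above. BucketLayout and its methods are hypothetical names for this illustration; it re-inserts every key from scratch rather than imitating how HashMap actually moves entries during a resize.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BucketLayout {
    // Model only: distribute Integer keys into tableSize buckets in insertion order.
    static List<List<Integer>> layout(List<Integer> keys, int tableSize) {
        List<List<Integer>> buckets = new ArrayList<>();
        for (int i = 0; i < tableSize; i++) {
            buckets.add(new ArrayList<>());
        }
        for (Integer key : keys) {
            int h = key.hashCode();
            int index = (h ^ (h >>> 16)) & (tableSize - 1);
            buckets.get(index).add(key); // a colliding key goes to the end of its bucket
        }
        return buckets;
    }

    // Iteration order: bucket by bucket, then entry by entry within each bucket.
    static List<Integer> iterationOrder(List<Integer> keys, int tableSize) {
        List<Integer> order = new ArrayList<>();
        for (List<Integer> bucket : layout(keys, tableSize)) {
            order.addAll(bucket);
        }
        return order;
    }

    public static void main(String[] args) {
        List<Integer> keys = Arrays.asList(0, 8, 1, 9, 2, 10, 3);
        System.out.println(iterationOrder(keys.subList(0, 6), 8)); // [0, 8, 1, 9, 2, 10]
        System.out.println(iterationOrder(keys, 16));              // [0, 1, 2, 3, 8, 9, 10]
    }
}
```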
Of course, I chose the integer values carefully such that collisions occur with the table size of 8 and do not occur with a table size of 16. That lets us see the rehashing very clearly. With more typical objects, the hash codes (and thus the buckets) are harder to predict, so it's harder to see the collisions and what gets shifted around when rehashing occurs.
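For keys whose hash codes aren't obvious, you can still compute where each one lands before and after the table doubles. The sketch below does this for a few arbitrary String keys; SplitCheck and its methods are hypothetical names, and the keys are just examples. Because doubling the table exposes exactly one more bit of the hash, a key either stays at its old index or moves up by the old table size.

```java
final class SplitCheck {
    // Bucket index for a non-null key, assuming tableSize is a power of two.
    static int index(Object key, int tableSize) {
        int h = key.hashCode();
        return (h ^ (h >>> 16)) & (tableSize - 1);
    }

    // True if, after doubling, the key keeps its index or moves up by exactly oldSize.
    static boolean staysOrMovesByOldSize(Object key, int oldSize) {
        int before = index(key, oldSize);
        int after = index(key, oldSize * 2);
        return after == before || after == before + oldSize;
    }

    public static void main(String[] args) {
        String[] keys = {"alpha", "beta", "gamma", "delta", "epsilon"};
        for (String key : keys) {
            System.out.printf("%-8s size 8 -> bucket %d, size 16 -> bucket %d, stays or moves by 8: %b%n",
                    key, index(key, 8), index(key, 16), staysOrMovesByOldSize(key, 8));
        }
    }
}
```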