4

In the Java 7 HashMap implementation, in the resize method, it calls transfer which moves old elements to a new table. Why have they written a new method instead of calling put with all the old elements? The resize won't be triggered again due to threshold. Calling put makes the code clearer.

/**
 * Transfers all entries from current table to newTable.
 */
void transfer(Entry[] newTable) {
    Entry[] src = table;
    int newCapacity = newTable.length;
    for (int j = 0; j < src.length; j++) {
        Entry<K,V> e = src[j];
        if (e != null) {
            src[j] = null;
            do {
                Entry<K,V> next = e.next;
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            } while (e != null);
        }
    }
}
Henry Twist
  • 5,666
  • 3
  • 19
  • 44
Runpeng Chen
  • 105
  • 8
  • 1
    I think this probably didn't deserve to be closed. Although technically this could be opinion-based, there is a significant difference between using `put` and `transfer` (as @JoachimSauer pointed out), which would seem to answer the question perfectly. – Henry Twist Jun 17 '21 at 13:03
  • @HenryTwist I vote to reopen. It is not just opinion based. As JoachimSauer mentioned, the developer consider performance over code cleanness. I will consider GC time and `hashCode` as fact instead of opinions. – Runpeng Chen Jun 17 '21 at 18:08

1 Answers1

5

One important difference is that transfer can make use of the fact that Entry objects already exist for each entry in two ways:

  • it can reuse the Entry objects themselves to avoid having to allocate new ones (thus avoiding memory allocation and therefore reducing GC pressure).
  • it can reuse the hash value stored in the Entry object thus avoiding having to call Object.hashValue on each key that's already in the map (which could theoretically be an expensive operation).

Basically: if resize was implemented simply in terms of put it would have to re-do a lot of work that can easily be avoided.

Later versions of the JDK have significantly more complex HashMap implementations, where even more complex transfer methods (or equivalent) are implemented.

It's also worth pointing out that even minor performance gains at the cost of less "simple" code are often worth it when done in the JDK itself: since HashMap is used in basically every Java program out there, having it be as performant as possible at the cost of slightly less readable cost is usually worthwhile trade-off. That same reasoning does not apply equally to most other software that us "mere mortals" write.

Joachim Sauer
  • 302,674
  • 57
  • 556
  • 614