5

In ConcurrentHashMap.putVal() (JDK Version: 11; ConcurrentHashMap.java; line 1010)

final V putVal(K key, V value, boolean onlyIfAbsent) {
   if (key == null || value == null) throw new NullPointerException();
   int hash = spread(key.hashCode());
   int binCount = 0;
   for (Node<K,V>[] tab = table;;) {
       ...
   }
   addCount(1L, binCount);
   return null;
}

Why does it use the variable tab to reference the table? Likewise in ConcurrentHashMap.get() (beginning on line 934)

public V get(Object key) {
    Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
    int h = spread(key.hashCode());
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (e = tabAt(tab, (n - 1) & h)) != null) {
        if ((eh = e.hash) == h) {
            if ((ek = e.key) == key || (ek != null && key.equals(ek)))
                return e.val;
        }
        else if (eh < 0)
            return (p = e.find(h, key)) != null ? p.val : null;
        while ((e = e.next) != null) {
            if (e.hash == h &&
                ((ek = e.key) == key || (ek != null && key.equals(ek))))
                return e.val;
        }
    }
    return null;
}
Boann

3 Answers

5

If you use the field table directly, the instance it points to can change while you are working on it, which can lead to inconsistent results or exceptions. Thus, you need to "fixate" it locally and use that local variable.

I assume this is done to prevent such behavior if the map is used by two threads at once in write mode, which should not be done*. The instance that table points to can change even in the non-concurrent HashMap.

An alternative to this would be using the keyword synchronized, but that reduces performance.

* You can read from a HashMap in multiple threads without issue as long as it is not being modified while those threads hold it.

John Kugelman
ASA
2

It's easier to see why Java does this in HashMap where the resize() method sets table = newTab. Any method that was reading the table during a resize() operation would have the reference pulled out from under them and reassigned, which would cause unpredictable behavior.

Volatile could ensure a reading method is updated with the latest table; but that is not at all what we want. We want the reading method to continue uninterrupted with the values that were in the table when it began reading.

Synchronized could block reads and writes from happening simultaneously, but with a performance penalty. If we wanted that, we could revert to using Hashtable.

The same basic reasoning applies to ConcurrentHashMap and its more complicated transfer() method which also reassigns the table reference. The reference is copied into a local variable to avoid losing it during reassignment.
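The effect described above can be reproduced deterministically with a minimal sketch. Everything here is hypothetical and heavily simplified, not JDK code: `TinyTable`, its methods, and the `interleaved` callback (which stands in for another thread running between two reads of the field) are assumptions made for illustration only.

```java
public class SnapshotDemo {
    /** Hypothetical toy table: each slot simply stores the hash that was put there. */
    static class TinyTable {
        private volatile Integer[] table = new Integer[4];

        void put(int hash) {
            Integer[] tab = table;                 // one read of the field
            tab[(tab.length - 1) & hash] = hash;
        }

        /** Doubles the capacity and reassigns the field, like resize() doing table = newTab. */
        void resize() {
            Integer[] oldTab = table;
            Integer[] newTab = new Integer[oldTab.length * 2];
            for (Integer h : oldTab) {
                if (h != null) {
                    newTab[(newTab.length - 1) & h] = h;
                }
            }
            table = newTab;
        }

        /** Reads the field twice; the two reads may observe different arrays. */
        Integer lookupReadingFieldTwice(int hash, Runnable interleaved) {
            int n = table.length;                  // first read: length of the old array
            interleaved.run();                     // simulated concurrent resize
            return table[(n - 1) & hash];          // second read: old index, new array
        }

        /** Reads the field once and works on that snapshot only. */
        Integer lookupWithLocalCopy(int hash, Runnable interleaved) {
            Integer[] tab = table;                 // snapshot
            int n = tab.length;
            interleaved.run();                     // the resize cannot affect 'tab'
            return tab[(n - 1) & hash];
        }
    }

    public static void main(String[] args) {
        TinyTable a = new TinyTable();
        a.put(7);
        System.out.println(a.lookupReadingFieldTwice(7, a::resize)); // prints "null"

        TinyTable b = new TinyTable();
        b.put(7);
        System.out.println(b.lookupWithLocalCopy(7, b::resize));     // prints "7"
    }
}
```

In the first lookup the index is computed from the old length (4) but applied to the new array (length 8), where the entry has moved from slot 3 to slot 7, so the lookup misses. The second lookup keeps using the array it started with and finds the entry.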

jaco0646
  • HashTable also uses this – James Marva Apr 15 '20 at 14:42
  • ```java
    public synchronized V get(Object key) {
        Entry<?,?> tab[] = table;
        int hash = key.hashCode();
        int index = (hash & 0x7FFFFFFF) % tab.length;
        for (Entry<?,?> e = tab[index] ; e != null ; e = e.next) {
            if ((e.hash == hash) && e.key.equals(key)) {
                return (V)e.value;
            }
        }
        return null;
    }
    ```
    – James Marva Apr 15 '20 at 14:43
  • The `rehash()` method in `Hashtable` is in fact _not_ synchronized, so apparently it still has to protect against the same `table` reassignment. – jaco0646 Apr 15 '20 at 15:04
0

For performance. Using a local variable inside a method is more efficient than repeatedly reading an instance field (a "global variable"). Let's look at a code block.

public class LocalFieldDemo {
  private String[] globalArr = new String[123];
  public void test() {
      String[] localArr = globalArr;
      for (int i = 0; i < 123; i++) {
          System.out.println(localArr[i]);
      }
  }
}

Its bytecode:

public void test();
  Code:
     0: aload_0
     1: getfield      #3                  // Field globalArr:[Ljava/lang/String;
     4: astore_1
     5: iconst_0
     6: istore_2
     7: iload_2
     8: bipush        123
    10: if_icmpge     28
    13: getstatic     #4                  // Field java/lang/System.out:Ljava/io/PrintStream;
    16: aload_1
    17: iload_2
    18: aaload
    19: invokevirtual #5                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
    22: iinc          2, 1
    25: goto          7
    28: return

Now the second code block:

public class InstanceFieldDemo {
  private String[] globalArr = new String[123];
  public void test() {
      for (int i = 0; i < 123; i++) {
          System.out.println(globalArr[i]);
      }
  }
}

The bytecode of this code block:

public void test();
  Code:
     0: iconst_0
     1: istore_1
     2: iload_1
     3: bipush        123
     5: if_icmpge     26
     8: getstatic     #4                  // Field java/lang/System.out:Ljava/io/PrintStream;
    11: aload_0
    12: getfield      #3                  // Field globalArr:[Ljava/lang/String;
    15: iload_1
    16: aaload
    17: invokevirtual #5                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
    20: iinc          1, 1
    23: goto          2
    26: return

What is the difference? In the second bytecode we see:

11: aload_0
12: getfield      #3                  // Field globalArr:[Ljava/lang/String;

but in the first bytecode:

16: aload_1

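If we read the value through the instance field, the JVM must execute aload_0 to load this and then getfield to fetch globalArr on every loop iteration. In the first code block, which copies the field into a local variable, a single aload_1 per iteration is enough. At the bytecode level, the first code block therefore has better performance.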