I have recently been learning about the HotSpot JVM. While studying the string constant pool and the String.intern() method, I ran into a very strange situation. After reading a lot of answers I still can't explain it, so I'm posting it here for discussion.
// Version 1: s1.intern() is called
public static void main(String[] args) {
    String s1 = new String("12") + new String("21");
    s1.intern();
    String s2 = "1221";
    System.out.println(s1 == s2); // true
}

// Version 2: s1.intern() is commented out
public static void main(String[] args) {
    String s1 = new String("12") + new String("21");
    // s1.intern();
    String s2 = "1221";
    System.out.println(s1 == s2); // false
}
These results are from Java 8.
The only difference between the two snippets is whether s1.intern() is called.
Here is the documentation of the intern method:
When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.
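To make the quoted contract concrete, here is a minimal sketch. The class name InternContractDemo and the string "hello" are made up for illustration and are not part of my original code. It only relies on behavior that the Javadoc and the language specification guarantee: string literals refer to the pooled instance, and new String(...) always creates a separate object.

```java
public class InternContractDemo {
    // A literal and new String(...) with equal contents are different objects,
    // but intern() on the copy returns the instance the literal refers to.
    static boolean internReturnsLiteralInstance() {
        String literal = "hello";
        String copy = new String("hello");
        return copy != literal && copy.intern() == literal;
    }

    // Two distinct heap strings with equal contents intern to one instance.
    static boolean equalStringsInternToSameInstance() {
        String a = new String("hello");
        String b = new String("hello");
        return a != b && a.intern() == b.intern();
    }

    public static void main(String[] args) {
        System.out.println(internReturnsLiteralInstance());     // true
        System.out.println(equalStringsInternToSameInstance()); // true
    }
}
```

Note that both checks use the return value of intern(), which my snippets above do not.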
Here is my understanding:
- By inspecting the class file, we can find "12", "21", and "1221" in the constant pool.
- When the class is loaded, the constant pool in the class file is loaded into the run-time constant pool, so the string pool contains "12", "21", and "1221".
- new String("12") creates a String instance on the heap, which is different from the "12" in the string pool. The same goes for new String("21").
- The "+" operator is compiled into a StringBuilder with calls to its append and toString methods, as can be seen in the bytecode.
- The toString method calls new String, so s1 is a String instance "1221" on the heap.
- s1.intern() looks in the string pool, finds that "1221" is already there, and so does nothing. By the way, we don't use the return value, so the call should have no effect on s1.
- String s2 = "1221" just loads the "1221" instance from the string pool. In the bytecode this is ldc #11, where #11 is the index of "1221" in the constant pool.
- The "==" operator compares the addresses of reference types. s1 points to the instance on the heap and s2 points to the instance in the string pool. How can these two be equal?
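The StringBuilder step in the list above can be written out by hand. This is only an approximation of what the Java 8 compiler emits for the "+" expression (newer JDKs may compile string concatenation differently), and the class name ConcatDesugarDemo is made up for illustration.

```java
public class ConcatDesugarDemo {
    // Hand-written approximation of what javac 8 generates for
    // new String("12") + new String("21").
    static String concat() {
        return new StringBuilder()
                .append(new String("12"))
                .append(new String("21"))
                .toString();
    }

    public static void main(String[] args) {
        String first = concat();
        String second = concat();
        System.out.println(first.equals(second)); // true: same characters
        System.out.println(first == second);      // false: toString() allocates a new String on each call
    }
}
```

So the result of the concatenation is always a fresh object, no matter what the string pool contains.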
My questions:
- What exactly do s1 and s2 point to?
- Why does calling the intern() method change the behavior, even though the return value is not used?
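One way to probe these questions is to keep the return value of intern() and compare it with both s1 and s2. This is a diagnostic sketch, not an answer; the class name InternProbe and the helper methods are made up. Only the first comparison is fixed by the specification, so I leave the other two without an expected output.

```java
public class InternProbe {
    // Builds "1221" at run time, as in the snippets above.
    static String buildAtRuntime() {
        return new String("12") + new String("21");
    }

    // Guaranteed by the spec: intern() and an equal literal yield the same instance.
    static boolean pooledMatchesLiteral() {
        String pooled = buildAtRuntime().intern();
        String literal = "1221";
        return pooled == literal;
    }

    public static void main(String[] args) {
        String s1 = buildAtRuntime();
        String pooled = s1.intern(); // keep the return value this time
        String s2 = "1221";

        System.out.println("pooled == s2: " + (pooled == s2)); // true
        System.out.println("pooled == s1: " + (pooled == s1)); // implementation-dependent
        System.out.println("s1 == s2:     " + (s1 == s2));     // implementation-dependent
    }
}
```

If "pooled == s1" prints true, then intern() returned s1 itself, which would mean the pool held no "1221" before the call and s1 became the pooled instance.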
Here are my assumptions:
The string pool is not initialized when the class is loaded. Some answers say that the call to s1.intern() is the first time "1221" is put into the string pool. But then how do we explain that "1221" is in the constant pool of the class file? Is there any specification of when strings are added to the string pool?
Another explanation is that intern just saves a reference to the instance on the heap. But the references s1 and s2 would still be different: s1 points to the heap, s2 points to the string pool, and the string pool points to the heap. A reference is different from a reference to a reference.
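A small experiment that might help separate these assumptions is to evaluate the literal before s1 is created. This is a sketch under the assumption that the order of evaluation is what matters; the class name LiteralFirstDemo is made up. Here the result does not depend on the JVM version, because the literal already refers to a pooled instance before the concatenation creates its new object.

```java
public class LiteralFirstDemo {
    static boolean literalFirst() {
        String s2 = "1221";                              // literal evaluated before s1 exists
        String s1 = new String("12") + new String("21"); // newly created object
        s1.intern();                                     // pool already has "1221"; return value ignored
        return s1 == s2;
    }

    public static void main(String[] args) {
        System.out.println(literalFirst()); // false
    }
}
```

Comparing this with the first snippet shows that the position of the literal relative to intern() changes the outcome, which seems consistent with the first assumption.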