
In Section 5.1.4 of Modern Compiler Implementation in Java:

public class Symbol {
  private String name;
  private Symbol (String n) { name = n; }
  private static java.util.Dictionary dict = new java.util.Hashtable();

  public String toString() { return name; }

  public static Symbol symbol(String n) {
    String u = n.intern();
    Symbol s = (Symbol)dict.get(u);
    if (s == null) { s = new Symbol(u); dict.put(u, s); }
    return s;
  }
}

I can't see why String.intern() is used here, since Hashtable uses key.equals(...) to compare keys.

Could you please tell me the reason? Thanks!

abcdabcd987

2 Answers


In programming there is a lot of "wisdom", "rumours", "magic" or "superstition" going around.

As @RealSkeptic points out, before Java 7u6, String.substring shared the original string's backing array rather than copying the requested portion. This improved performance in many cases, but it could cause memory leaks, because a short substring kept the whole original array reachable. Using intern() was one way to avoid this, though it could create its own memory clean-up problems. Using new String(oldString) was another approach, but you shouldn't need either now.

People often try things for "performance reasons" but don't know how to test them, or just don't check that they actually help. I do this from time to time, even though I know to avoid it, because it is far too often incorrect, or just makes the code confusing.

Most likely the author found a situation, or heard that someone had saved a lot of memory, by using String.intern(). In specific cases it can do this, but it's not like fairy dust where you sprinkle a bit of performance magic and everything is better. Most of these obscure tricks to optimise code only work in very specific use cases.

A similar example is when people use locks or thread-safe collections in multi-threaded code. Sprinkle enough of these around and an error can appear to go away, but you haven't really fixed the problem, just made it harder to find when something incidental changes and your bug shows itself again.

Peter Lawrey
    I'm not entirely sure. I think the class is probably part of a whole compilation framework. You'd be using it by parsing the data out of some string. It is also very old (no generics), and therefore, comes from the days `substring` was using the original `String`'s entire backing array. They could be worried about keeping the original huge array around, or the strings used for the symbols may be kept in a separate data structure as well where they may be repeated several times. – RealSkeptic Mar 30 '16 at 08:14

I hope you know what String#intern does. Simply put, it adds the given string to the pool of strings maintained by the String class if it is not already there; if an equal string is already in the pool, that pooled object is returned. So there will be only one copy of any particular String value in the pool.
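A minimal sketch of that behaviour. The class name InternDemo and the helper fresh are made up for illustration; only String.intern(), equals and == come from the JDK.

```java
final class InternDemo {
    // Hypothetical helper: builds a String at run time so that it is a
    // distinct object from any literal or previously created String.
    static String fresh(String s) {
        return new String(s.toCharArray());
    }

    public static void main(String[] args) {
        String a = fresh("compiler");
        String b = fresh("compiler");

        System.out.println(a == b);                   // false: two separate objects
        System.out.println(a.equals(b));              // true: same characters
        System.out.println(a.intern() == b.intern()); // true: one pooled instance
    }
}
```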

What this means is that if aString.intern() is always what gets put in the Map, then the next time anotherString.intern() is looked up in the map, equals will return true at its initial == check. This avoids walking through the entire string to verify equality. It could be a significant performance improvement if the Strings stored in the map are large and the Map is searched (get or contains operations) frequently.
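To make the comparison concrete, here is a rough sketch of the symbol table with and without the intern() call. The class Sym and the methods withIntern and withoutIntern are hypothetical names, and it uses a generic HashMap instead of the book's raw Hashtable. Both variants return one canonical object per name, so interning is not needed for correctness; it only lets String.equals succeed at its identity check.

```java
import java.util.HashMap;
import java.util.Map;

final class Sym {
    private final String name;
    private Sym(String n) { name = n; }

    // Two separate tables so the variants do not influence each other.
    private static final Map<String, Sym> INTERNED = new HashMap<>();
    private static final Map<String, Sym> PLAIN = new HashMap<>();

    @Override
    public String toString() { return name; }

    // Variant following the book: the key is interned before the lookup, so a
    // stored key and a later lookup key with the same text are the same object.
    static Sym withIntern(String n) {
        return INTERNED.computeIfAbsent(n.intern(), Sym::new);
    }

    // Variant without intern(): the map still finds the entry via
    // hashCode() and equals(), comparing characters when the objects differ.
    static Sym withoutIntern(String n) {
        return PLAIN.computeIfAbsent(n, Sym::new);
    }

    public static void main(String[] args) {
        String x1 = new String("count".toCharArray());
        String x2 = new String("count".toCharArray());

        System.out.println(withIntern(x1) == withIntern(x2));       // true
        System.out.println(withoutIntern(x1) == withoutIntern(x2)); // true
    }
}
```

Note that intern() itself has to hash the string and look it up in the pool, so whether the first variant is faster overall depends on the workload and is worth measuring.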

Amudhan