The following is taken from the official¹ OpenJDK 11 source code².
Starting with the String.replaceAll method itself:
public String replaceAll(String regex, String replacement) {
    return Pattern.compile(regex).matcher(this).replaceAll(replacement);
}
No caching here. Next, the Pattern.compile method:
public static Pattern compile(String regex) {
    return new Pattern(regex, 0);
}
No caching there either. And not in the private Pattern constructor either.
The Pattern constructor uses an internal compile() method to do the work of compiling the regex to the internal form. It takes steps to avoid the same Pattern object being compiled twice. But as you can see from the above, each replaceAll call generates a single-use Pattern object.
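To make the "single-use Pattern" point concrete, here is a minimal sketch. The class and method names (SingleUsePatternDemo, collapseWithString, and so on) and the regex are made up for illustration and are not part of the JDK. It checks that two Pattern.compile calls with the same regex return different objects, and contrasts String.replaceAll with a Pattern that the caller compiles once and reuses.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; all names here are illustrative only.
class SingleUsePatternDemo {
    // Compiled once by the caller and reused. The JDK does not do this for you.
    private static final Pattern SPACES = Pattern.compile("\\s+");

    // Compiles "\\s+" again on every call, because String.replaceAll
    // delegates to Pattern.compile(regex) each time.
    static String collapseWithString(String input) {
        return input.replaceAll("\\s+", " ");
    }

    // Reuses the precompiled Pattern; only a new Matcher is created per call.
    static String collapseWithPattern(String input) {
        return SPACES.matcher(input).replaceAll(" ");
    }

    // True if two compile calls for the same regex return different objects.
    static boolean compileReturnsDistinctObjects(String regex) {
        return Pattern.compile(regex) != Pattern.compile(regex);
    }

    public static void main(String[] args) {
        System.out.println(compileReturnsDistinctObjects("\\s+"));
        System.out.println(collapseWithString("a  b\t c"));
        System.out.println(collapseWithPattern("a  b\t c"));
    }
}
```

Both helpers produce the same result; the only difference is where the compilation cost is paid.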
So why are you seeing a speedup in those performance figures?
They could be using an old version (before Java 6) of Pattern that might have³ cached compiled patterns.
The most likely explanation is that this is just a JVM warmup effect. A well-written benchmark should account for that, but the benchmark used in that blog does not do proper warmup.
In short, the speedup that you think is caused by some "optimization" is most likely just the result of JVM warmup effects such as JIT compilation of the Pattern, Matcher and related classes.
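If you want to see the effect for yourself, a rough sketch like the following may help. It is not a rigorous benchmark; the class name WarmupDemo, the input string, the regex and the batch size are all arbitrary choices of mine. It times several identical batches of replaceAll calls in one JVM. On a typical HotSpot JVM the early batches tend to be slower than the later ones, even though every call recompiles the regex, but the actual numbers will vary by machine and JVM.

```java
// Hypothetical timing sketch; names and sizes are illustrative only.
class WarmupDemo {
    // Written to after each batch so the JIT cannot discard the work.
    static volatile int sink;

    // Runs replaceAll `iterations` times and returns the elapsed nanoseconds.
    static long timeBatch(String input, int iterations) {
        long start = System.nanoTime();
        int total = 0;
        for (int i = 0; i < iterations; i++) {
            // Each call compiles a fresh Pattern; nothing is cached.
            total += input.replaceAll("[aeiou]", "*").length();
        }
        long elapsed = System.nanoTime() - start;
        sink = total;
        return elapsed;
    }

    public static void main(String[] args) {
        String input = "the quick brown fox jumps over the lazy dog";
        // Identical work in every batch; only the JVM's state changes.
        for (int batch = 1; batch <= 5; batch++) {
            long micros = timeBatch(input, 20_000) / 1_000;
            System.out.printf("batch %d: %d us%n", batch, micros);
        }
    }
}
```

For real measurements, a harness that handles warmup for you (for example JMH) is a better choice than hand-rolled timing loops like this one.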
1 - The OpenJDK source code for Java 6 onwards can be downloaded from https://openjdk.java.net/
2 - The OpenJDK 6 source code is doing the same thing: no caching.
3 - I have not checked, but it is moot. Performance benchmarks based on EOL versions of Java are not instructive for current versions of Java. Nobody should still be using Java 5. If they are, performance of replaceAll is the least of their worries.