4

To search for s in S (where size(S) >= size(s)) and return a true/false value, is it better for performance to use Apache's StringUtils.contains(), or a Boyer-Moore implementation I found that someone else has implemented and tested well?

Thanks

xuongrong

2 Answers

8

The last time I looked into the Java regex matching code while debugging, the Java 7 regex engine used the Boyer-Moore algorithm for sequences of literal text matches. So the easiest way to find a String using Boyer-Moore is to compile the pattern once with p = Pattern.compile(searchString, Pattern.LITERAL) and then search with p.matcher(toSearchOn).find(). No third-party libraries and no handcrafted code are needed, and I believe the JRE classes are well tested…
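A minimal sketch of that approach, using only java.util.regex. The class name LiteralSearch, the helper containsLiteral, and the sample strings are made up for illustration; whether Boyer-Moore is actually used underneath depends on the JRE version.

```java
import java.util.regex.Pattern;

public class LiteralSearch {
    // Hypothetical helper: returns true if the literal pattern occurs anywhere in haystack.
    public static boolean containsLiteral(Pattern literal, String haystack) {
        return literal.matcher(haystack).find();
    }

    public static void main(String[] args) {
        // Without Pattern.LITERAL the '.' would act as a regex wildcard.
        String searchString = "a.b";

        // Compile once and reuse the Pattern for every search.
        Pattern p = Pattern.compile(searchString, Pattern.LITERAL);

        System.out.println(containsLiteral(p, "xxa.bxx")); // true
        System.out.println(containsLiteral(p, "xxaXbxx")); // false
    }
}
```

Compiling the Pattern once matters if the same search string is applied to many inputs, since any preprocessing of the literal is done at compile time rather than on every search.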

Holger
0

Apache Commons Lang uses the Java API's region matching for its contains implementation. It is hard to say which is faster just by looking at the code. This sounds like an opportunity to build a simple test case, run it both ways, and see.
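A rough sketch of such a test case, assuming the standard library only: it times String.contains against a precompiled Pattern.LITERAL matcher, and the StringUtils.contains call could be swapped in where viaContains is used. The class name, input sizes, and round count are arbitrary choices, and hand-rolled timing like this is easily skewed by JIT warm-up, so treat the numbers as indicative at best; a harness such as JMH is the more trustworthy option.

```java
import java.util.regex.Pattern;

public class ContainsTiming {
    // Candidate 1: plain String.contains (swap in StringUtils.contains to compare Apache).
    public static boolean viaContains(String haystack, String needle) {
        return haystack.contains(needle);
    }

    // Candidate 2: precompiled literal Pattern.
    public static boolean viaPattern(Pattern literal, String haystack) {
        return literal.matcher(haystack).find();
    }

    // Builds a long haystack with the needle at the very end (worst case for a forward scan).
    public static String buildHaystack(int repeats, String needle) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < repeats; i++) {
            sb.append("abcdefghij");
        }
        return sb.append(needle).toString();
    }

    public static void main(String[] args) {
        String needle = "needle!";
        String haystack = buildHaystack(100_000, needle);
        Pattern p = Pattern.compile(needle, Pattern.LITERAL);
        int rounds = 200;
        int hits = 0; // accumulated so the JIT cannot discard the calls

        // Warm-up so both code paths are compiled before timing starts.
        for (int i = 0; i < rounds; i++) {
            if (viaContains(haystack, needle)) hits++;
            if (viaPattern(p, haystack)) hits++;
        }

        long t0 = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            if (viaContains(haystack, needle)) hits++;
        }
        long t1 = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            if (viaPattern(p, haystack)) hits++;
        }
        long t2 = System.nanoTime();

        System.out.printf("String.contains: %d us%n", (t1 - t0) / 1_000);
        System.out.printf("Pattern.LITERAL: %d us%n", (t2 - t1) / 1_000);
        System.out.println("hits = " + hits);
    }
}
```

Results will vary with the JRE version, the length of the search string, and where the match occurs, so it is worth repeating the run with inputs that resemble the real data.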

WPrecht
  • I will try your suggestion. Sorry, I don't have enough reputation to vote for your answer. – xuongrong Nov 18 '13 at 16:18
  • If you're going to do performance testing in this style, it's a good idea to use a framework, such as OpenJDK JMH (http://openjdk.java.net/projects/code-tools/jmh/). – ngreen Jul 15 '14 at 13:24