8

I need to extract the first integer found in a java.lang.String and am unsure whether to use a substring approach or a regex approach:

// Want to extract the 510 into an int.
String extract = "PowerFactor510";

// Either:
int num = Integer.valueOf(extract.substring(???));

// Or a regex solution, something like:
String regex = "\\d+";
Matcher matcher = new Matcher(regex);
int num = matcher.find(extract);

So I ask:

  • Which type of solution is more appropriate here, and why?
  • If the substring approach is more appropriate, what could I use to indicate the beginning of a number?
  • Else, if the regex is the appropriate solution, what is the regex/pattern/matcher/method I should use to extract the number?

Note: The string will always begin with the word PowerFactor followed by a non-negative integer. Thanks in advance!

CloudyMarble
IAmYourFaja
  • 1
    Regex would be more advisable due to faster processing. – Kumar Shorav Mar 25 '13 at 13:25
  • 3
    Is regex really faster than `substring(11)`? The first part is always fixed... I don't think that parsing a regex, going through the string and extracting the appropriate group would be quicker than to just chop off the first 11 chars... – ppeterka Mar 25 '13 at 13:26
  • http://docs.oracle.com/javase/6/docs/api/java/lang/String.html#substring(int) – Kent Mar 25 '13 at 13:28

2 Answers

9

The string will always begin with the word "PowerFactor" followed by a non-negative integer

This means you know exactly at which index the number starts, so I would say you are better off using substring directly. In terms of performance, at least, it should be much faster than searching and matching.

extract.substring("PowerFactor".length());
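A minimal sketch of the complete substring approach, assuming the input always starts with the fixed prefix as stated in the question. The class name `PowerFactorParser`, the `PREFIX` constant and the `parse` method are hypothetical names chosen for illustration:

```java
public class PowerFactorParser {
    // Fixed prefix from the question; assumed to always be present.
    private static final String PREFIX = "PowerFactor";

    // Hypothetical helper: drops the prefix and parses the remainder as an int.
    public static int parse(String input) {
        if (!input.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Expected prefix " + PREFIX + ": " + input);
        }
        // Integer.parseInt throws NumberFormatException if the rest is not a valid int.
        return Integer.parseInt(input.substring(PREFIX.length()));
    }

    public static void main(String[] args) {
        System.out.println(parse("PowerFactor510")); // prints 510
    }
}
```

The `startsWith` check is optional given the guarantee in the question; it only makes a violated assumption fail loudly instead of producing a confusing parse error.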

I could not find any direct comparison, but you can read about each of the two options:

Community
CloudyMarble
1

I was a bit curious and tried the following:

String extract = "PowerFactor510";
long l = System.currentTimeMillis();
System.out.println(extract.replaceAll("\\D", ""));
System.out.println(System.currentTimeMillis() - l);

System.out.println();

l = System.currentTimeMillis();
System.out.println(extract.substring("PowerFactor".length()));
System.out.println(System.currentTimeMillis() - l);

And it turned out that the second test was much faster, so substring wins.

tmwanik
  • Why in the world would you put `\D` in brackets there? – tchrist Mar 25 '13 at 15:05
  • @tchrist Edited the answer – tmwanik Mar 25 '13 at 15:08
  • 3
    That is a horrible test. The replaceAll method of the String class performs an inline compile of the regex before processing it, so the method does not yield a suitable test against Pattern/Matcher or anything regex related. The speed differences you are seeing are related to object creation and GC in the JVM. – ingyhere Apr 29 '14 at 22:18
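
To address the objection above, a rough comparison can compile the pattern once and time many iterations with `System.nanoTime`. This is only a sketch, not a rigorous benchmark (a proper harness would be needed for trustworthy numbers); the class name `ExtractComparison`, the method names and the iteration count are assumptions made for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExtractComparison {
    // Compiled once, so compilation cost is not paid on every call.
    private static final Pattern DIGITS = Pattern.compile("\\d+");
    private static final String PREFIX = "PowerFactor";

    // Returns the first run of digits in the input as an int.
    public static int byRegex(String input) {
        Matcher m = DIGITS.matcher(input);
        if (!m.find()) {
            throw new IllegalArgumentException("No digits in: " + input);
        }
        return Integer.parseInt(m.group());
    }

    // Relies on the fixed prefix stated in the question.
    public static int bySubstring(String input) {
        return Integer.parseInt(input.substring(PREFIX.length()));
    }

    public static void main(String[] args) {
        String extract = "PowerFactor510";
        int iterations = 1_000_000; // arbitrary count, chosen for illustration
        long sink = 0;              // accumulates results so they are actually used

        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += byRegex(extract);
        }
        long regexNanos = System.nanoTime() - start;

        start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += bySubstring(extract);
        }
        long substringNanos = System.nanoTime() - start;

        System.out.println("regex:     " + regexNanos / iterations + " ns/op");
        System.out.println("substring: " + substringNanos / iterations + " ns/op");
        System.out.println("checksum:  " + sink);
    }
}
```

Unlike `substring`, the `byRegex` variant does not depend on the prefix, so it also finds the first integer in strings that do not start with PowerFactor.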