3

I am reading from many large text files and I have to check if each snippet of text contains a double value or not. The regex code I am currently using is causing my program to run very slowly because in total I am checking 10 billion strings. I know that due to the large number of strings that I am checking, my program is bound to run slowly. But is there a more efficient and quicker way to check if a String is a double value, thus decreasing the runtime of the program? Thanks

if (string[i].matches(".*\\d.*")) {

.....
}

Also, the strings from the text file are read into an array before I check them, so no time is wasted repeatedly reading the text file.

Jonathan Leffler
M9A
  • 1
    Use `Pattern.compile("\\d").matcher(string[i]).find()` instead. Your regex is slow because `.*\\d.*` performs poorly: it has two `.*`s, each of which can try to match any length from 0 to N characters. `find` looks for the pattern starting at each position in turn. – Patashu Mar 05 '13 at 01:37
  • Unless you need the strings in memory for some other reason, you're making ambiguous gains in performance by reading the strings into memory at the cost of memory usage storing those strings. It might be worth trying to avoid reading the strings into the array before checking. – Jonathan Leffler Mar 05 '13 at 02:05

2 Answers

4

Use the Pattern and Matcher classes:

public static final Pattern DOUBLE = Pattern.compile("\\d");

...

if (DOUBLE.matcher(string[i]).find()) {
    ...
}
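
As a minimal, self-contained sketch of the same idea: the pattern is compiled once and reused, whereas `String.matches` compiles its regex again on every call. The class name `DigitCheck` and the helper methods are hypothetical names chosen for illustration; note that `\\d` only tests for the presence of a digit, not for a well-formed double.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class DigitCheck {
    // Compiled once and shared; avoids recompiling the regex per string.
    static final Pattern DIGIT = Pattern.compile("\\d");

    // Returns true if the string contains at least one digit anywhere.
    static boolean containsDigit(String s) {
        return DIGIT.matcher(s).find();
    }

    // Counts how many entries of the array contain a digit.
    static int countWithDigit(String[] strings) {
        int count = 0;
        for (String s : strings) {
            if (containsDigit(s)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String[] samples = {"abc", "a1c", "3.14", ""};
        System.out.println(countWithDigit(samples));
    }
}
```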
arshajii
0

This expression

"\\d+\\.\\d+([eE]\\d+)?"

allows 1.1 or 1.1e1 or 1.1E1 formats.

Note that Java accepts more formats, e.g. 1. or 0x1p1
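
To illustrate the difference, here is a rough sketch comparing the expression above (precompiled and applied to the whole string with `matches()`) against `Double.parseDouble`, which accepts the wider set of formats. The class name `DoubleCheck` and both method names are assumptions made for this example, not part of any library.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class DoubleCheck {
    // The expression from this answer, compiled once for reuse.
    static final Pattern DECIMAL = Pattern.compile("\\d+\\.\\d+([eE]\\d+)?");

    // True only if the entire string has the form 1.1, 1.1e1 or 1.1E1.
    static boolean matchesPattern(String s) {
        return DECIMAL.matcher(s).matches();
    }

    // True if Java itself can parse the string as a double.
    static boolean parses(String s) {
        try {
            Double.parseDouble(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (String s : new String[] {"1.1", "1.1e1", "1.", "0x1p1", "abc"}) {
            System.out.println(s + " pattern=" + matchesPattern(s) + " parse=" + parses(s));
        }
    }
}
```

The pattern is stricter than the parser, so which one fits depends on what should count as a double in the input files.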

Evgeniy Dorofeev