I have an expression that could be 13.33, 15.66-17.22, 17.33-17.66. I want to be able to figure out if it has a "-" (en-dash) in it because that will change how my code runs. I followed this thread to check for matches in an expression.

My regex to find the en-dash is "-". It works in an online tester, as can be seen here, but fails when used in Java. My code is as follows.

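import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern p = Pattern.compile("-"); // hyphen-minus (U+002D)
Matcher m = p.matcher(refVal);
boolean found = m.find(); // call find() only once; a second call looks for the next match
System.out.println(found);
if (found) {
    // Do stuff
}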

With the entry, 17.33–17.66 ref, the code prints false.

The expected use cases:

Input: 17.33-17.66 reasdfkljasdfjlkadsf
Output: m.find() should be true

Input: 17.33
Output: m.find() should be false

Input: 2-3 five blah foo
Output: m.find() should be true
intboolstring

2 Answers

The problem is that the dash in your input string is – (en-dash, U+2013; code 150 in Windows-1252), while the dash in the pattern is - (hyphen-minus, ASCII 45). Reference

ndnenkov
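
To make the difference in the answer above concrete, here is a minimal sketch that prints the code points of the two characters and runs both patterns against the sample input from the question. The class name `DashCodePoints` and the helper `hasEnDash` are hypothetical names used only for illustration.

```java
import java.util.regex.Pattern;

public class DashCodePoints {
    // Hypothetical helper: true when the text contains an en-dash (U+2013).
    static boolean hasEnDash(String text) {
        return Pattern.compile("\u2013").matcher(text).find();
    }

    public static void main(String[] args) {
        String refVal = "17.33\u201317.66 ref"; // en-dash between the numbers

        System.out.println((int) '-');      // 45   (hyphen-minus)
        System.out.println((int) '\u2013'); // 8211 (en-dash, 0x2013)

        // The hyphen-minus pattern does not match the en-dash input.
        System.out.println(Pattern.compile("-").matcher(refVal).find()); // false
        // Matching on the en-dash itself does.
        System.out.println(hasEnDash(refVal)); // true
    }
}
```

Writing the character as a `\u2013` escape rather than pasting it avoids depending on the source file's encoding.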

If you want to check for the presence of an em-dash (or any single character), you can just use String.contains; you don't need regular expressions for that.

refVal.contains("—")

To make sure you're testing for an em-dash, you can use its Unicode escape:

refVal.contains("\u2014")
Szymon
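
Since the sample input in the question turns out to contain an en-dash (U+2013) rather than an em-dash (U+2014), the same `String.contains` approach might be widened to accept any of the three look-alike characters. This is only a sketch: `RangeCheck` and `looksLikeRange` are hypothetical names, and it assumes the values never carry a leading minus sign, which would also count as a hyphen-minus.

```java
public class RangeCheck {
    // Hypothetical helper: true if the value contains a hyphen-minus,
    // an en-dash, or an em-dash anywhere in the string.
    static boolean looksLikeRange(String refVal) {
        return refVal.contains("-")        // hyphen-minus, U+002D
            || refVal.contains("\u2013")   // en-dash, U+2013
            || refVal.contains("\u2014");  // em-dash, U+2014
    }

    public static void main(String[] args) {
        // The three use cases listed in the question.
        System.out.println(looksLikeRange("17.33\u201317.66 reasdfkljasdfjlkadsf")); // true
        System.out.println(looksLikeRange("17.33"));                                 // false
        System.out.println(looksLikeRange("2-3 five blah foo"));                     // true
    }
}
```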
  • @intboolstring can you show us your new code with this working? Because I think you're doing something wrong at this point. Unless you're testing with the wrong dash, perhaps, as the other answer suggests. – Ricky Mutschlechner Dec 30 '15 at 06:27
  • Are you sure you're using http://www.fileformat.info/info/unicode/char/2014/index.htm em-dash in both your tested string and contains? – Szymon Dec 30 '15 at 06:27
  • The original string contains an *en*-dash, not em-dash. See [here](http://r12a.github.io/uniview/?charlist=%E2%80%93%E2%80%94). – Wiktor Stribiżew Dec 30 '15 at 06:32
  • Ok - didn't know the difference. – intboolstring Dec 30 '15 at 06:32