64

Why does this code output 02 in JDK8 but o2 in JDK9 or above?

"o2".replaceAll("([oO])([^[0-9-]])", "0$2")
Boann
Fuyang Liu
    A simplification of the code that still shows the behaviour : `Pattern.matches("[^[x]]", "x")` returns true with JDK8 and false with JDK9+. – Aaron Mar 01 '19 at 14:44

1 Answer

67

This is most likely due to JDK-6609854 and JDK-8189343, which reported incorrect handling of negated nested character classes (in your example, [^[0-9-]]). The behavior was fixed in Java 9 and 10, but the fix was not backported to Java 8. The Java 8 bug is described as follows:

In Java, the negation does not apply to anything appearing in nested [brackets]

So [^c] does not match "c", as you would expect.

[^[c]] does match "c". Not what I would expect.

[[^c]] does not match "c"

The same holds true for ranges or property expressions - if they're inside brackets, a negation at an outer level does not affect them.

[^a-z] is opposite from [^[a-z]]
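The cases listed above can be checked on whichever JDK is at hand. This is only a minimal sketch: the class name `NestedNegationDemo` and the helper `matches` are made up for illustration and do not come from the bug reports. The result for `[^[c]]` (and for the expression from the question) depends on the runtime, so the program prints the Java version next to the results rather than assuming one.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the name is illustrative only.
class NestedNegationDemo {

    // Returns true if the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));

        // Plain negation: "c" is rejected.
        System.out.println("[^c]   vs \"c\": " + matches("[^c]", "c"));

        // Negation around a nested class: the case affected by the bug,
        // so the printed value depends on the JDK in use.
        System.out.println("[^[c]] vs \"c\": " + matches("[^[c]]", "c"));

        // Negation inside the nested class: "c" is rejected.
        System.out.println("[[^c]] vs \"c\": " + matches("[[^c]]", "c"));

        // The expression from the question; output is JDK-dependent.
        System.out.println("o2".replaceAll("([oO])([^[0-9-]])", "0$2"));
    }
}
```

Running it once under Java 8 and once under Java 9 or later should show the second and fourth lines changing while the others stay the same.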

Karol Dowbecki
  • 6
    You can't prove the regex does not match the string at regex101, because it does not support character class union. In PCRE, `[^[0-9-]]` matches a char that is not `[`, a digit, or `-`, followed by a `]`. – Wiktor Stribiżew Mar 01 '19 at 14:27
  • 1
    @WiktorStribiżew removed, thanks. Would you suggest some other online tool that supports them? – Karol Dowbecki Mar 01 '19 at 14:28
  • 19
    In case it's not obvious -- the OP can fix this inconsistency by changing `[^[0-9-]]` to `[^0-9-]`. – ruakh Mar 01 '19 at 14:41
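A minimal sketch of the change suggested in the last comment, assuming the intent is "an o or O followed by a character that is neither a digit nor a hyphen". The class and method names are hypothetical. With the flat class `[^0-9-]` there is no nested bracket, so the negation is not subject to the nested-class behavior discussed in the answer.

```java
// Hypothetical helper class; names are illustrative only.
class FlatNegationFix {

    // Replaces o/O with 0 when the next character is not a digit or '-'.
    // The trailing '-' inside the class is a literal hyphen.
    static String replaceOBeforeNonDigit(String input) {
        return input.replaceAll("([oO])([^0-9-])", "0$2");
    }

    public static void main(String[] args) {
        // '2' is a digit, so the pattern does not match and the input is unchanged.
        System.out.println(replaceOBeforeNonDigit("o2")); // prints "o2"
        // 'a' is neither a digit nor '-', so the 'o' becomes '0'.
        System.out.println(replaceOBeforeNonDigit("oa")); // prints "0a"
        // '-' is excluded by the class, so the input is unchanged.
        System.out.println(replaceOBeforeNonDigit("O-")); // prints "O-"
    }
}
```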