I use multiple capture groups in a regex to search and replace many parts of a string. In Java on Android, I reference them as $1, $2, etc. when calling String.replaceFirst.

If the regex has more than nine groups and I reference one of them in replaceFirst, for example $10, it substitutes the first group and then prints a literal 0.

Is there any way to reference the tenth group? Is there a different syntax for it?

Here is an example of the call. In my real code I need more than nine back-references, but $10 is read as $1.

// input is the String being searched; replaceFirst is an instance method
String result = input.replaceFirst("(hello)(.*)(this)", "$1middle$2");
zeroprobe

2 Answers

TL;DR If $10 is treated as $1 followed by a literal 0, then your regex doesn't have 10 capture groups.

The $ back-references in the replacement value are documented in the javadoc for the appendReplacement method:

The replacement string may contain references to subsequences captured during the previous match: Each occurrence of ${name} or $g will be replaced by the result of evaluating the corresponding group(name) or group(g) respectively. For $g, the first number after the $ is always treated as part of the group reference. Subsequent numbers are incorporated into g if they would form a legal group reference. Only the numerals '0' through '9' are considered as potential components of the group reference. If the second group matched the string "foo", for example, then passing the replacement string "$2bar" would cause "foobar" to be appended to the string buffer. A dollar sign ($) may be included as a literal in the replacement string by preceding it with a backslash (\$).
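The two simplest cases from that javadoc text, a "$2bar" replacement and an escaped dollar sign, can be sketched as follows. The class name, method names, and sample inputs are made up for illustration:

```java
public class ReplacementSyntaxDemo {
    // "$2bar": group 2 is substituted, then the literal text "bar" follows.
    static String appendBar(String input) {
        return input.replaceFirst("(\\w+) (\\w+)", "$2bar");
    }

    // "\\$" in Java source is the two characters \$ in the replacement,
    // which yields a literal dollar sign; "$1" after it is still group 1.
    static String literalDollar(String input) {
        return input.replaceFirst("(\\d+)", "\\$$1");
    }

    public static void main(String[] args) {
        System.out.println(appendBar("x foo"));        // prints foobar
        System.out.println(literalDollar("price 42")); // prints price $42
    }
}
```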

So, let's say we have 11 groups:

System.out.println("ABCDEFGHIJKLMN".replaceFirst("(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)",
                                                 "$11$10$9$3$2$1"));

Here we capture the first 11 characters as individual groups, so e.g. group(1) returns "A" and group(11) returns "K". The input string has 14 characters, so the last 3 (LMN) are not replaced. The result is:

KJICBALMN

If we remove capture group 11 from the regex, then $11 is not a legal group reference, and will be interpreted as $1 and the literal 1:

System.out.println("ABCDEFGHIJKLMN".replaceFirst("(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)",
                                                 "$11$10$9$3$2$1"));

Prints:

A1JICBAKLMN

So, if you experience that $10 is treated as a $1 back-reference and a literal 0, then your regex doesn't have 10 groups.
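If a platform does not honor two-digit references even when the regex has enough groups (an assumption worth testing on the target runtime, not something the javadoc states), one way to sidestep the replacement syntax is to read the groups from a Matcher directly. A minimal sketch, with the hypothetical names GroupTenDemo and reorder:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupTenDemo {
    // Eleven single-character capture groups, as in the example above.
    private static final Pattern ELEVEN =
            Pattern.compile("(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)");

    // Builds the replacement from Matcher.group(int), so no "$10"-style
    // reference ever has to be parsed.
    static String reorder(String input) {
        Matcher m = ELEVEN.matcher(input);
        if (!m.find()) {
            return input; // no match: leave the input unchanged
        }
        String replacement = m.group(11) + m.group(10) + m.group(9)
                + m.group(3) + m.group(2) + m.group(1);
        // Splice the replacement over the first match only.
        return input.substring(0, m.start()) + replacement
                + input.substring(m.end());
    }

    public static void main(String[] args) {
        // groupCount() reports how many capture groups the pattern defines.
        System.out.println(ELEVEN.matcher("").groupCount()); // prints 11
        System.out.println(reorder("ABCDEFGHIJKLMN"));       // prints KJICBALMN
    }
}
```

Checking groupCount() is also a quick way to confirm that a regex really has as many groups as the replacement string expects.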

Andreas
  • Hi, I tried your example in Android Studio. The Java version Android uses must not support this: I get the output A1A0ICBALMN. – zeroprobe Nov 09 '17 at 08:33

You can also name them with (?<name>...) and then reference them with ${name}.

// input is the String being searched; replaceFirst is an instance method
String result = input.replaceFirst("(?<g1>hello)(?<g2>.*)(?<g3>this)", "${g1}middle${g2}");
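A runnable version of the same idea, as a sketch. The class name, method name, and sample input are made up for illustration. Named groups need Java 7 or later, and whether a given Android version supports them is worth checking on the target platform:

```java
public class NamedGroupDemo {
    // Same pattern as above, with the groups referenced by name in the
    // replacement instead of by number.
    static String replaceNamed(String input) {
        return input.replaceFirst("(?<g1>hello)(?<g2>.*)(?<g3>this)",
                                  "${g1}middle${g2}");
    }

    public static void main(String[] args) {
        // g1 = "hello", g2 = "-world-", g3 = "this"; the trailing "!" is
        // outside the match and is kept as-is.
        System.out.println(replaceNamed("hello-world-this!"));
        // prints hellomiddle-world-!
    }
}
```

Because names are delimited by braces, a reference like ${g1} cannot be confused with a shorter reference followed by literal digits.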
kichik