TL;DR: If you find that $10 is treated as $1 followed by a literal 0, then your regex doesn't have 10 capture groups.
The $ back-references in the replacement value are documented in the javadoc for the Matcher.appendReplacement method:
The replacement string may contain references to subsequences captured during the previous match: Each occurrence of ${name} or $g will be replaced by the result of evaluating the corresponding group(name) or group(g) respectively. For $g, the first number after the $ is always treated as part of the group reference. Subsequent numbers are incorporated into g if they would form a legal group reference. Only the numerals '0' through '9' are considered as potential components of the group reference. If the second group matched the string "foo", for example, then passing the replacement string "$2bar" would cause "foobar" to be appended to the string buffer. A dollar sign ($) may be included as a literal in the replacement string by preceding it with a backslash (\$).
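The digit rule quoted above can be sketched as a small function. This is only an illustration of the documented behavior, not the JDK source; the class name GroupRefSketch and the method resolveGroup are hypothetical, and the sketch assumes its input is a non-empty run of digits whose first digit already names an existing group.

```java
public class GroupRefSketch {
    /**
     * Returns the group number that a "$g" reference resolves to, given the
     * digits that follow the '$' and the number of capture groups in the
     * pattern. Hypothetical helper for illustration only.
     */
    public static int resolveGroup(String digitsAfterDollar, int groupCount) {
        // The first digit is always treated as part of the group reference.
        int group = digitsAfterDollar.charAt(0) - '0';
        for (int i = 1; i < digitsAfterDollar.length(); i++) {
            int candidate = group * 10 + (digitsAfterDollar.charAt(i) - '0');
            if (candidate > groupCount) {
                // Not a legal group reference: this digit and everything
                // after it stay in the output as literal text.
                break;
            }
            group = candidate;
        }
        return group;
    }

    public static void main(String[] args) {
        System.out.println(resolveGroup("11", 11)); // 11
        System.out.println(resolveGroup("11", 10)); // 1
        System.out.println(resolveGroup("10", 10)); // 10
        System.out.println(resolveGroup("10", 9));  // 1
    }
}
```

The point of the sketch is that the same replacement text resolves differently depending only on how many capture groups the pattern has.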
So, let's say we have 11 groups:
System.out.println("ABCDEFGHIJKLMN".replaceFirst("(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)",
"$11$10$9$3$2$1"));
Here we capture the first 11 characters as individual groups, so e.g. group(1) returns "A" and group(11) returns "K". The input string has 14 characters, so the last 3 (LMN) are not part of the match and are left unchanged. The result is:
KJICBALMN
If we remove capture group 11 from the regex, then $11 is not a legal group reference, and will be interpreted as $1 followed by the literal 1:
System.out.println("ABCDEFGHIJKLMN".replaceFirst("(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)",
"$11$10$9$3$2$1"));
Prints:
A1JICBAKLMN
So, if you find that $10 is treated as a $1 back-reference followed by a literal 0, then your regex doesn't have 10 groups.
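One way to check this is to ask the matcher how many capture groups the pattern really has. The sketch below uses Matcher.groupCount() for that; the class name GroupCountCheck and the helper countGroups are made up for this example. It also shows the opposite case: if the regex does have 10 or more groups and you actually want group 1 followed by a literal 0, a backslash before the digit keeps it out of the group reference.

```java
import java.util.regex.Pattern;

public class GroupCountCheck {
    /** Number of capture groups in the regex (group 0 is not counted). */
    public static int countGroups(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        String ten = "(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)";
        String eleven = ten + "(.)";

        System.out.println(countGroups(ten));    // 10
        System.out.println(countGroups(eleven)); // 11

        // Non-capturing groups such as (?:...) are not counted, which is an
        // easy way to end up with fewer groups than expected.
        System.out.println(countGroups("(?:a)(b)")); // 1

        // With 11 groups, "$10" would mean group 10. Escaping the 0 with a
        // backslash forces it to be literal, so this is group 1 plus "0".
        System.out.println("ABCDEFGHIJKLMN".replaceFirst(eleven, "$1\\0")); // A0LMN
    }
}
```

Named groups with ${name} are another way to avoid the ambiguity, since a name cannot absorb trailing digits by accident.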