I am using the String.replaceFirst() method to replace the first instance of <text> with another string. I use indexOf to find both angle brackets and then call replaceFirst. It works perfectly if text is replaced with any string that ends in an alphanumeric character, but it fails to replace when I use something like <some string$>. For reference, the method is

public static String substituteWord(String original, String word) {
    int index1 = original.indexOf("<");
    int index2 = original.indexOf(">");
    storyLine = original.replaceFirst(original.substring(index1, index2 + 1), word);
    return original;
}

The code doesn't look broken, so why does a dollar sign make this method fail?

PyCoder
  • The first argument to `replaceFirst` is meant to be a regular expression. If you just use unquoted raw user input, then that's bound to lead to problems since regex syntax will be interpreted, when you don't expect it. Either use `replace` instead of `replaceFirst` (if its behaviour is acceptable) or `Pattern.quote` on the first argument of `replaceFirst`. – Joachim Sauer Feb 09 '21 at 22:51
  • You probably want to `return storyLine;` – Robert Feb 09 '21 at 22:55
  • You might be interested in these examples: https://www.javamex.com/tutorials/regular_expressions/search_replace.shtml – Neil Coffey Feb 09 '21 at 23:00

1 Answer

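The first argument to replaceFirst() and replaceAll() is a regular expression, not a plain string. In a regular expression, a dollar sign is an end-of-line anchor rather than a literal character, so a pattern such as <some string$> never matches the literal text. A dollar sign in the replacement string is also special: $x means 'group x that was matched against (captured by) the regular expression'.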
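A small sketch can make both failure modes visible. The class and method names here (DollarDemo, rawPlaceholderMatches, replacementRejected) and the sample strings are made up for illustration and are not part of the question's code. The first method checks whether a placeholder containing a dollar sign matches when used as an unquoted regex; the second checks whether replaceFirst rejects a replacement that ends in a dollar sign.

```java
import java.util.regex.Pattern;

class DollarDemo {
    // True if the placeholder, used as an unquoted regex, is found in the input.
    static boolean rawPlaceholderMatches(String input, String placeholder) {
        return Pattern.compile(placeholder).matcher(input).find();
    }

    // True if replaceFirst throws because of the replacement string.
    static boolean replacementRejected(String input, String regex, String replacement) {
        try {
            input.replaceFirst(regex, replacement);
            return false;
        } catch (IllegalArgumentException | IndexOutOfBoundsException e) {
            // A '$' with no valid group number after it is an invalid group reference.
            return true;
        }
    }

    public static void main(String[] args) {
        // '$' acts as an anchor here, so the literal text "<amount$>" is not found.
        System.out.println(rawPlaceholderMatches("Pay <amount$> now", "<amount$>"));
        // The trailing '$' in the replacement is read as the start of a group reference.
        System.out.println(replacementRejected("Pay <amount> now", "<amount>", "100$"));
    }
}
```

In the first case the call simply returns the input unchanged, which is why it can look as if nothing happened.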

So the solution is to wrap the first argument in Pattern.quote() and the second argument in Matcher.quoteReplacement() to avoid this special behaviour:

// Needs: import java.util.regex.Matcher; import java.util.regex.Pattern;
String strToReplace = original.substring(index1, index2 + 1);
storyLine = original.replaceFirst(Pattern.quote(strToReplace), Matcher.quoteReplacement(word));
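Put together, the whole method might look like the sketch below. It makes a few assumptions that are not in the question: the result is returned directly instead of being stored in storyLine (as the comment above suggests), the closing bracket is searched for only after the opening one, and the input is returned unchanged when no placeholder is found. The class name Substituter is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Substituter {
    // Replaces the first <...> placeholder in 'original' with 'word', treating both as literal text.
    static String substituteWord(String original, String word) {
        int start = original.indexOf('<');
        int end = original.indexOf('>', start + 1);
        if (start < 0 || end < 0) {
            return original; // assumption: leave the text alone if there is no placeholder
        }
        String placeholder = original.substring(start, end + 1);
        // Pattern.quote makes the placeholder a literal pattern;
        // Matcher.quoteReplacement escapes '$' and '\' in the replacement.
        return original.replaceFirst(Pattern.quote(placeholder), Matcher.quoteReplacement(word));
    }

    public static void main(String[] args) {
        System.out.println(substituteWord("Pay <amount$> now", "100$")); // prints: Pay 100$ now
    }
}
```

Since both arguments end up being literal here, replacing the single occurrence with substring concatenation would also work, but the quoted replaceFirst call stays closest to the original code.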

To see when the special behaviour of the dollar sign is useful, consider this example:

str = str.replaceAll("<b>([^<]*)</b>", "<i>$1</i>");

This would take a piece of bold text from some HTML and replace it with 'whatever was inside the bold tags, but in italics instead'. The parentheses () in the regular expression mean 'capture this substring' as group 1, and then the $1 means 'replace with whatever was captured as group 1'.
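As a quick check of the difference, the sketch below runs the same regular expression twice: once with $1 acting as a group reference, and once with the replacement passed through Matcher.quoteReplacement() so that $1 is inserted literally. The class name, method names and sample input are hypothetical.

```java
import java.util.regex.Matcher;

class BoldToItalic {
    // '$1' is a group reference: it inserts the text captured by the parentheses.
    static String withGroupReference(String html) {
        return html.replaceAll("<b>([^<]*)</b>", "<i>$1</i>");
    }

    // After quoteReplacement, '$1' is just the two characters '$' and '1'.
    static String withLiteralReplacement(String html) {
        return html.replaceAll("<b>([^<]*)</b>", Matcher.quoteReplacement("<i>$1</i>"));
    }

    public static void main(String[] args) {
        System.out.println(withGroupReference("A <b>bold</b> word"));     // prints: A <i>bold</i> word
        System.out.println(withLiteralReplacement("A <b>bold</b> word")); // prints: A <i>$1</i> word
    }
}
```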

Neil Coffey