Custom function on Regex replace in Java

Question

I need to replace all words in a text by applying a specific replacement method Modify(). I have the following code snippet in C#:

Regex regex = new Regex("[A-Za-z][a-z]*");
regex.Replace(text, x => Modify(x.Value));

Modify() is some function that is executed on each match to produce its replacement; for example, it could replace every character in a word with the next character in the alphabet. Given this input text:

Magic banana is eating the apple.

This could be the output:

Nbhjd cbobob jt fbujoh uif bqqmf.

The purpose of the Modify() function is irrelevant here. I am wondering about the Java implementation of the MatchEvaluator. The code is fairly simple in C#, but how would this be achieved in Java?

It is the match for the word. The lambda expression is actually a Match Evaluator (http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.matchevaluator(v=vs.110).aspx). — Igor Ševo, Dec 22 '13 at 18:57
You can use something like `public static String modify(String text){return text.replaceAll("[A-Za-z][a-z]*", "");}` — Raghav, Dec 22 '13 at 18:59
No, I can't. That's not what the lambda operator does here (=>). — Igor Ševo, Dec 22 '13 at 19:12
1) Your regex is flakey. In a global replace, it equals this `[a-zA-Z]+` and 2) What do you do with [zZ], roll it around to [aA] ? — , Dec 22 '13 at 20:11
If this were Perl, you might be able to do this with transliterate. — , Dec 22 '13 at 20:16
1) The regex is fine. It finds all words starting with a small or capital letter. The rest of the letters may not be capital. 2) The purpose of the Modify function was strictly an example. I needed the solution to easily rewrite the code from C# to Java. — Igor Ševo, Dec 23 '13 at 09:37
@Igor Sevo - The assumption for `1)` is not right. In a global replace, `[A-Za-z][a-z]*` will match **all** capital letters wherever they are, beginning/middle/end! Same for lower case. For example, it matches `ABCDE`, 1 upper case each on 5 consecutive matches. — , Dec 24 '13 at 17:01
@Igor Sevo - It's still not correct, and is actually the same. Virtually it's just `[A-Za-z]` and in a global replace, will match every single upper/lower case letter. The only way is to use a word boundary: `\b[A-Za-z][a-z]*` — , Dec 25 '13 at 17:53
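
The disagreement in the comments above is easy to check by listing what the pattern actually matches. The following is only a small sketch; the class name `WordPatternMatchesSketch`, the helper `matches` and the sample input `fooBar` are made up for illustration and do not come from the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordPatternMatchesSketch {

    // Hypothetical helper: collects every successive match of regex in input.
    static List<String> matches(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String word = "[A-Za-z][a-z]*";
        System.out.println(matches(word, "Magic banana"));    // [Magic, banana]
        System.out.println(matches(word, "ABCDE"));           // [A, B, C, D, E]
        System.out.println(matches(word, "fooBar"));          // [foo, Bar]
        System.out.println(matches("\\b" + word, "fooBar"));  // [foo]
    }
}
```

So the pattern does match a run of capitals one letter at a time, but it still consumes ordinary words such as "Magic" whole. The `\b` variant differs only for input like `fooBar`, where a capital appears in the middle of a word.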

mrzli · Accepted Answer · 2013-12-22T19:19:24.610

How about something along these lines:

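import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {

    public static void main(String[] args) {
        String text = "Magic banana is eating the apple.";
        System.out.println("Old text: " + text);
        System.out.println("New text: " + getEditedText(text));
    }

    private static String getEditedText(String text) {
        // appendReplacement only accepts a StringBuffer before Java 9.
        StringBuffer result = new StringBuffer();
        Pattern pattern = Pattern.compile("[A-Za-z][a-z]*");
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            // quoteReplacement keeps any '$' or '\' in the computed
            // replacement from being read as a group reference or escape.
            matcher.appendReplacement(result,
                    Matcher.quoteReplacement(getReplacement(matcher)));
        }
        // Copy whatever follows the last match.
        matcher.appendTail(result);
        return result.toString();
    }

    // Plays the role of Modify(): shifts every character of the match by one.
    private static String getReplacement(Matcher matcher) {
        String word = matcher.group(0);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            sb.append((char)(c + 1));
        }
        return sb.toString();
    }
}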

This is a slightly edited example of the code that can be found at the bottom of this page.

This is the output you would get:

Old text: Magic banana is eating the apple.
New text: Nbhjd cbobob jt fbujoh uif bqqmf.
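
On Java 9 or later, `Matcher.replaceAll` also has an overload that takes a `Function<MatchResult, String>`, which is probably the closest equivalent to the C# `MatchEvaluator`. The following is a minimal sketch under that assumption; the class name `ReplaceWithFunctionSketch` and the methods `modify` and `replaceWords` are hypothetical names used only for illustration.

```java
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceWithFunctionSketch {

    private static final Pattern WORD = Pattern.compile("[A-Za-z][a-z]*");

    // Hypothetical stand-in for Modify(): shifts every character by one.
    static String modify(String word) {
        StringBuilder sb = new StringBuilder(word.length());
        for (int i = 0; i < word.length(); i++) {
            sb.append((char) (word.charAt(i) + 1));
        }
        return sb.toString();
    }

    // Applies modifier to each match. The result is quoted because the
    // string returned to replaceAll is still scanned for '$' and '\'.
    static String replaceWords(String text, Function<String, String> modifier) {
        return WORD.matcher(text)
                .replaceAll(match -> Matcher.quoteReplacement(modifier.apply(match.group())));
    }

    public static void main(String[] args) {
        String text = "Magic banana is eating the apple.";
        // Prints: Nbhjd cbobob jt fbujoh uif bqqmf.
        System.out.println(replaceWords(text, ReplaceWithFunctionSketch::modify));
    }
}
```

Passing the modifier in as a `Function<String, String>` keeps the call site close to the C# one-liner, for example `replaceWords(text, w -> w.toUpperCase())`.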
