
I need to replace all words in a text by applying a specific replacement method Modify(). I have the following code snippet in C#:

Regex regex = new Regex("[A-Za-z][a-z]*");
string result = regex.Replace(text, x => Modify(x.Value));

Modify() is a function that is executed on each match to transform it; for example, it could replace every character in a word with the next character in the alphabet. Given this input text:

Magic banana is eating the apple.

This could be the output:

Nbhjd cbobob jt fbujoh uif bqqmf.

The purpose of the Modify() function is irrelevant here. I am wondering about the Java implementation of the MatchEvaluator. The code is fairly simple in C#, but how would this be achieved in Java?

Igor Ševo
  • What does `x` refer to here? – hwnd Dec 22 '13 at 18:55
  • It is the match for the word. The lambda expression is actually a Match Evaluator (http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.matchevaluator(v=vs.110).aspx). – Igor Ševo Dec 22 '13 at 18:57
  • You can use something like `public static String modify(String text){return text.replaceAll("[A-Za-z][a-z]*", "");}` – Raghav Dec 22 '13 at 18:59
  • No, I can't. That's not what the lambda operator does here (=>). – Igor Ševo Dec 22 '13 at 19:12
  • 1) Your regex is flaky. In a global replace, it equals this `[a-zA-Z]+` and 2) What do you do with [zZ], roll it around to [aA]? –  Dec 22 '13 at 20:11
  • If this were Perl, you might be able to do this with transliterate. –  Dec 22 '13 at 20:16
  • 1) The regex is fine. It finds all words starting with a lowercase or capital letter. The rest of the letters may not be capital. 2) The purpose of the Modify function was strictly an example. I needed the solution to easily rewrite the code from C# to Java. – Igor Ševo Dec 23 '13 at 09:37
  • @Igor Sevo - The assumption for `1)` is not right. In a global replace, `[A-Za-z][a-z]*` will match **all** capital letters wherever they are, beginning/middle/end! Same for lower case. For example, it matches `ABCDE`, 1 upper case each on 5 consecutive matches. –  Dec 24 '13 at 17:01
  • @Igor Sevo - It's still not correct, and is actually the same. Virtually it's just `[A-Za-z]` and in a global replace, will match every single upper/lower case letter. The only way is to use a word boundary: `\b[A-Za-z][a-z]*` –  Dec 25 '13 at 17:53
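
The regex behaviour debated in the comments above can be checked with a small sketch. The class name `RegexMatchDemo` and the helper `findAll` are hypothetical names introduced only for this illustration; the patterns and the `ABCDE` input are the ones quoted in the comments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatchDemo {
    // Hypothetical helper: collects every match of the regex in the input, in order.
    public static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Without a word boundary, each capital letter starts a new match.
        System.out.println(findAll("[A-Za-z][a-z]*", "ABCDE"));       // [A, B, C, D, E]
        // With \b, only the letter at the start of the word can begin a match.
        System.out.println(findAll("\\b[A-Za-z][a-z]*", "ABCDE"));    // [A]
        // On ordinary capitalized or lowercase words, the original pattern matches whole words.
        System.out.println(findAll("[A-Za-z][a-z]*", "Magic banana")); // [Magic, banana]
    }
}
```

For the sample sentence in the question both patterns produce the same matches; they only differ on input with capitals in the middle of a word.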

1 Answer


How about something along these lines:

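import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The class name is arbitrary; it only makes the snippet compilable.
public class ReplaceExample {
    public static void main(String[] args) {
        String text = "Magic banana is eating the apple.";
        System.out.println("Old text: " + text);
        System.out.println("New text: " + getEditedText(text));
    }

    private static String getEditedText(String text) {
        StringBuffer result = new StringBuffer();
        Pattern pattern = Pattern.compile("[A-Za-z][a-z]*");
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            // Quote the replacement so any '$' or '\' it contains is taken literally.
            matcher.appendReplacement(result, Matcher.quoteReplacement(getReplacement(matcher)));
        }
        // Copy whatever follows the last match.
        matcher.appendTail(result);
        return result.toString();
    }

    // Plays the role of Modify(): shifts every character of the matched word by one.
    private static String getReplacement(Matcher matcher) {
        String word = matcher.group(0);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            sb.append((char)(c + 1));
        }
        return sb.toString();
    }
}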

This is a slightly edited example of the code that can be found at the bottom of this page.

This is the output you would get:

Old text: Magic banana is eating the apple.
New text: Nbhjd cbobob jt fbujoh uif bqqmf.
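
As a possible alternative, assuming Java 9 or later is available: `Matcher.replaceAll` has an overload that takes a `Function<MatchResult, String>`, which is closer in shape to the C# lambda. The sketch below is only an illustration; the class name `MatchEvaluatorDemo` and the methods `modify` and `replaceWords` are hypothetical, and `modify` is just a stand-in for the question's Modify().

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchEvaluatorDemo {
    private static final Pattern WORD = Pattern.compile("[A-Za-z][a-z]*");

    // Hypothetical stand-in for Modify(): shifts each character of the word by one.
    public static String modify(String word) {
        StringBuilder sb = new StringBuilder(word.length());
        for (int i = 0; i < word.length(); i++) {
            sb.append((char) (word.charAt(i) + 1));
        }
        return sb.toString();
    }

    // Applies modify() to every match; requires Java 9+ for this replaceAll overload.
    public static String replaceWords(String text) {
        // The function's result is still interpreted as a replacement string,
        // so quoteReplacement is used to keep '$' and '\' literal.
        return WORD.matcher(text)
                   .replaceAll(mr -> Matcher.quoteReplacement(modify(mr.group())));
    }

    public static void main(String[] args) {
        System.out.println(replaceWords("Magic banana is eating the apple."));
        // prints: Nbhjd cbobob jt fbujoh uif bqqmf.
    }
}
```

On older Java versions this overload does not exist, so the `appendReplacement`/`appendTail` loop is the way to go there.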
mrzli