No need to use regex for this, and you certainly shouldn't use split("\\s+")
, since you'd lose consecutive spaces, and the type of whitespace characters, i.e. the spaces of the result could be incorrect.
You also shouldn't use charAt()
or anything like it, since that would not support letters from the Unicode Supplemental Planes, i.e. Unicode characters that are stored in Java strings as surrogate pairs.
Basic logic:
- Locate start of word, i.e. start of string or first character following whitespace.
- Locate end of word, i.e. last character preceding whitespace or end of string.
- Iterating from beginning and end in parallel:
- Skip characters that are not letters.
- Swap the letters.
As Java code, with full Unicode support:
public static String reverseLettersOfWords(String input) {
int[] codePoints = input.codePoints().toArray();
for (int i = 0, start = 0; i <= codePoints.length; i++) {
if (i == codePoints.length || Character.isWhitespace(codePoints[i])) {
for (int end = i - 1; ; start++, end--) {
while (start < end && ! Character.isLetter(codePoints[start]))
start++;
while (start < end && ! Character.isLetter(codePoints[end]))
end--;
if (start >= end)
break;
int tmp = codePoints[start];
codePoints[start] = codePoints[end];
codePoints[end] = tmp;
}
start = i + 1;
}
}
return new String(codePoints, 0, codePoints.length);
}
Test
System.out.println(reverseLettersOfWords("Hello"));
System.out.println(reverseLettersOfWords("Hello!"));
System.out.println(reverseLettersOfWords("I'm saying Hello!"));
System.out.println(reverseLettersOfWords("I haven't said hello yet, but I will."));
System.out.println(reverseLettersOfWords("Works with surrogate pairs: + "));
Output
olleH
olleH!
m'I gniyas olleH!
I tneva'h dias olleh tey, tub I lliw.
skroW htiw etagorrus sriap: +
Note that the special letters at the end are the first 4 shown here in column "Script (or Calligraphy)", "Bold", e.g. the is Unicode Character 'MATHEMATICAL BOLD SCRIPT CAPITAL A' (U+1D4D0), which in Java is two characters "\uD835\uDCD0"
.
UPDATE
The above implementation is optimized for reversing the letters of the word. To apply an arbitrary operation to mangle the letters of the word, use the following implementation:
public static String mangleLettersOfWords(String input) {
int[] codePoints = input.codePoints().toArray();
for (int i = 0, start = 0; i <= codePoints.length; i++) {
if (i == codePoints.length || Character.isWhitespace(codePoints[i])) {
int wordCodePointLen = 0;
for (int j = start; j < i; j++)
if (Character.isLetter(codePoints[j]))
wordCodePointLen++;
if (wordCodePointLen != 0) {
int[] wordCodePoints = new int[wordCodePointLen];
for (int j = start, k = 0; j < i; j++)
if (Character.isLetter(codePoints[j]))
wordCodePoints[k++] = codePoints[j];
int[] mangledCodePoints = mangleWord(wordCodePoints.clone());
if (mangledCodePoints.length != wordCodePointLen)
throw new IllegalStateException("Mangled word is wrong length: '" + new String(wordCodePoints, 0, wordCodePoints.length) + "' (" + wordCodePointLen + " code points)" +
" vs mangled '" + new String(mangledCodePoints, 0, mangledCodePoints.length) + "' (" + mangledCodePoints.length + " code points)");
for (int j = start, k = 0; j < i; j++)
if (Character.isLetter(codePoints[j]))
codePoints[j] = mangledCodePoints[k++];
}
start = i + 1;
}
}
return new String(codePoints, 0, codePoints.length);
}
private static int[] mangleWord(int[] codePoints) {
return mangleWord(new String(codePoints, 0, codePoints.length)).codePoints().toArray();
}
private static CharSequence mangleWord(String word) {
return new StringBuilder(word).reverse();
}
You can of course replace the hardcoded call to the either mangleWord
method with a call to a passed-in Function<int[], int[]>
or Function<String, ? extends CharSequence>
parameter, if needed.
The result with that implementation of the mangleWord
method(s) is the same as the original implementation, but you can now easily implement a different mangling algorithm.
E.g. to randomize the letters, simply shuffle the codePoints
array:
private static int[] mangleWord(int[] codePoints) {
Random rnd = new Random();
for (int i = codePoints.length - 1; i > 0; i--) {
int j = rnd.nextInt(i + 1);
int tmp = codePoints[j];
codePoints[j] = codePoints[i];
codePoints[i] = tmp;
}
return codePoints;
}
Sample Output
Hlelo
Hlleo!
m'I nsayig oHlel!
I athen'v siad eohll yte, btu I illw.
srWok twih rueoatrsg rpasi: +