In my benchmark, this code is about 3 times faster than the standard String.toUpperCase() method:
    public static String toUpperString(String pString) {
        if (pString != null) {
            char[] retChar = pString.toCharArray();
            for (int idx = 0; idx < pString.length(); idx++) {
                char c = retChar[idx];
                if (c >= 'a' && c <= 'z') {
                    // -33 is ~0x20: clearing bit 5 maps 'a'..'z' to 'A'..'Z'
                    retChar[idx] = (char) (c & -33);
                }
            }
            return new String(retChar);
        } else {
            return null;
        }
    }
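For readers unfamiliar with the `c & -33` expression, here is a small sketch of why it works. The class and method names (`AsciiCaseMask`, `clearCaseBit`) are made up for this illustration and are not part of the code above. In ASCII, a lowercase letter differs from its uppercase form only in bit 5 (value 32), and -33 is the two's-complement form of `~0x20`, so the AND clears exactly that bit.

```java
final class AsciiCaseMask {
    // -33 == ~0x20: every bit set except bit 5 (value 32).
    static final int CLEAR_BIT_5 = -33;

    // Clears bit 5 of c. For 'a'..'z' this yields 'A'..'Z'.
    static char clearCaseBit(char c) {
        return (char) (c & CLEAR_BIT_5);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toBinaryString('a')); // 1100001
        System.out.println(Integer.toBinaryString('A')); // 1000001
        System.out.println(clearCaseBit('q'));           // Q
        // Without the 'a'..'z' range check, other characters are corrupted:
        // '{' (0x7B) turns into '[' (0x5B).
        System.out.println(clearCaseBit('{'));           // [
    }
}
```

This is why the range check on `c` has to come before the mask is applied.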
Why is it so much faster? What additional work does String.toUpperCase() do? In other words, are there cases in which this code will not work?
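To make the question concrete, here is a minimal sketch of three inputs where an ASCII-only loop and String.toUpperCase() give different results. The class name `UpperCaseDifferences` and the helper `asciiToUpper` are hypothetical; the helper just repeats the logic of toUpperString. The cases are a non-ASCII letter, a one-to-many mapping (the German sharp s), and a locale-specific rule (the Turkish dotted capital I). This is not meant as a complete list of what the library method handles.

```java
import java.util.Locale;

final class UpperCaseDifferences {
    // ASCII-only conversion, same logic as toUpperString.
    static String asciiToUpper(String s) {
        if (s == null) {
            return null;
        }
        char[] chars = s.toCharArray();
        for (int i = 0; i < chars.length; i++) {
            char c = chars[i];
            if (c >= 'a' && c <= 'z') {
                chars[i] = (char) (c & -33);
            }
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        // 1. Non-ASCII letter: U+00E9 (e with acute) is left untouched by the loop.
        System.out.println(asciiToUpper("caf\u00e9").equals("CAF\u00e9"));               // true
        System.out.println("caf\u00e9".toUpperCase(Locale.ROOT).equals("CAF\u00c9"));   // true

        // 2. One-to-many mapping: U+00DF (sharp s) becomes "SS", so the length changes.
        System.out.println(asciiToUpper("stra\u00dfe").equals("STRA\u00dfE"));           // true
        System.out.println("stra\u00dfe".toUpperCase(Locale.ROOT).equals("STRASSE"));   // true

        // 3. Locale rule: in Turkish, 'i' uppercases to U+0130 (I with dot above).
        Locale turkish = Locale.forLanguageTag("tr");
        System.out.println(asciiToUpper("title").equals("TITLE"));                       // true
        System.out.println("title".toUpperCase(turkish).equals("T\u0130TLE"));           // true
    }
}
```

Note that String.toUpperCase() without arguments uses the JVM's default locale, so the third case depends on where the code runs.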
Benchmark results for a random long string (plain text) executed 2,000,000 times:
toUpperString(String) : 3514.339 ms - about 3.5 seconds
String.toUpperCase() : 9705.397 ms - almost 10 seconds
UPDATE:
I have added a check for characters outside the basic Latin range (c >= 192, which falls back to Character.toUpperCase) and used the following benchmark, for those who don't believe me:
    import java.math.BigInteger;
    import java.security.SecureRandom;

    public class BenchmarkUpperCase {
        public static String[] randomStrings;

        // Returns a random 500-bit number rendered in base 32.
        public static String nextRandomString() {
            SecureRandom random = new SecureRandom();
            return new BigInteger(500, random).toString(32);
        }

        public static String customToUpperString(String pString) {
            if (pString != null) {
                char[] retChar = pString.toCharArray();
                for (int idx = 0; idx < pString.length(); idx++) {
                    char c = retChar[idx];
                    if (c >= 'a' && c <= 'z') {
                        // -33 is ~0x20: clearing bit 5 maps 'a'..'z' to 'A'..'Z'
                        retChar[idx] = (char) (c & -33);
                    } else if (c >= 192) { // now catering for other than latin...
                        retChar[idx] = Character.toUpperCase(c);
                    }
                }
                return new String(retChar);
            } else {
                return null;
            }
        }

        public static void main(String... args) {
            long timerStart, timePeriod = 0;
            randomStrings = new String[1000];
            for (int idx = 0; idx < 1000; idx++) {
                randomStrings[idx] = nextRandomString();
            }
            String dummy = null;

            // Five timed runs of the library method; each run prints elapsed milliseconds.
            for (int count = 1; count <= 5; count++) {
                timerStart = System.nanoTime();
                for (int idx = 0; idx < 20000000; idx++) {
                    dummy = randomStrings[idx % 1000].toUpperCase();
                }
                timePeriod = System.nanoTime() - timerStart;
                System.out.println(count + " String.toUpper() : " + (timePeriod / 1000000));
            }

            // Five timed runs of the custom method over the same inputs.
            for (int count = 1; count <= 5; count++) {
                timerStart = System.nanoTime();
                for (int idx = 0; idx < 20000000; idx++) {
                    dummy = customToUpperString(randomStrings[idx % 1000]);
                }
                timePeriod = System.nanoTime() - timerStart;
                System.out.println(count + " customToUpperString() : " + (timePeriod / 1000000));
            }
        }
    }
I get these results (run number, method, elapsed time in milliseconds):
1 String.toUpper() : 10724
2 String.toUpper() : 10551
3 String.toUpper() : 10551
4 String.toUpper() : 10660
5 String.toUpper() : 10575
1 customToUpperString() : 6687
2 customToUpperString() : 6684
3 customToUpperString() : 6686
4 customToUpperString() : 6693
5 customToUpperString() : 6710
So the custom version is still roughly 1.6 times as fast (about 37% less time per run).
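One caveat about the benchmark input is worth checking, sketched below under my own assumptions (the class `BenchmarkInputCheck` and its helpers are hypothetical, not part of the benchmark). BigInteger.toString(32) only produces the digits 0-9 and the letters a-v, so the `c >= 192` branch should never run on these strings and the timings only cover the ASCII path. The sketch also shows that the threshold of 192 skips at least one character that does have an uppercase form, the micro sign U+00B5.

```java
import java.math.BigInteger;
import java.security.SecureRandom;

final class BenchmarkInputCheck {
    // Same generator as nextRandomString() in the benchmark, with a shared SecureRandom.
    static String nextRandomString(SecureRandom random) {
        return new BigInteger(500, random).toString(32);
    }

    // True if every character is one of the base-32 digits 0-9 or a-v.
    static boolean isBase32Only(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean digit = c >= '0' && c <= '9';
            boolean letter = c >= 'a' && c <= 'v';
            if (!digit && !letter) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SecureRandom random = new SecureRandom();
        boolean allBase32 = true;
        for (int i = 0; i < 1000; i++) {
            allBase32 &= isBase32Only(nextRandomString(random));
        }
        System.out.println("only 0-9 and a-v: " + allBase32); // only 0-9 and a-v: true

        // U+00B5 (micro sign, code 181) is below 192 but uppercases to U+039C.
        char micro = '\u00b5';
        System.out.println(micro < 192);                            // true
        System.out.println(Character.toUpperCase(micro) == '\u039c'); // true
    }
}
```

A benchmark that mixes in non-ASCII text would be needed to see how the Character.toUpperCase fallback affects the comparison.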