The difference can be demonstrated by the following code:
StringBuilder sb = new StringBuilder();
sb.appendCodePoint(0x12345); // a supplementary code point outside the BMP, stored as a surrogate pair in UTF-16
String s = sb.toString();
System.out.println(s.length()); // Prints 2
System.out.println(s.codePoints().count()); // Prints 1
If your string can possibly contain Unicode code points greater than 0xFFFF, then use s.codePoints().count()
for a correct[*] result.
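Alternatively, s.codePointCount(0, s.length()) gives the same count without creating a stream; a minimal sketch using the s built above:
int codePoints = s.codePointCount(0, s.length()); // each surrogate pair counts as a single code point
System.out.println(codePoints); // Prints 1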
If your string only contains Unicode code points in the Basic Multilingual Plane (i.e. characters between '\u0000' and '\uFFFF' only, which are the ones you are most likely to use unless you need hieroglyphics or similar), then use s.length() instead, as it performs better (lower CPU and memory usage).
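If you are not sure which case applies, you can check whether a string contains any supplementary code points by comparing the two counts; a small sketch (the helper name isBmpOnly is just for illustration):
static boolean isBmpOnly(String s) {
    // length() counts UTF-16 code units, codePointCount counts code points;
    // they are equal exactly when the string contains no surrogate pairs.
    return s.length() == s.codePointCount(0, s.length());
}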
Footnote:
[*] By "correct", I mean a count of what a non-technical human user might consider a "character" rather than what length()
returns, which is the total number of 16-bit Java characters used to represent the Unicode characters in this string using the UTF-16 encoding - which is a technical measure of length that an ordinary user probably isn't concerned with.