I need to sort an array of strings, where elements are compared lexicographically as sequences of code point values, so that, for example, "Z" < "a" < "\udabc" < "�" < "".
- Is there a more efficient way of comparing strings, other than manually iterating over both of them and comparing the corresponding code points?
- What if it is guaranteed that the strings don't contain any lone (unpaired) surrogate code points (but may contain surrogate pairs, so "�" < "" should still hold)? Is there a more efficient procedure for this special case?
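For reference, the manual iteration mentioned in the first bullet could look roughly like the sketch below. It is only a baseline, not a claim about the best approach; `compareCodePoints` is a made-up helper name, and the sample characters in the usage lines (U+FFFD and U+10000) are assumptions chosen for illustration.

```javascript
// Baseline sketch: compare two strings as sequences of code points.
// compareCodePoints is a hypothetical helper name, not a built-in.
function compareCodePoints(a, b) {
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    // codePointAt returns the full code point for a valid surrogate pair,
    // and the surrogate's own value when it is unpaired.
    const x = a.codePointAt(i);
    const y = b.codePointAt(j);
    if (x !== y) return x < y ? -1 : 1;
    // Advance by two code units for supplementary code points, else one.
    i += x > 0xffff ? 2 : 1;
    j += y > 0xffff ? 2 : 1;
  }
  // One string is a prefix of the other: the one with code units left is greater.
  return (a.length - i) - (b.length - j);
}

// Usage with assumed sample characters: a lone surrogate, a BMP character
// above the surrogate range, and a supplementary character.
const sorted = ["a", "\u{10000}", "Z", "\uFFFD", "\uDABC"].sort(compareCodePoints);
```

With the default comparison operators, "\u{10000}" would sort before "\uDABC" and "\uFFFD", because its first UTF-16 code unit is 0xD800; the comparator above avoids that by comparing whole code points.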
Note: there are many answers on StackOverflow explaining how to sort strings, but they use either the localeCompare order or the order defined by the JavaScript comparison operators (which compare strings as sequences of UTF-16 code units). I am not interested in either of those.