What is the theory behind Unicode sorting? I understand how it works, but I don't understand why this standard was chosen for collation.
Suppose you have two strings to compare, using ucol_strcollIter() for example:
ucol_strcollIter(collator, &stringIter1, &stringIter2, &Status)
Then, say the two strings are:
string string1 = "hello"
string string2 = "héllo"
Under the "Secondary" collation strength, string1 should be ordered before string2, with the two strings first differing at the secondary level:
<1 hello
<2 héllo
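To make clear what I mean by "differing at the secondary level", here is a minimal sketch in plain C of a two-level comparison: all primary weights are compared across the whole string first, and secondary weights are consulted only if the primaries tie. It does not use ICU. The `Weights` struct, `weights_for()` and `compare_levels()` are hypothetical names, and the weight values are made up for illustration, not real collation table values. The sketch also gives a space no special treatment, so it is not meant to reproduce the trailing-space behaviour described next.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-level weights; NOT real DUCET or ICU values. */
typedef struct {
    uint32_t primary;
    uint32_t secondary;
} Weights;

/* Toy mapping (assumption): every code point is its own primary weight,
 * except U+00E9 (é), which shares the primary of 'e' and carries a
 * heavier secondary weight to represent the accent. */
static Weights weights_for(uint32_t cp) {
    Weights w = { cp, 1 };
    if (cp == 0x00E9) {
        w.primary = 'e';
        w.secondary = 2;
    }
    return w;
}

/* Weight of a code point at the given level; 0 marks end of string. */
static uint32_t weight_at(uint32_t cp, int level) {
    if (cp == 0) {
        return 0;
    }
    Weights w = weights_for(cp);
    return level == 1 ? w.primary : w.secondary;
}

/* Compares two 0-terminated code point arrays level by level.
 * Returns the level (1 or 2) at which they first differ, or 0 if they
 * are equal at both levels. *order is set to -1, 0 or 1. */
static int compare_levels(const uint32_t *a, const uint32_t *b, int *order) {
    for (int level = 1; level <= 2; level++) {
        for (size_t i = 0;; i++) {
            uint32_t wa = weight_at(a[i], level);
            uint32_t wb = weight_at(b[i], level);
            if (wa != wb) {
                *order = wa < wb ? -1 : 1;
                return level;
            }
            if (a[i] == 0 && b[i] == 0) {
                break; /* tie at this level, try the next one */
            }
        }
    }
    *order = 0;
    return 0;
}
```

With "hello" and "héllo" stored as code point arrays, `compare_levels()` finds a tie at level 1 and reports a difference at level 2 with `*order == -1`, which matches the ordering shown above.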
BUT
If you have trailing spaces, like:
string string1 = "hello "
string string2 = "héllo "
then the accented hello (string2) is placed before string1, and the two strings are now distinguished on their primary weight:
<1 héllo
<1 hello
Why does the Unicode Collation Algorithm take trailing spaces into account?
Is there some reason behind this?