Questions tagged [uca]

The Unicode Collation Algorithm

The UCA is the Unicode Collation Algorithm, specified in Unicode Technical Standard #10.


10 questions
6
votes
1 answer

How does the handling of combining characters in the Unicode Collation Algorithm work?

I maintain an open-source, pure-Python implementation of the Unicode Collation Algorithm called pyuca. While it meets my needs in sorting Ancient Greek text (and seems to meet the needs of many other people), I'm looking to improve its coverage of…
James Tauber
  • 3,386
  • 6
  • 27
  • 37
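
A note on the combining-character question above: the first step of the UCA is to normalize input to NFD, which separates a base letter from its combining marks before any collation elements are looked up. The following is a minimal sketch of that step only, using Python's standard `unicodedata` module; the helper name `decompose` is made up for illustration, and this is not how pyuca itself is implemented.

```python
import unicodedata

def decompose(text):
    """Return (code point, canonical combining class) pairs for the NFD form."""
    nfd = unicodedata.normalize("NFD", text)
    return [(f"U+{ord(ch):04X}", unicodedata.combining(ch)) for ch in nfd]

# GREEK SMALL LETTER ALPHA WITH TONOS (U+03AC) splits into
# alpha (U+03B1) followed by COMBINING ACUTE ACCENT (U+0301).
print(decompose("\u03ac"))
```

A combining class of 0 marks a starter (the base letter); a non-zero class marks a combining mark that follows it.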
4
votes
1 answer

Invert Unicode String Collation Keys

I have an index which stores text strings for search, both in their original form and their collated form (the collated form is used for searching the index, the original is displayed in the results). The collation is done via the ICU4C implementation,…
scooz
  • 828
  • 2
  • 7
  • 24
3
votes
1 answer

Is there Unicode Collation Algorithm (UCA) code for Delphi?

Collation under the Unicode Technical Standard #10 (UCA), which is a separate thing from being Unicode Compliant, in case you were wondering about that, implies not only ordering/sorting but also comparison, questions of "is string 1 equal to…
Warren P
  • 65,725
  • 40
  • 181
  • 316
3
votes
3 answers

What is the theory behind Unicode collation sorting?

What is the theory behind Unicode sorting? I understand how it works, but I don't understand why they decided on this standard for collation sorting. It seems that when you have two strings to compare, using ucol_strcolliter() for…
user3404884
  • 65
  • 1
  • 10
1
vote
0 answers

Custom MySQL Collation Not Working

My goal is to sort a few numbers the same as a handful of characters, i.e. 4 sorts the same as A or a, and 3 sorts the same as E or e. Why isn't this working? I've added the following to /usr/share/mysql/charsets/Index.xml ...
zevlag
  • 234
  • 2
  • 7
0
votes
0 answers

How to treat different cases of a character as different base letters in Unicode Collation Algorithm (UCA)?

UCA defines several levels for character comparison. The case level is swamped by the stronger levels and only takes effect when the base characters and accents are the same for two strings, so how can I treat the cases as base characters in a…
Xin Zhang
  • 21
  • 2
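
To illustrate the levels this question describes, here is a toy multi-level sort key. It is a sketch under loud assumptions: the weights are simply code points rather than values from the real DUCET table, accents are collected without tracking their position, and `toy_key` and its `case_is_primary` flag are hypothetical names, not part of any UCA library. With the flag off, case only breaks ties after base letters and accents; with it on, case is folded into the strongest level.

```python
import unicodedata

def toy_key(s, case_is_primary=False):
    """Build a (primary, secondary, tertiary) key; toy weights, not DUCET."""
    primary, secondary, tertiary = [], [], []
    for ch in unicodedata.normalize("NFD", s):
        if unicodedata.combining(ch):
            # Combining marks (accents) only contribute at the secondary level.
            secondary.append(ord(ch))
            continue
        base = ch.lower()
        is_upper = 1 if ch != base else 0
        if case_is_primary:
            # Case becomes part of the base-letter weight.
            primary.append((ord(base), is_upper))
        else:
            primary.append((ord(base), 0))
            tertiary.append(is_upper)
    return (primary, secondary, tertiary)

words = ["Ab", "aa", "Aa", "ab"]
print(sorted(words, key=toy_key))
print(sorted(words, key=lambda w: toy_key(w, case_is_primary=True)))
```

The first call groups "aa" with "Aa" and "ab" with "Ab"; the second places every string starting with lowercase "a" before any string starting with "A".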
0
votes
0 answers

Adding a UCA Collation to a Unicode Character Set, why doesn't it work?

In Unicode Locale Data Markup Language (LDML), since version 24, the element and its sub-elements are deprecated. But the MySQL example still uses the deprecated element. The collation defined when I added to MySQL Collation with the latest version of the…
0
votes
1 answer

How to make some punctuation characters indexable in MySQL FULLTEXT indexed field

I have a fulltext indexed field with charset utf8mb4 on MySQL 8.0. I need to be able to search for queries like "km/h" or "A-B", but with the current charset definition, slash and dash are defined as punctuation characters and are therefore not…
Julien
  • 1,302
  • 10
  • 23
0
votes
1 answer

ICU (UCA) support for Frisian collation

In Frisian the y is an i and sorts just after it, see http://download.mimer.com/pub/developer/charts/frisian.htm. I am trying to sort data with the Saxonica XQuery processor using the Frisian language code or collation rules, see…
0
votes
0 answers

Implementing sample code for Unicode collation algorithm

I have the following requirement in my project. I need to sort strings based on the order of the characters provided by the client. For example, order provided by the user: d,a,A,D,z,p,P,Z. So if we have some strings like AaP, aAp, PpZ, pPz. After sorting…
starkk92
  • 5,754
  • 9
  • 43
  • 59
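
For the requirement in this last question, a plain rank lookup may be enough and the full UCA may not be needed. The sketch below assumes a single-level comparison, character by character, with any character missing from the client's order sorted last; `make_key` is a hypothetical helper name, and the result shown is what this sketch produces, not the output the asker expected (the excerpt is cut off before stating it).

```python
def make_key(order):
    """Return a sort-key function for a client-supplied character order."""
    rank = {ch: i for i, ch in enumerate(order)}
    unknown = len(rank)  # assumption: unlisted characters sort after listed ones

    def key(s):
        return [rank.get(ch, unknown) for ch in s]

    return key

order = "d,a,A,D,z,p,P,Z".split(",")
strings = ["AaP", "aAp", "PpZ", "pPz"]
print(sorted(strings, key=make_key(order)))
# prints ['aAp', 'AaP', 'pPz', 'PpZ']
```

Because "a" precedes "A" and "p" precedes "P" in the supplied order, the lowercase-initial strings come first within each letter.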