The Japanese language, I believe, has more than one sort order equivalent to alphabetical order in English.
I believe there's at least one based on pronunciation (I think the kana have used two orders historically) and one based on radical + stroke count. Chinese also has multiple orders, including one based on radical/stroke, but because of Unicode Han Unification the same character can have a different stroke count in Chinese and in Japanese.
As far as I can tell, the standard for sort order in Unicode is CLDR for the data and the UCA for the algorithm, and the reference implementation is ICU.
Implementations generally lag behind standards and this information is proving hard to track down to canonical sources.
If I set up a collator with the language specifier `ja`, which sort order should I expect to be used?
If several are available for Japanese, or are planned to be available at some point, which specifiers should be used for those? For example, the specifier for the traditional alphabetical order of Spanish is `es-u-co-trad`.
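For reference, this is roughly how I have been probing what an implementation actually does. It is only a minimal sketch, assuming a JavaScript/TypeScript runtime that exposes the ECMAScript `Intl.Collator` API (typically backed by ICU). `probeCollation` is a hypothetical helper name of mine, and `ja-u-co-unihan` is just a candidate tag I am guessing at, not something I have confirmed is the right specifier for Japanese.

```typescript
// Hypothetical helper: ask the runtime which locale and collation type it
// actually resolved for a requested BCP 47 tag.
function probeCollation(tag: string): { requested: string; locale: string; collation: string } {
  const resolved = new Intl.Collator(tag).resolvedOptions();
  return {
    requested: tag,
    locale: resolved.locale,       // locale the implementation settled on
    collation: resolved.collation, // "default" when no specific type was honoured
  };
}

// "ja-u-co-unihan" is an assumed candidate; "es-u-co-trad" is the Spanish
// example mentioned above.
for (const tag of ["ja", "ja-u-co-unihan", "es-u-co-trad"]) {
  console.log(probeCollation(tag));
}
```

If the requested `co` value is not supported for that language, the resolved collation should fall back to `default`, so this only tells me what one particular implementation ships, not what the standard says.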