So I have an array with domain names

Array () {
    [0] => Array () {
        [id]=>int
        [name]=>string
        [something]=>string/int...
    }
    [38] => Array () {

    }
}

I need to show the domain names in a table. I fetch them from the MySQL table ordered by name ASC, loop through the array, and convert $domain['name'] so that IDNs are shown in human-readable form rather than as xn--sfasdf-dfg.com. The problem is that these domain names end up between W and Y, because their stored form begins with X. What I want is to sort them by their Unicode name: някво-име.com (it starts with N) should be between M and O, not between W and Y. Does anybody know how this can be done? I have searched Google, but it seems nobody has written about this problem. I tried the PHP functions usort and uasort, but the order was completely wrong.

Thanks, Yani Krystev

Lanexbg

2 Answers

Sorting (collating, to use the technical term) is a complicated matter even within one language, and more so when you mix languages. A domain name string carries no context: you do not know which language it is in, you just have a string of characters. Even among languages that use only ASCII, the sorting order is not the same.

So you will first need to decide which language you want to operate in. Then there are tools that do the collating, but they are not basic tools.

For example, in libc the LC_COLLATE locale setting influences every tool that is correctly written for internationalisation and therefore takes it into account.

ICU (libicu) is another big library that handles things like this, see https://www-01.ibm.com/software/globalization/icu/

PHP has the intl extension that is a wrapper around it. See specifically the Collator class at http://php.net/manual/en/class.collator.php :

Provides string comparison capability with support for appropriate locale-sensitive sort orderings.
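As a rough sketch of how this could be applied to the array in the question: decode each name with idn_to_utf8() and compare the decoded names with a Collator. This assumes the intl extension is installed. The function name sortDomainsByUnicodeName, the 'display' key, the 'bg_BG' locale, and the sample rows are all made up for illustration.

```php
<?php
// Sketch only: requires the intl extension (idn_to_utf8, idn_to_ascii, Collator).

/**
 * Decode each row's stored (ASCII) name to Unicode and sort the rows
 * with a locale-aware collator.
 * 'display' is a hypothetical extra key, not part of the original array.
 */
function sortDomainsByUnicodeName(array $rows, string $locale): array
{
    foreach ($rows as &$row) {
        $decoded = idn_to_utf8($row['name'], IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
        // idn_to_utf8() returns false on failure; fall back to the stored name.
        $row['display'] = ($decoded === false) ? $row['name'] : $decoded;
    }
    unset($row); // break the reference left over from the loop

    $collator = new Collator($locale);
    usort($rows, function (array $a, array $b) use ($collator): int {
        return (int) $collator->compare($a['display'], $b['display']);
    });

    return $rows;
}

// Sample data (assumed): the IDN is encoded at run time rather than hand-typed.
$rows = [
    ['id' => 1, 'name' => 'wikipedia.org'],
    ['id' => 2, 'name' => idn_to_ascii('някво-име.com', IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46)],
    ['id' => 3, 'name' => 'example.com'],
    ['id' => 4, 'name' => 'mozilla.org'],
];

$sorted = sortDomainsByUnicodeName($rows, 'bg_BG');
foreach ($sorted as $row) {
    echo $row['display'], PHP_EOL;
}
```

Note that a collator typically keeps each script together, so the Cyrillic name ends up before or after the Latin names depending on the locale's rules. It does not place it between M and O.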

Patrick Mevzek

някво-име.com does not start with "N"; it starts with the Cyrillic letter "EN" (н), which sounds like the English "N". If you want to sort according to how letters sound, you are going to have a tough time. If you want to sort by letter, it is a different alphabet, and should go either after Z or before A.

I'd decode all the domains to Unicode, treat the English (ASCII) domains as Unicode too, and just sort the resulting list using any normal built-in function for Unicode sorting.
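For completeness, here is a sketch of the "sort by how the letter sounds" route, which is a different technique from the plain Unicode sort suggested above. It builds a Latin sort key with the intl Transliterator and compares the keys as bytes. The helper latinSortKey and the sample names are hypothetical, and ICU's romanisation is only an approximation of how a name sounds, so treat the resulting order as a convenience rather than a linguistically correct one.

```php
<?php
// Sketch only: requires the intl extension (Transliterator).

/**
 * Build a lowercase ASCII-ish sort key for a Unicode domain name.
 * latinSortKey() is a hypothetical helper, not part of PHP.
 */
function latinSortKey(string $name, Transliterator $toLatin): string
{
    $key = $toLatin->transliterate($name);
    // transliterate() returns false on failure; fall back to the original name.
    return ($key === false) ? $name : $key;
}

// Compound ICU transform: romanise, strip diacritics, lowercase.
$toLatin = Transliterator::create('Any-Latin; Latin-ASCII; Lower()');
if ($toLatin === null) {
    throw new RuntimeException('Could not create the transliterator');
}

// Sample names (assumed), already decoded to Unicode.
$names = ['oracle.com', 'някво-име.com', 'mozilla.org'];

usort($names, function (string $a, string $b) use ($toLatin): int {
    return strcmp(latinSortKey($a, $toLatin), latinSortKey($b, $toLatin));
});

print_r($names);
```

Because the key for някво-име.com begins with a Latin "n", the name is compared against the other names as if it were written in Latin letters. For a long list it would be cheaper to compute each key once and sort on the stored keys.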

Sandman4