I am trying to get a language name from its language code, so I run:

function test($local, $fallback)
{
    $bundle = \ResourceBundle::create($local, 'ICUDATA-lang', $fallback);
    if ($bundle === null) {
        return "$local bundle not found";
    }
    $var = $bundle->get('Languages',$fallback);
    return $var->get('fr',$fallback);
}

$locals = ['en', 'en_US', 'foo', 'en_AU', 'en_NZ'];

foreach ($locals as $local) {
    var_dump(test($local, true));
}
echo PHP_EOL;
foreach ($locals as $local) {
    var_dump(test($local, false));
}

Output:

string(6) "French"
string(6) "French"
NULL
NULL
NULL

string(6) "French"
string(22) "en_US bundle not found"
string(20) "foo bundle not found"
NULL
NULL

It returns NULL for Australia (en_AU) and New Zealand (en_NZ), which indicates an Intl error:

Cannot load resource element 'fr': U_MISSING_RESOURCE_ERROR
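
For reference, a message like this can be read back from the bundle after a failed lookup. The following is a minimal sketch, assuming the intl extension is loaded and the same 'ICUDATA-lang' bundle name as above; `describeLookup` is a hypothetical helper name, not part of the extension.

```php
<?php
// Hypothetical helper: looks up a language name and reports the Intl
// error message recorded by the last operation on the bundle.
function describeLookup(string $locale, string $code): array
{
    $bundle = \ResourceBundle::create($locale, 'ICUDATA-lang', true);
    if ($bundle === null) {
        // create() failed: only the global Intl error is available.
        return ['value' => null, 'error' => intl_get_error_message()];
    }

    $languages = $bundle->get('Languages');
    if (!$languages instanceof \ResourceBundle) {
        // The 'Languages' table itself could not be loaded.
        return ['value' => null, 'error' => $bundle->getErrorMessage()];
    }

    // get() returns NULL when the key is missing; the reason is kept
    // on the object and exposed through getErrorMessage().
    $value = $languages->get($code);

    return ['value' => $value, 'error' => $languages->getErrorMessage()];
}
```

Calling `print_r(describeLookup('en_AU', 'fr'));` prints both the value and the recorded error; the exact message depends on the ICU data shipped with the PHP build.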

The third parameter of \ResourceBundle::create is for fallback, which I understand to mean that a missing value should fall back to the parent locale. Interestingly, the parent locale of en_AU is en_001.

Is this a bug, or have I missed something?

Handsome Nerd

3 Answers

0

What you get from

->get('Languages')->get($lang)

is never influenced by whether you loaded the resource bundle with or without fallback.

$locals = ['en', 'en_US', 'foo', 'en_AU', 'en_NZ'];

  1. "en", "en_AU" and "en_NZ" are directly defined in the ICU repository, so all of them can be loaded with or without fallback.
  2. "en_US" is not defined, but it starts with "en_". This means "en_US" can still be loaded, in which case it falls back on "en". The same happens with "en_FooBar"; likewise, 'bas_il_ic' would fall back on 'bas'.
  3. "foo" is not defined at all and cannot fall back. Creating the ResourceBundle won't fail with fallback = true, but all you can get from it are NULLs.

This means the fallback argument only permits a kind of strict/non-strict matching of a locale at the time of ResourceBundle::create(), with the understanding that the parent of locale xx_YY is xx. It has nothing to do with the inheritance created by %%Parent{"xxxxxx"} definitions, which does not follow the parent-locale principle defined above; it is rather a way to share common definitions across locales.

So, the results you have and the documentation are both correct:

With fallback being true:

  • 'en': "French", because locale "en" exists, and "French" is part of en.txt
  • 'en_US': "French", because locale "en_US" not being defined, it falls back on "en", and "French" is part of en.txt
  • 'foo': NULL, because locale "foo" does not exist, nor can it fall back.
  • 'en_AU': NULL, because locale "en_AU" exists and extends "en_001", but neither of them defines "fr".
  • 'en_NZ': same as for "en_AU".

With fallback being false:

  • 'en': "French", because locale "en" exists, and "French" is part of en.txt
  • 'en_US': "en_US bundle not found": kind of obvious, "en_US" is indeed not defined.
  • 'foo': "foo bundle not found": kind of obvious, "foo" is indeed not defined.
  • 'en_AU': same explanation as with fallback being true.
  • 'en_NZ': same explanation as with fallback being true.

What you are probably asking yourself is rather: "Why doesn't the ICU language data of locale xx_YY inherit that of xx?" This question is valid for all locales, not only the English-based ones.

PHP is consistent with what the ICU library provides, but you may question ICU's internal data.
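
If a lookup that walks up the naming-convention parents (xx_YY, then xx) is what you need, it can be done by hand. The following is only a sketch under assumptions: `localeChain` and `languageName` are hypothetical helper names, the bundle name 'ICUDATA-lang' is taken from the question, and the chain is built purely by truncating the locale string, so it ignores ICU's %%Parent links such as "en_001".

```php
<?php
// Hypothetical helper: candidate locales from most to least specific,
// built only by dropping trailing subtags.
function localeChain(string $locale): array
{
    $chain = [];
    $parts = explode('_', $locale);
    while ($parts) {
        $chain[] = implode('_', $parts);
        array_pop($parts);
    }
    return $chain;
}

// Hypothetical helper: returns the first language name found along
// the chain, or null when no candidate defines it.
function languageName(string $code, string $locale): ?string
{
    foreach (localeChain($locale) as $candidate) {
        // Strict matching: only bundles that really exist are used.
        $bundle = \ResourceBundle::create($candidate, 'ICUDATA-lang', false);
        if ($bundle === null) {
            continue;
        }
        $languages = $bundle->get('Languages', false);
        if (!$languages instanceof \ResourceBundle) {
            continue;
        }
        $name = $languages->get($code, false);
        if (is_string($name)) {
            return $name;
        }
    }
    return null;
}
```

For example, `languageName('fr', 'en_AU')` would try "en_AU" first and then "en"; what it returns depends on the ICU data available to the PHP build.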

Patrick Allaert
  • Yes, this is what is demonstrated in the example. The `fallback` option does not work as documented, and the documented way is the one that would be useful. – Handsome Nerd Feb 25 '20 at 12:25
  • I extended the answer. The option works as documented, but the concept of "parent locale" as per the naming convention "en_XX" => "en" is a different thing from ICU's internal inheritance mechanism, in which, e.g., "en_AU" extends "en_001", but neither shares anything with "en". So you should really question ICU's data definition. PHP has nothing to do with this. – Patrick Allaert Feb 25 '20 at 17:39
0

Both ResourceBundle::create and ResourceBundle::get have a fallback parameter, but only the first method actually uses it; the get method simply ignores it. This is not what the documentation says:

fallback

Whether locale should match exactly or fallback to parent locale is allowed.

Using the Cosmopolitan package, it works as expected.

<?php
require_once "vendor/autoload.php";

use Salarmehr\Cosmopolitan\Cosmo;

function test($local)
{
    return Cosmo::create($local)->get('ICUDATA-lang','Languages','fr');
}

$locals = ['en', 'en_US', 'foo', 'en_AU', 'en_NZ'];

foreach ($locals as $local) {
    var_dump(test($local));
}

Output:

string(6) "French"
string(6) "French"
NULL
string(6) "French"
string(6) "French"
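
If the goal is only to turn a language code into a display name, the intl extension's own Locale::getDisplayLanguage() may be enough without reading resource bundles directly. This is a minimal sketch, assuming the intl extension is loaded; `languageNames` is a hypothetical wrapper added for illustration.

```php
<?php
// Hypothetical wrapper: maps each display locale to the name of the
// language $code as rendered in that locale.
function languageNames(string $code, array $displayLocales): array
{
    $names = [];
    foreach ($displayLocales as $displayLocale) {
        // First argument: the locale whose language is being named.
        // Second argument: the locale used to render that name.
        $names[$displayLocale] = \Locale::getDisplayLanguage($code, $displayLocale);
    }
    return $names;
}
```

Calling `var_dump(languageNames('fr', ['en', 'en_US', 'en_AU', 'en_NZ']));` shows the name per display locale; the exact strings come from the ICU data of the PHP build.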
Handsome Nerd
-1

Background

In the ICU data directory that you linked on GitHub (note that you were on a release branch rather than the master branch), there are no files for either en_US or en_UK. On the master branch you can find an en_GB file, which seems to be the proper locale code rather than UK.

Though I cannot say why en_US is not there (I would absolutely expect it, just as you did), it appears that in all your tests where you get "French" as a result, you are not actually loading the locale you asked for. Because that locale cannot be found, the root data is loaded instead.

You already demonstrated that by trying with en_foobar. The same works if you try to load any other nonsense locale that does not exist in that data directory, and the same is true for both en_US and en_UK (since neither exists, as explained before).

As a side note: I think you may have misunderstood the fallback parameter. It means that if the requested locale cannot be loaded, the fallback locale is loaded instead. It does not mean that any data you request and that is missing would be requested from the parent.

Code examples

To make it a bit more clear what I am talking about, here are some examples.

Loading with nonsense locales always loads the whole directory data so you have access to all the root data:

$bundle = \ResourceBundle::create('stackoverflow-is-great', 'ICUDATA-lang', true);
var_dump($bundle->get('Languages')->get('fr')); // string(6) "French"
var_dump($bundle->get('Languages')->get('de')); // string(6) "German"

Loading sub-languages from en_NZ that do not exist:

$bundle = \ResourceBundle::create('en_NZ', 'ICUDATA-lang', true);
var_dump($bundle->get('Languages')->get('fr')); // NULL
var_dump($bundle->get('Languages')->get('de')); // NULL

Loading sub-languages from en_NZ that do exist:

$bundle = \ResourceBundle::create('en_NZ', 'ICUDATA-lang', true);
var_dump($bundle->get('Languages')->get('mi')); // string(6) "Māori"

To put it in a nutshell: It actually works as expected and gives you data where it is available, and no data where none is available.

Debugging

How did I find that out? Based on this user comment on the PHP ResourceBundle docs page. I added a depth output and had a very handy debug function:

function t($rb, $depth = 0) {
    foreach($rb as $k => $v) {
        echo str_repeat('->', $depth);
        if(is_object($v)) {
            print_r($v);
            var_dump($k);
            t($v, $depth + 1); // pass the deeper level without changing it for sibling entries
        } else {
            var_dump($k . " " . $v);
        }
    }
}
$rb = new ResourceBundle('en_UK', 'ICUDATA-lang', true);
var_dump($rb->get('Languages')->get('fr'));

t($rb);

This prints a really long output (which is why I am not adding it here) that looks suspiciously like the root data.

One last side note: en_GB seems to be aliased to en_001 here as well, I am not sure to what effect though.

TL;DR: The first three locales do not really exist in the data set and hence root data is loaded; en_NZ and en_AU work just like they should.

ArSeN
  • Thanks for your attempt. The question is why the `fallback` parameter does not work. That parameter is explained as "Whether locale should match exactly or fallback to parent locale is allowed." Each locale file names its parent locale, and the files are built so that they extend their parent and override only the values that differ from it. So, given a valid locale, it should fall back to its parent when the key is not overridden in the child locale. – Handsome Nerd Feb 20 '20 at 04:04
  • That was my point with the first side note (it was not clear to me that this was basically the question, so thanks for clearing that up): the docs say "Whether locale should match exactly or fallback to parent locale is allowed." From what I can observe, this is exactly what happens. What gives you the idea that sub-locales extend(!) the parent? – ArSeN Feb 20 '20 at 10:03