81

Until now I had always used 'en_UK' for British English. Today I got an error when using it with Zend Framework because that locale is not included in the long list of recognized locales.

Here's just a short extract of that list:

'ee_GH' => true, 'ee_TG' => true, 'ee'    => true, 'el_CY' => true, 'el_GR' => true,
'el'    => true, 'en_AS' => true, 'en_AU' => true, 'en_BE' => true, 'en_BW' => true,
'en_BZ' => true, 'en_CA' => true, 'en_GB' => true, 'en_GU' => true, 'en_HK' => true,
'en_IE' => true, 'en_IN' => true, 'en_JM' => true, 'en_MH' => true, 'en_MP' => true,
'en_MT' => true, 'en_NA' => true, 'en_NZ' => true, 'en_PH' => true, 'en_PK' => true,
'en_SG' => true, 'en_TT' => true, 'en_UM' => true, 'en_US' => true, 'en_VI' => true,
'en_ZA' => true, 'en_ZW' => true, 'en'    => true, 'eo'    => true, 'es_AR' => true,
'es_BO' => true, 'es_CL' => true, 'es_CO' => true, 'es_CR' => true, 'es_DO' => true,

As you can see, there are all kinds of territories with English as their language, and there is also an entry 'en_GB', which I assume stands for Great Britain... but there is no 'en_UK'. Is that just a 'bug' in Zend Framework or is there another reason for that?

markus

5 Answers

100

The correct locale is en_GB, because the country code is GB. Locales use ISO 3166-1 for country codes. The Wikipedia write-up includes:

The codes are chosen, according to the ISO 3166/MA, "to reflect the significant, unique component of the country name in order to allow a visual association between country name and country code".[7] For this reason, common components of country names like "Republic", "Kingdom", "United", "Federal" or "Democratic" are normally not used for deriving the code elements. As a consequence, for example, the United Kingdom is officially assigned the alpha-2 code GB rather than UK, based on its official name "United Kingdom of Great Britain and Northern Ireland" (although UK is reserved on the request of the United Kingdom).

jan
martin clayton
  • Ah, an even better explanation. – Andreas Sep 03 '11 at 22:38
  • 28
    Then why is the 'US' valid? – qdot Sep 03 '11 at 23:53
  • 2
    @qdot Those appear to be guidelines that were used, rather than hard rules. To me 'US' appears to be a reasonable short code for the States. Speaking as a Brit, I think 'GB' is a better fit than 'UK' which seems very bland - but that's just my opinion. – martin clayton Sep 04 '11 at 00:15
  • 18
    The only other possible short code for the United States would be SA (dropping United) but that's taken -- Saudi Arabia and South Africa fought over it. AM is for Armenia, who can't take AR because of Argentina. So we had to settle for US. – David Schwartz Sep 04 '11 at 04:38
  • 7
    Or you could have settled for AA. – GolezTrol Sep 04 '11 at 09:07
  • 3
    What about the case of Serbia? Its ISO code is RS, meaning Republic of Serbia (Republika Srbija). Curious... – Felipe Buccioni Jul 18 '13 at 15:07
  • 1
    @GolezTrol You, mate, have made my day for that! Thank you for the much needed laugh. Not that I don't have many laughs in a day - but it was a good one! As for Martin I'm mixed on that one. I've always preferred the UK but I don't know why. I guess it's a style preference type of thing. Anyway I was thinking it was *en_GB* but I could not remember for sure and I wanted to verify it. Ta. – Pryftan Jan 16 '20 at 16:33
  • Weird. So what do they speak in 'and Northern Ireland'? – EML Mar 13 '23 at 19:00
21

It is indeed a common (?) mistake. en_UK is wrong. The ISO country code for the UK is GB, making the language tag en_GB.
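
As a rough illustration of the fix (a sketch only, assuming a lookup table shaped like the one quoted in the question; `resolveLocale` and its alias table are hypothetical and not part of Zend Framework), the invalid code could be mapped to the valid one before the lookup:

```php
<?php
// Hypothetical helper, NOT a Zend Framework function.
// $knownLocales is assumed to have the same shape as the list in the
// question: locale string => true.
function resolveLocale(string $locale, array $knownLocales): ?string
{
    // Assumed alias table covering only the mistake discussed here.
    $aliases = ['en_UK' => 'en_GB'];

    // Replace a known-wrong code with its valid counterpart.
    if (isset($aliases[$locale])) {
        $locale = $aliases[$locale];
    }

    // Return the locale only if the table recognizes it, otherwise null.
    return isset($knownLocales[$locale]) ? $locale : null;
}

// Tiny stand-in for the real list of recognized locales.
$knownLocales = ['en_GB' => true, 'en_US' => true, 'en' => true];

echo resolveLocale('en_UK', $knownLocales), "\n"; // en_GB
```

Correcting the stored value to en_GB is still the better fix; an alias table like this only papers over data that already contains en_UK.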

GolezTrol
  • But I see it all over the place! Not even the PHP manual is free of en_UK's. – markus Sep 03 '11 at 22:37
  • 2
    There are no ISO locales. There are [ISO language codes](http://en.wikipedia.org/wiki/ISO_639), [ISO country codes](http://en.wikipedia.org/wiki/ISO_3166), and [IETF language tags](http://en.wikipedia.org/wiki/BCP_47). But I guess you mean [C/POSIX locale](http://en.wikipedia.org/wiki/Locale#POSIX-type_platforms). – mercator Sep 03 '11 at 23:47
  • 1
    I referred to the language tag, which is built up from (in this case) two ISO codes. The actual problem in this case is indeed that the country codes (ISO 3166-1 alpha-2) use GB instead of UK. UK is reserved (at the request of the UK), but is invalid to use. But I wanted my answer to merely point out the concrete error (wrong code vs right code) instead of being a history lesson. Nevertheless, it shouldn't be wrong, so I changed that sentence a little bit. – GolezTrol Sep 04 '11 at 09:05
  • 1
    The language tag would be `en-GB`, the POSIX locale is `en_GB`. Yes, I am nitpicking. But I believe ZF doesn't provide any functions to convert between the two, so you'll likely end up with the *invalid* `lang="en_GB"` in your HTML. – mercator Sep 04 '11 at 11:21
  • 1
    ZF does consider both en-GB and en_GB as legal locale strings. – markus Sep 04 '11 at 11:59
  • @markus But the HTML standards do not consider language tags with underscore as "legal", they must be IETF BCP 47 codes in "lang" attributes of HTML tags. Yes, I'm commenting 8 years later... – ADJenks May 03 '19 at 18:30
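
A minimal sketch of the conversion discussed in these comments, assuming plain PHP string handling is enough. `localeToLanguageTag` is a hypothetical helper, not a Zend Framework function, and it only handles simple `language_TERRITORY` locales with an optional codeset or modifier:

```php
<?php
// Hypothetical helper: turns a POSIX-style locale such as "en_GB" or
// "en_GB.UTF-8" into a hyphenated tag such as "en-GB" for an HTML lang
// attribute. It does not validate the language or country codes.
function localeToLanguageTag(string $locale): string
{
    // Drop an optional codeset (".UTF-8") or modifier ("@euro").
    $base = preg_split('/[.@]/', $locale)[0];

    // POSIX locales use an underscore; language tags use a hyphen.
    return str_replace('_', '-', $base);
}

echo localeToLanguageTag('en_GB'), "\n";       // en-GB
echo localeToLanguageTag('en_GB.UTF-8'), "\n"; // en-GB
```

The result could then be written into the markup as `lang="en-GB"` instead of the underscore form.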
7

Imperfect implementations

Starting in 1985, ISO 3166-1 alpha-2 codes have been used in the Domain Name System as country code top-level domains (ccTLDs). The Internet Assigned Numbers Authority currently assigns the ccTLDs mostly following the alpha-2 codes, but with a few exceptions.[9] For example, the United Kingdom, whose alpha-2 code is GB, uses .uk instead of .gb as its ccTLD, as UK is currently exceptionally reserved in ISO 3166-1 on the request of the United Kingdom.

According to Wikipedia: http://en.wikipedia.org/wiki/ISO_3166-1_alpha-2.

So apparently it's an old thing that has stuck or something like that.

EDIT: martin clayton found a better answer.

Andreas
5

You cannot simply invent locales! They are already defined. So "en-gb" is British English, whereas "en-us" is American English. If you use "en-uk", you are just making up a code.

As to why the official code is "en-gb" and not "en-uk", I think basically because there is no such thing as United Kingdom English. The language variant is British. Note that the difference between Great Britain and the United Kingdom is:

UK = GB + Northern Ireland

So by naming the variant "en-UK" it's as if you are saying that you want to include the way they speak in Northern Ireland within this category. But why would you want to do that? From a linguistic perspective it makes no sense, as Northern Ireland English is closer to Irish English (en-IE) than British English.

Linguist
  • I think this answer is interesting (from a linguistic perspective :-) Would be interesting to hear the rationale from those who downvoted it. (even though it doesn't add the full background as in the @martin_clayton answer) – Per Lundberg Jul 27 '14 at 18:47
  • 1
    @Per: it's linguistically interesting, it just happens to be wrong because in the context "GB" is the ISO country code for the whole of the United Kingdom (counter-intuitively including NI), not just for England, Wales and Scotland. No doubt there are people in NI who'd prefer the ISO country code to be UK. – Steve Jessop Jan 10 '17 at 11:14
  • 1
    @SteveJessop Interesting. That's probably why the UK uses .uk as their TLD; it would be more ISO-correct to use .gb but it doesn't really reflect the reality on the ground where the country is really "United Kingdom of Great Britain AND Northern Ireland"... – Per Lundberg Jan 10 '17 at 14:51
4

The correct country code in ISO 3166 for the United Kingdom of Great Britain and Northern Ireland is, for some reason, "GB", so the locale code should be en_GB, and not en_UK as noted here.

en_GB is also used by glibc, GNOME, KDE, etc.

Also, I like the fact that many previous bug fixes in open-source projects are related to this, for example in Apache HTTP 1.3 and the Roundcube project.

Mouna Cheikhna