To support software internationalization, many programming languages and platforms provide a means of obtaining localized resources for the UI that is shown to the user (e.g. Java's java.util.ResourceBundle class). Often, if resources for the user's preferred locale are not available, a fallback mechanism, or locale resolution process, attempts to locate the nearest-matching resources among the available sets. For example, if resources for en-US are not available, the system commonly attempts to find resources for en.

The locale resolution process seems nearly the same for many languages' and platforms' resource bundle solutions. Are they following some standard locale resolution algorithm, or, if not, does such a standard exist?

j0k
Daniel Trebbien
  • They (i18n professionals who design such features) follow best practices. Best practices will be more or less obvious when you know something about territories (~countries) and languages. The simple fallback mechanism described by Tom was part of Java up to version 6. Now, with Java 7 and BCP 47, it is far more complicated: see the Chinese locales, for example (zh-SG and zh-CN => zh-Hans; zh-TW, zh-HK, and zh-MO => zh-Hant). BTW, notice that I am using language tags... – Paweł Dyda Dec 27 '11 at 20:31

3 Answers

There is apparently RFC 4647, Matching of Language Tags, which describes the syntax of "language-ranges" for specifying the list of a user's preferred languages, as well as the "filtering" and "lookup" mechanisms for comparing and matching language-ranges to RFC 4646 language tags. RFC 4647 describes these mechanisms as:

Filtering produces a (potentially empty) set of language tags, whereas lookup produces a single language tag.
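
The difference between the two mechanisms can be sketched in a few lines of Python. This is a minimal sketch, not a complete RFC 4647 implementation: the function names `basic_filter` and `lookup` and the sample tags are hypothetical, and extended filtering and tag validation are left out. Filtering treats a range as a prefix of the tag, while lookup progressively truncates each range until it equals an available tag.

```python
def basic_filter(ranges, tags):
    """Return every available tag matched by any range (prefix match on subtags)."""
    matched = []
    for rng in ranges:
        r = rng.lower()
        for tag in tags:
            t = tag.lower()
            # A range matches a tag that equals it or extends it with "-subtag".
            if (r == "*" or t == r or t.startswith(r + "-")) and tag not in matched:
                matched.append(tag)
    return matched


def lookup(ranges, tags, default=None):
    """Return the single best tag by progressively truncating each range."""
    by_lower = {t.lower(): t for t in tags}
    for rng in ranges:  # ranges are assumed to be in the user's priority order
        if rng == "*":
            continue  # this sketch skips the wildcard; it carries no usable information
        subtags = rng.lower().split("-")
        while subtags:
            candidate = "-".join(subtags)
            if candidate in by_lower:
                return by_lower[candidate]
            subtags.pop()
            # Also drop a trailing single-character subtag such as "x".
            while subtags and len(subtags[-1]) == 1:
                subtags.pop()
    return default


available = ["en", "en-GB", "de", "zh-Hant"]  # hypothetical resource tags
filtered = basic_filter(["en"], available)
best = lookup(["en-US"], available, default="de")
```

With these sample tags, filtering on `en` yields both `en` and `en-GB`, while looking up `en-US` yields only `en`.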

Community
Daniel Trebbien

The Unicode Common Locale Data Repository (CLDR) has a proposed (as of 2015) matching algorithm based on language distance. Without the distance data this is not yet a usable solution, but it is worth watching for the future.

Doug Domeny

I'm not aware of a standard per se.

However, the algorithm being used is a trivial consequence of the fact that locales are hierarchical. There is a (notional) root locale with no name. Beneath this are language-only locales (en, fr, etc). Beneath those are national locales (en_GB, en_US, etc). Beneath those are, optionally, variant locales (en_GB_Yorkshire, en_GB_cockney, etc - for realistic examples, look at Norway).

The natural way to find an appropriate resource is to start with the lowest, most specific, locale you can, and walk up the tree until you find something. So, starting with en_US_TX, you step up to en_US, then en, then the root.
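
The walk up the tree can be sketched as follows. This is only an illustration under simple assumptions: locale names use underscore-separated parts as in the examples above, the root locale is represented by the empty string, and `candidate_chain`, `resolve`, and the resource dictionary are hypothetical names rather than any platform's API.

```python
def candidate_chain(locale):
    """List the locale and its ancestors, most specific first, ending at the root ('')."""
    parts = locale.split("_") if locale else []
    chain = ["_".join(parts[:i]) for i in range(len(parts), 0, -1)]
    chain.append("")  # the notional, unnamed root locale
    return chain


def resolve(locale, resources):
    """Return (locale_name, resource) for the first ancestor that has resources, else None."""
    for name in candidate_chain(locale):
        if name in resources:
            return name, resources[name]
    return None


# Hypothetical resource sets keyed by locale name.
resources = {"": "Hello (root)", "en": "Hello", "en_GB": "Hello, mate"}
chain = candidate_chain("en_US_TX")
found = resolve("en_US_TX", resources)
```

Here `chain` is `["en_US_TX", "en_US", "en", ""]`, and `found` is the `en` entry, because neither `en_US_TX` nor `en_US` has resources. Sibling locales such as `en_GB` are never consulted, since the walk only moves upward.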

Tom Anderson
  • Most of the time the resources will be provided by the application writer such that there is a hierarchy, but it could be that an app provides resources for `de` (the default) and `en-GB`. If the user's locale is `en-US`, then a strictly hierarchical resolution would result in `de`. It would be preferable in this case to move sideways in the hierarchy. Also, perhaps if resources for one language are not available, the nearest match should be a similar language (e.g. missing Ukrainian resources might fall back to Russian resources rather than the default English resources). – Daniel Trebbien Dec 27 '11 at 18:15
  • In that case, a strictly hierarchical resolution would result in nothing being found. Neither de nor en_GB is a match for en_US. Your suggested sideways moves both seem like a bad idea to me: you should either support a locale properly, or not at all. It is therefore important to let users choose their locale: Ukrainians might choose Russian if that's the best they can get, but it should not be dumped on them as a best fit for Ukrainian. – Tom Anderson Dec 27 '11 at 18:29
  • How can I reuse JDK 7's locale resolution code, given a locale in question and a few existing known locales? – curious1 Jun 02 '13 at 15:24