
Today I came across the ECMAScript Internationalization API while researching a sane way to format numbers. I tested the German locale by calling

Intl.NumberFormat("de").format(10000.23)

on the console in Firefox and Chrome, which returns "10.000,23".
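To see which characters the formatter actually inserts, the result can be broken into typed parts with `formatToParts`. The following is a minimal sketch; `describeParts` is a hypothetical helper name, and the printed values depend on the locale data shipped with the runtime.

```javascript
// Sketch: list the typed parts that Intl.NumberFormat produces for a locale.
// `describeParts` is a hypothetical helper, not part of the Intl API.
function describeParts(locale, value) {
  return new Intl.NumberFormat(locale)
    .formatToParts(value)
    .map((part) => `${part.type}:${part.value}`);
}

// The "group" entry is the thousands separator chosen by the locale data.
console.log(describeParts("de", 10000.23));
```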

However, this form of number formatting is discouraged in German by the DIN 5008 standard (and by the official language reference, "Duden"), which says that a thin space (\u2009) should be used as the thousands separator.

Who defined the use of this thousands separator for German localisation? The Unicode Consortium or the browser vendors?

(Yes, I am aware that some programs and people use the dot for German localisation.)

Pat Mächler

2 Answers

The definition stems from the CLDR (Unicode Common Locale Data Repository).

I filed a bug report about this issue there.

Pat Mächler

It's partly a matter of "better safe than sorry". DIN 5008, section 6.4, says (my paraphrase): "The separator for amounts of money should be the period". How would a program know that a number is an amount of money? Typography with spaces is also a problem in HTML and in most programming languages. You can easily replace periods with spaces, but the other way around is more complicated.
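One way to do such a replacement without touching the decimal separator is to work on the typed parts rather than on the finished string. This is only a sketch under assumptions: `formatWithGroupSeparator` is a hypothetical helper, and the choice of \u202f as the replacement is an illustration, not something the locale data prescribes.

```javascript
// Sketch: format with the locale's own rules, then swap only the
// group (thousands) separator. Hypothetical helper, not an Intl API.
function formatWithGroupSeparator(value, locale, separator) {
  return new Intl.NumberFormat(locale)
    .formatToParts(value)
    .map((part) => (part.type === "group" ? separator : part.value))
    .join("");
}

// Assumption for illustration: use NARROW NO-BREAK SPACE for grouping.
const NNBSP = "\u202f";
console.log(formatWithGroupSeparator(10000.23, "de", NNBSP));
```

Because only parts of type "group" are replaced, a period or comma used as the decimal separator is left alone, which a plain string replace could not guarantee.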

So it's not a bug; it was done intentionally.

deamentiaemundi
  • It is true that section 6.4 suggests deviating from the number formatting standard for currencies for "security reasons", but I would find it hard to justify making this the default in the CLDR when DIN actually defines a different default. I would rather assume that in such a case it is the program's responsibility to treat currencies differently. Furthermore, I don't think that "spaces are hard" is a good justification in a localization context either, since for other locales (e.g. Swedish) the CLDR defines the thousands separator as a space character. – Pat Mächler Nov 27 '15 at 17:01
  • OK, if it's already done with spaces elsewhere, that makes *that* argument moot, of course. BTW: spaces are not *hard* per se, just more complex to handle because the space (ASCII 0x20) is a very common token delimiter. And the Swedish version seems to be wrong too: the space there has to be a non-breaking space (e.g. \u00a0 or, in the German case, \u202f). Or has that changed? – deamentiaemundi Nov 27 '15 at 17:22
  • Sorry for being imprecise: \u2009 (thin space) should be used, not the ASCII 0x20 character. I have corrected my post above accordingly. – Pat Mächler Nov 27 '15 at 17:35
  • \u2009 is too narrow (.2em or less); I learned (a loooong time ago) to use a quarter em (.25em), so \u202f would be a better fit. Also, \u202f is non-breaking, which \u2009 is not. – deamentiaemundi Nov 27 '15 at 17:49
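Since the space characters mentioned in these comments look alike on screen, it can help to print the code point of whatever group separator a runtime actually emits. A small sketch follows; the helper names are hypothetical, and the result for each locale depends on the CLDR version bundled with the browser or Node.js build, so no particular output is claimed here.

```javascript
// Code points discussed in the comments, with their Unicode names.
const CANDIDATES = {
  "U+0020": "SPACE",
  "U+00A0": "NO-BREAK SPACE",
  "U+2009": "THIN SPACE",
  "U+202F": "NARROW NO-BREAK SPACE",
};

// Hypothetical helper: render a character's first code point as "U+XXXX".
function codePointLabel(ch) {
  return "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
}

// Hypothetical helper: name the group separator a locale uses at runtime.
function describeGroupSeparator(locale) {
  const part = new Intl.NumberFormat(locale)
    .formatToParts(10000.23)
    .find((p) => p.type === "group");
  if (!part) return "no grouping";
  const label = codePointLabel(part.value);
  return label + " " + (CANDIDATES[label] || "(other)");
}

for (const locale of ["de", "sv"]) {
  console.log(locale, describeGroupSeparator(locale));
}
```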