Update: Java 6 introduced Locale.ROOT, which is described as:
This is regarded as the base locale of all locales, and is used as the language/country neutral locale for the locale sensitive operations.
This is probably better than using US, but I haven't checked it against the code below.
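As a minimal sketch of what using Locale.ROOT looks like in practice (the class and method names here are hypothetical, purely for illustration):

```java
import java.util.Locale;

// Hypothetical helper class; the name is illustrative only.
class RootLocaleUpper {
    // Upper-case using the language/country-neutral root locale,
    // so the result does not depend on the JVM's default locale.
    static String upperNeutral(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(upperNeutral("title")); // prints TITLE
    }
}
```

The point of passing the locale explicitly is that the no-argument toUpperCase() uses the default locale, which may differ between machines.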
No, that's basically the right way to go. While there are differences between the US culture and the invariant culture in terms of formatting, I don't believe they affect casing rules.
EDIT: Actually, a quick test program shows there are characters which .NET upper-cases differently in the US culture than in the invariant culture:
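    using System;
    using System.Globalization;

    class Test
    {
        static void Main()
        {
            CultureInfo us = new CultureInfo("en-US");
            // Check every UTF-16 code unit as a single-character string.
            for (int i = 0; i < 65536; i++)
            {
                char c = (char) i;
                string s = c.ToString();
                // Report code units whose upper-case form differs
                // between the invariant culture and en-US.
                if (s.ToUpperInvariant() != s.ToUpper(us))
                {
                    Console.WriteLine(i.ToString("x4"));
                }
            }
        }
    }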
Output:
00b5
0131
017f
01c5
01c8
01cb
01f2
0345
0390
03b0
03c2
03d0
03d1
03d5
03d6
03f0
03f1
03f5
1e9b
1fbe
I don't have time to look at these right now, but it's worth investigating. I don't know if the same differences would apply in Java - you probably want to take a sample of them and work out what you want your code to do.
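For anyone wanting to run the equivalent check in Java, here is a rough port of the test above. It's only a sketch: the class and method names are made up, and it compares against Locale.ROOT rather than anything called "invariant". I'd expect it to print nothing for Locale.US, but that's worth verifying on your own JDK rather than taking on trust.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical class name; a rough Java port of the .NET test above.
class UpperCaseDiff {
    // Returns every UTF-16 code unit whose single-character string
    // upper-cases differently under `locale` than under Locale.ROOT.
    static List<Integer> differences(Locale locale) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < 65536; i++) {
            String s = String.valueOf((char) i);
            if (!s.toUpperCase(Locale.ROOT).equals(s.toUpperCase(locale))) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Print the differing code units as four hex digits, one per line.
        for (int codeUnit : differences(Locale.US)) {
            System.out.println(String.format("%04x", codeUnit));
        }
    }
}
```

Passing a different locale (for example Locale.forLanguageTag("tr")) to differences() gives you a sample to inspect for that locale.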
EDIT: And just to be complete, it's worth mentioning that the test above only checks individual characters... whereas you're really upper-casing whole strings, which can make a difference.
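To illustrate the character-versus-string point in Java: Character.toUpperCase is a one-to-one mapping, whereas String.toUpperCase can change the length of the string. The sketch below uses the German sharp s (U+00DF) as the example; the class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical class name; contrasts per-char and whole-string upper-casing.
class CharVersusString {
    // Upper-cases each char independently, so the length can never change.
    static String upperPerChar(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            sb.append(Character.toUpperCase(s.charAt(i)));
        }
        return sb.toString();
    }

    // Upper-cases the whole string, which allows one-to-many mappings.
    static String upperWhole(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String word = "stra\u00dfe"; // "strasse" written with a sharp s
        System.out.println(upperPerChar(word)); // sharp s is left unchanged
        System.out.println(upperWhole(word));   // sharp s becomes "SS"
    }
}
```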
Looking at the Java code for upper-casing, it appears to have locale-specific behaviour only for tr, az and lt. These are language codes rather than countries: tr is Turkish, az is Azerbaijani and lt is Lithuanian.
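Here's a small sketch of what that locale-specific behaviour looks like for the dotted/dotless i. The helper name is hypothetical, and Locale.forLanguageTag needs Java 7 or later (on Java 6 you'd use new Locale("tr") instead).

```java
import java.util.Locale;

// Hypothetical class name; shows locale-sensitive upper-casing of 'i'.
class TurkishUpper {
    // Upper-cases s using the locale for the given language tag, e.g. "tr".
    static String upper(String s, String languageTag) {
        return s.toUpperCase(Locale.forLanguageTag(languageTag));
    }

    public static void main(String[] args) {
        System.out.println(upper("title", "en")); // TITLE
        // In Turkish and Azerbaijani, 'i' maps to U+0130
        // (capital I with dot above) rather than plain 'I'.
        System.out.println(upper("title", "tr"));
        System.out.println(upper("title", "az"));
    }
}
```

If the strings are identifiers, keys or protocol tokens rather than text shown to users, this is exactly the behaviour you're trying to avoid by specifying a neutral locale.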