Java equivalent of Invariant Culture

Question

I am converting the following C# code to Java. Is there a Java equivalent to the .NET concept of Invariant Culture?

string upper = myString.ToUpperInvariant();

Since the Invariant Culture is really just the US culture, I could just do something like this in Java, but I'm wondering if there is a better way:

String upper = myString.toUpperCase(Locale.US);

Jon Skeet · Accepted Answer · 2017-10-13T13:57:22.807

Update: Java 6 introduced Locale.ROOT which is described as:

This is regarded as the base locale of all locales, and is used as the language/country neutral locale for the locale sensitive operations.

This is probably better than using US, but I haven't checked it against the code below.
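A minimal sketch of what using Locale.ROOT could look like. The class name InvariantUpper and the helper toUpperInvariant are hypothetical, chosen only to mirror the .NET method name. The main method temporarily switches the default locale to Turkish to show that the helper does not depend on it:

```java
import java.util.Locale;

final class InvariantUpper {
    // Hypothetical helper standing in for .NET's ToUpperInvariant().
    static String toUpperInvariant(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Locale saved = Locale.getDefault();
        try {
            // Simulate a machine whose default locale is Turkish.
            Locale.setDefault(Locale.forLanguageTag("tr-TR"));
            // No-argument toUpperCase() uses the default locale: 'i' becomes U+0130.
            System.out.println("title".toUpperCase());
            // The helper ignores the default locale: prints TITLE.
            System.out.println(toUpperInvariant("title"));
        } finally {
            Locale.setDefault(saved); // restore the original default
        }
    }
}
```

The point of the explicit locale argument is that the result no longer varies with the machine the code runs on.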

No, that's basically the right way to go. While there are differences between the US culture and the invariant culture in terms of formatting, I don't believe they affect casing rules.

EDIT: Actually, a quick test program shows that .NET upper-cases some characters differently in the US culture than in the invariant culture:

using System;
using System.Globalization;

class Test
{
    static void Main()
    {
        CultureInfo us = new CultureInfo("en-US");
        for (int i = 0; i < 65536; i++)
        {
            char c = (char) i;
            string s = c.ToString();
            if (s.ToUpperInvariant() != s.ToUpper(us))
            {
                Console.WriteLine(i.ToString("x4"));
            }
        }
    }
}

Output:

00b5
0131
017f
01c5
01c8
01cb
01f2
0345
0390
03b0
03c2
03d0
03d1
03d5
03d6
03f0
03f1
03f5
1e9b
1fbe

I don't have time to look at these right now, but it's worth investigating. I don't know if the same differences would apply in Java - you probably want to take a sample of them and work out what you want your code to do.
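To check whether the same differences apply in Java, a rough port of the test program above might look like this. The class name UpperCaseDiff and the method differingChars are made up for this sketch. It compares Locale.US against Locale.ROOT one UTF-16 char at a time and prints every char whose upper-cased form differs. If it prints nothing on your JDK, the two locales agree for single characters there:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class UpperCaseDiff {
    // Returns every UTF-16 code unit whose single-character upper-casing
    // differs between the two given locales.
    static List<Integer> differingChars(Locale a, Locale b) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < 65536; i++) {
            String s = String.valueOf((char) i);
            if (!s.toUpperCase(a).equals(s.toUpperCase(b))) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Same output format as the C# program: four hex digits per line.
        for (int c : differingChars(Locale.US, Locale.ROOT)) {
            System.out.println(String.format("%04x", c));
        }
    }
}
```

Passing a different pair of locales to differingChars gives the same kind of comparison for any other locale you care about.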

EDIT: And for completeness, it's worth mentioning that this only checks individual characters, whereas you're really upper-casing whole strings, which can make a difference.

Looking at the Java code for upper-casing, it appears to have locale-specific behaviour only for the tr, az and lt languages. Those are language codes rather than countries: tr is Turkish, az is Azerbaijani and lt is Lithuanian.
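As a quick illustration of that locale-specific behaviour, the sketch below upper-cases the letter "i" under a few locales and prints the resulting code point. The class LocaleSensitiveUpper and its upper helper are hypothetical names used only for this example:

```java
import java.util.Locale;

final class LocaleSensitiveUpper {
    // Upper-cases s using the locale for the given BCP 47 language tag.
    static String upper(String s, String languageTag) {
        return s.toUpperCase(Locale.forLanguageTag(languageTag));
    }

    public static void main(String[] args) {
        for (String tag : new String[] {"tr", "az", "lt", "en-US"}) {
            String u = upper("i", tag);
            // Turkish and Azerbaijani map 'i' to U+0130 (capital I with dot above).
            System.out.printf("%-5s -> U+%04X%n", tag, (int) u.charAt(0));
        }
        String root = "i".toUpperCase(Locale.ROOT);
        System.out.printf("ROOT  -> U+%04X%n", (int) root.charAt(0));
    }
}
```

Lithuanian is included in the loop for completeness. Its special rules concern combining dots, so a lone "i" may not show any difference.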

What about [Locale.ROOT](https://docs.oracle.com/javase/8/docs/api/java/util/Locale.html#ROOT)? (Per [related question/answer here](https://stackoverflow.com/questions/41108804/what-is-the-c-sharp-equivalent-of-javas-locale-root-and-locale-getdefault).) — Av Pinzur, Oct 13 '17 at 13:52
@AvPinzur: Yup, that's probably the most appropriate. I'll update to mention it. — Jon Skeet, Oct 13 '17 at 13:56

score 0 · Answer 2 · answered Mar 15 '11 at 21:44

This looks like the most invariant you can get without using any Locale. If you care about characters outside the Basic Multilingual Plane (those that need two chars in UTF-16), you will need a code point based solution (if you don't know about code points, you don't need it :) )

static String toUpperCase(String s) {
    // Upper-case each UTF-16 char with the locale-independent Character.toUpperCase(char).
    char[] c = s.toCharArray();
    for (int i = 0; i < c.length; i++) {
        c[i] = Character.toUpperCase(c[i]);
    }
    return String.copyValueOf(c);
}
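For reference, a code point based variant could be sketched as follows. The names CodePointUpper and toUpperCaseByCodePoint are invented for this example. It handles supplementary characters, but like the char-based version it only applies one-to-one mappings, so it cannot expand a character such as U+00DF (sharp s) into "SS" the way String.toUpperCase does:

```java
import java.util.Locale;

final class CodePointUpper {
    // Upper-cases code point by code point using the locale-free
    // Character.toUpperCase(int), so surrogate pairs are handled as one unit.
    static String toUpperCaseByCodePoint(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        s.codePoints()
         .map(cp -> Character.toUpperCase(cp))
         .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+10428 is a lower-case Deseret letter outside the BMP;
        // its upper-case counterpart is U+10400.
        String deseret = new String(Character.toChars(0x10428));
        System.out.printf("U+%X%n", toUpperCaseByCodePoint(deseret).codePointAt(0));

        // One-to-one mapping leaves the sharp s untouched...
        System.out.println(toUpperCaseByCodePoint("stra\u00dfe"));
        // ...whereas the String method expands it: prints STRASSE.
        System.out.println("stra\u00dfe".toUpperCase(Locale.ROOT));
    }
}
```

Whether the one-to-one or the expanding behaviour is closer to what the original C# code needs is something to check against real input.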
