
I have to deal with SimpleDateFormat, but I have an issue with week-year values.

To narrow down the problem, I wrote the simple Java code below and found that it returns two different results with apparently the same settings (just by forcing the locale on the command line). The problem occurs only on a Windows (US-configured) machine: if I run the same test on a Linux (CentOS) machine, everything is OK.

The JVM on Windows is Zulu 8 OpenJDK 1.8.0_282 (but I seem to get the same behavior with Oracle JDK 8), while on Linux it is Red Hat OpenJDK 1.8.0_272.

Here is the source code:

import java.util.Locale;
import java.util.Calendar;
import java.util.TimeZone;
import java.text.SimpleDateFormat;
import java.text.DateFormat;
import java.text.ParseException;

import java.time.LocalDate;
import java.time.temporal.WeekFields;

public class TestDate {
    public static void main(String args[]) throws ParseException {
        Locale currentLocale = Locale.getDefault();

        System.out.println(System.getProperty("java.vendor"));
        System.out.println(System.getProperty("java.version"));
        System.out.println("==============");
        System.out.printf("%20s = %s%n", "getDisplayLanguage", currentLocale.getDisplayLanguage());
        System.out.printf("%20s = %s%n", "getDisplayCountry", currentLocale.getDisplayCountry());
        System.out.printf("%20s = %s%n", "getDisplayVariant", currentLocale.getDisplayVariant());

        System.out.printf("%20s = %s%n", "getLanguage", currentLocale.getLanguage());
        System.out.printf("%20s = %s%n", "getCountry", currentLocale.getCountry());

        System.out.printf("%20s = %s%n", "user.country", System.getProperty("user.country"));
        System.out.printf("%20s = %s%n", "user.language", System.getProperty("user.language"));
        System.out.printf("%20s = %s%n", "user.variant", System.getProperty("user.variant"));

        System.out.println("==============");

        Calendar c = Calendar.getInstance();
        System.out.println("1st day of week / minimal days in 1st week : " + c.getFirstDayOfWeek() + " / " + c.getMinimalDaysInFirstWeek());

        System.out.println("==============");

        LocalDate date1 = LocalDate.of(2020, 12, 31);
        LocalDate date2 = LocalDate.of(2021, 1, 1);

        DateFormat df_date = new java.text.SimpleDateFormat("dd/MM/yyyy");
        DateFormat df_week = new java.text.SimpleDateFormat("YYYY-ww");

        System.out.printf("%20s | %10s | %10s%n", "", df_date.format(java.sql.Date.valueOf(date1)), df_date.format(java.sql.Date.valueOf(date2)));
        System.out.printf("%20s | %10s | %10s%n", "SimpleDateFormat", df_week.format(java.sql.Date.valueOf(date1)), df_week.format(java.sql.Date.valueOf(date2)));

        System.out.printf("%20s | %7d-%02d | %7d-%02d%n", "WeekFields",
                                        date1.get(WeekFields.ISO.weekBasedYear()), date1.get(WeekFields.ISO.weekOfWeekBasedYear()),
                                        date2.get(WeekFields.ISO.weekBasedYear()), date2.get(WeekFields.ISO.weekOfWeekBasedYear()));

    }
}

And here are the results (the second one is the expected one):

>java TestDate
Azul Systems, Inc.
1.8.0_282
==============
  getDisplayLanguage = English
   getDisplayCountry = United States
   getDisplayVariant =
         getLanguage = en
          getCountry = US
        user.country = US
       user.language = en
        user.variant =
==============
1st day of week / minimal days in 1st week : 2 / 4
==============
                     | 31/12/2020 | 01/01/2021
    SimpleDateFormat |    2020-53 |    2020-53
          WeekFields |    2020-53 |    2020-53

>java -Duser.language=en -Duser.country=US -Duser.variant= TestDate
Azul Systems, Inc.
1.8.0_282
==============
  getDisplayLanguage = English
   getDisplayCountry = United States
   getDisplayVariant =
         getLanguage = en
          getCountry = US
        user.country = US
       user.language = en
        user.variant =
==============
1st day of week / minimal days in 1st week : 1 / 1
==============
                     | 31/12/2020 | 01/01/2021
    SimpleDateFormat |    2021-01 |    2021-01
          WeekFields |    2020-53 |    2020-53

Both seem to use the same locale settings, but SimpleDateFormat returns a different week / week-year. Am I missing some locale settings?

Thank you for your help.

EDIT with Oracle JDK:

>java TestDate
Oracle Corporation
1.8.0_202
==============
  getDisplayLanguage = English
   getDisplayCountry = United States
   getDisplayVariant =
         getLanguage = en
          getCountry = US
        user.country = US
       user.language = en
        user.variant =
==============
1st day of week / minimal days in 1st week : 2 / 4
==============
                     | 31/12/2020 | 01/01/2021
    SimpleDateFormat |    2020-53 |    2020-53
          WeekFields |    2020-53 |    2020-53

>java -Duser.language=en -Duser.country=US -Duser.variant= TestDate
Oracle Corporation
1.8.0_202
==============
  getDisplayLanguage = English
   getDisplayCountry = United States
   getDisplayVariant =
         getLanguage = en
          getCountry = US
        user.country = US
       user.language = en
        user.variant =
==============
1st day of week / minimal days in 1st week : 1 / 1
==============
                     | 31/12/2020 | 01/01/2021
    SimpleDateFormat |    2021-01 |    2021-01
          WeekFields |    2020-53 |    2020-53

EDIT Calendar default Locale: As pointed out by Scratte, Calendar and SimpleDateFormat use a default Locale. I had a look at the SimpleDateFormat source code and it uses Locale.getDefault(Locale.Category.FORMAT) as its default Locale, which turns out to be different from the Locale.getDefault() I used in my code.

I have finally understood why I got two different behaviors between the two runs: I was not displaying the correct Locale (I was not aware of the three distinct default Locales; thank you Ole V.V. for clarifying this).

TL;DR

SimpleDateFormat uses Locale.getDefault(Locale.Category.FORMAT), whereas my Java code was displaying the values of Locale.getDefault(). The latter was always en_US, but the former was fr_FR or en_US depending on the command line I used. That's why I got two different outputs for the week / week-year.
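To make the link between the locale and the week numbers concrete, here is a minimal sketch that asks Calendar for the week rules of fr_FR and en_US explicitly, instead of relying on any default. The class and method names (WeekRulesByLocale, weekRules, weekOf) are made up for this illustration, and it assumes the JDK's bundled locale data. It should report the same 2 / 4 and 1 / 1 week rules as the two runs above.

```java
import java.util.Calendar;
import java.util.Locale;

// Hypothetical helper class, only for illustration.
class WeekRulesByLocale {

    // Returns "firstDayOfWeek/minimalDaysInFirstWeek" for the given locale.
    static String weekRules(Locale locale) {
        Calendar c = Calendar.getInstance(locale);
        return c.getFirstDayOfWeek() + "/" + c.getMinimalDaysInFirstWeek();
    }

    // Returns "week-year-week number" for a date under the locale's week rules.
    // The month is zero-based, as usual for Calendar (use Calendar.DECEMBER etc.).
    static String weekOf(Locale locale, int year, int month, int day) {
        Calendar c = Calendar.getInstance(locale);
        c.clear();
        c.set(year, month, day);
        return String.format(Locale.ROOT, "%04d-%02d",
                c.getWeekYear(), c.get(Calendar.WEEK_OF_YEAR));
    }

    public static void main(String[] args) {
        for (Locale locale : new Locale[] { Locale.FRANCE, Locale.US }) {
            System.out.println(locale + " : rules " + weekRules(locale)
                    + ", 31/12/2020 -> " + weekOf(locale, 2020, Calendar.DECEMBER, 31));
        }
    }
}
```

With Monday as first day and 4 minimal days (fr_FR), 31/12/2020 still belongs to week 53 of 2020. With Sunday as first day and 1 minimal day (en_US), the week containing 1 January is already week 1 of 2021.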

Finally, the JVM parameters -Duser.language= / -Duser.country= / -Duser.variant= are the solution (they force all three default Locales)!

This new code shows the difference between the three default Locales:

import java.sql.Date;
import java.util.Locale;
import java.util.Calendar;
import java.util.TimeZone;
import java.text.SimpleDateFormat;
import java.text.DateFormat;
import java.text.ParseException;

import java.time.LocalDate;
import java.time.temporal.WeekFields;

public class TestDate {
    public static void main(String args[]) throws ParseException {
        Locale cL = Locale.getDefault();
        Locale cLD = Locale.getDefault(Locale.Category.DISPLAY);
        Locale cLF = Locale.getDefault(Locale.Category.FORMAT);

        System.out.println(System.getProperty("java.vendor"));
        System.out.println(System.getProperty("java.version"));
        System.out.println("==============");
        System.out.printf("%20s | %15s | %15s | %15s%n", "Locale.getDefault(.)", "", "DISPLAY", "FORMAT");
        System.out.printf("%20s | %15s | %15s | %15s%n", "getDisplayLanguage", cL.getDisplayLanguage(), cLD.getDisplayLanguage(), cLF.getDisplayLanguage());
        System.out.printf("%20s | %15s | %15s | %15s%n", "getDisplayCountry", cL.getDisplayCountry(), cLD.getDisplayCountry(), cLF.getDisplayCountry());
        System.out.printf("%20s | %15s | %15s | %15s%n", "getDisplayVariant", cL.getDisplayVariant(), cLD.getDisplayVariant(), cLF.getDisplayVariant());
        System.out.printf("%20s | %15s | %15s | %15s%n", "getLanguage", cL.getLanguage(), cLD.getLanguage(), cLF.getLanguage());
        System.out.printf("%20s | %15s | %15s | %15s%n", "getCountry", cL.getCountry(), cLD.getCountry(), cLF.getCountry());
        System.out.printf("%20s | %15s | %15s | %15s%n", "getVariant", cL.getVariant(), cLD.getVariant(), cLF.getVariant());

        System.out.printf("%20s = %s%n", "user.country", System.getProperty("user.country"));
        System.out.printf("%20s = %s%n", "user.language", System.getProperty("user.language"));
        System.out.printf("%20s = %s%n", "user.variant", System.getProperty("user.variant"));

        System.out.println("==============");

        Calendar c = Calendar.getInstance();
        System.out.println("1st day of week / minimal days in 1st week : " + c.getFirstDayOfWeek() + " / " + c.getMinimalDaysInFirstWeek());

        System.out.println("==============");

        LocalDate date1 = LocalDate.of(2020, 12, 31);
        LocalDate date2 = LocalDate.of(2021, 1, 1);

        DateFormat df_date = new java.text.SimpleDateFormat("dd/MM/yyyy");
        DateFormat df_week = new java.text.SimpleDateFormat("YYYY-ww");

        System.out.printf("%20s | %10s | %10s%n", "", df_date.format(java.sql.Date.valueOf(date1)), df_date.format(java.sql.Date.valueOf(date2)));
        System.out.printf("%20s | %10s | %10s%n", "SimpleDateFormat", df_week.format(java.sql.Date.valueOf(date1)), df_week.format(java.sql.Date.valueOf(date2)));

        System.out.printf("%20s | %7d-%02d | %7d-%02d%n", "WeekFields",
                                        date1.get(WeekFields.ISO.weekBasedYear()), date1.get(WeekFields.ISO.weekOfWeekBasedYear()),
                                        date2.get(WeekFields.ISO.weekBasedYear()), date2.get(WeekFields.ISO.weekOfWeekBasedYear()));

    }
}

And the corresponding outputs:

>java TestDate
Azul Systems, Inc.
1.8.0_282
==============
Locale.getDefault(.) |                 |         DISPLAY |          FORMAT
  getDisplayLanguage |         English |         English |          French
   getDisplayCountry |   United States |   United States |          France
   getDisplayVariant |                 |                 |
         getLanguage |              en |              en |              fr
          getCountry |              US |              US |              FR
          getVariant |                 |                 |
        user.country = US
       user.language = en
        user.variant =
==============
1st day of week / minimal days in 1st week : 2 / 4
==============
                     | 31/12/2020 | 01/01/2021
    SimpleDateFormat |    2020-53 |    2020-53
          WeekFields |    2020-53 |    2020-53

>java -Duser.language=en -Duser.country=US -Duser.variant= TestDate
Azul Systems, Inc.
1.8.0_282
==============
Locale.getDefault(.) |                 |         DISPLAY |          FORMAT
  getDisplayLanguage |         English |         English |         English
   getDisplayCountry |   United States |   United States |   United States
   getDisplayVariant |                 |                 |
         getLanguage |              en |              en |              en
          getCountry |              US |              US |              US
          getVariant |                 |                 |
        user.country = US
       user.language = en
        user.variant =
==============
1st day of week / minimal days in 1st week : 1 / 1
==============
                     | 31/12/2020 | 01/01/2021
    SimpleDateFormat |    2021-01 |    2021-01
          WeekFields |    2020-53 |    2020-53
lennelei
  • I would suggest testing this on Windows using the Oracle JDK (the license permits testing without the need to purchase a Java SE subscription). I suspect this is an OpenJDK issue rather than being Zulu specific. (I work for Azul). – Speakjava Sep 23 '21 at 14:50
  • Don't waste your time and energy with the legacy Date-Time API. The `java.util` Date-Time API and its formatting API, `SimpleDateFormat`, are outdated and error-prone. It is recommended to stop using them completely and switch to the [modern Date-Time API](https://www.oracle.com/technical-resources/articles/java/jf14-Date-Time.html). – Arvind Kumar Avinash Sep 23 '21 at 15:05
  • @Speakjava : I made the same test with Oracle JDK and got the same results unfortunately. – lennelei Sep 24 '21 at 06:59
  • @ArvindKumarAvinash : I don't have a choice. I'm using Talend ETL and it uses SimpleDateFormat for internal purposes. I need to understand how to configure it correctly so that I can use it safely. – lennelei Sep 24 '21 at 07:01
  • The issue seems to be with the instantiation of the `Calendar`. It uses a default Locale. If you want to use a specific one, you'll need to use `Calendar c = Calendar.getInstance(Locale.US);` Likewise with `new java.text.SimpleDateFormat("YYYY-ww", Locale.US);` I think you cannot expect these classes to correctly obtain the information from the Operating System. – Scratte Sep 24 '21 at 09:13
  • @Scratte: good point! I had a look at the `SimpleDateFormat` source code and it uses `Locale.getDefault(Locale.Category.FORMAT)` as its default Locale. – lennelei Sep 24 '21 at 15:55
  • This returns `fr_FR` in the first run and `en_US` in the second. Is there any way to force this with JVM parameters? Thanks! – lennelei Sep 24 '21 at 16:04
  • The only way to force it (that I'm aware of) is to call the JVM with it. Which is what you're doing with `-Duser.language=en -Duser.country=US` :) – Scratte Sep 24 '21 at 16:08
  • @lennelei You have provided much interesting and helpful additional information. Thanks for doing that. You have provided some of it in comments where it’s harder to find again and where many readers won’t notice it at all. Please instead edit your question and paste everything in there (when information is provided in response to someone’s comment, also at-sign-tag that user in a comment and notify them that you have edited). – Ole V.V. Sep 25 '21 at 07:51

1 Answer


I have not understood how the implementation by Talend ETL can be any of your business. If they have not yet found the opportunity for upgrading to java.time, the modern Java date and time API, it’s their problem, not yours. You should not use SimpleDateFormat nor Calendar in your own code.

Java has got 3 default locales

Java hasn’t just got one, it’s got three default locales, partly for historical reasons. They can be set individually. To demonstrate:

    Locale.setDefault(Locale.FRANCE);
    Locale.setDefault(Locale.Category.DISPLAY, Locale.JAPAN);
    Locale.setDefault(Locale.Category.FORMAT, Locale.GERMANY);
    
    System.out.println(Locale.getDefault());
    System.out.println(Locale.getDefault(Locale.Category.DISPLAY));
    System.out.println(Locale.getDefault(Locale.Category.FORMAT));

Output from this snippet is:

fr_FR
ja_JP
de_DE

The output reflects in order France, Japan and Germany (deutsch/Deutschland).

Your comment states that the code of SimpleDateFormat uses the default FORMAT locale as its default locale (so Germany in my example). That is, the locale that it uses when you don’t specify one (you shouldn’t use SimpleDateFormat; if you do nevertheless, you should always specify the locale explicitly).
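As an illustration of specifying the locale explicitly, here is a minimal sketch using the two-argument SimpleDateFormat constructor with the pattern from the question. The class name ExplicitLocaleFormat and the helper weekOf are hypothetical names for this example only. With the JDK's bundled locale data, the French formatter should give 2020-53 and the US one 2021-01 for 31/12/2020, whatever the default locales are.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

// Hypothetical helper class, only for illustration.
class ExplicitLocaleFormat {

    // Formats week-year and week number using the week rules of the given locale,
    // independently of any default locale of the JVM.
    static String weekOf(Date date, Locale locale) {
        return new SimpleDateFormat("YYYY-ww", locale).format(date);
    }

    public static void main(String[] args) {
        // Midnight of 31 December 2020 in the JVM's default time zone,
        // which is also the time zone SimpleDateFormat uses by default.
        Date date = new GregorianCalendar(2020, Calendar.DECEMBER, 31).getTime();

        System.out.println("fr_FR: " + weekOf(date, Locale.FRANCE));
        System.out.println("en_US: " + weekOf(date, Locale.US));
    }
}
```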

As I said, the three can be set individually. The one-arg Locale.setDefault() sets all three, though.

Does this observation explain? On my Java 11 it seems that setting the locale on the command line sets all three default locales (until altered by Locale.setDefault()). I tried just

    System.out.println(Locale.getDefault());
    System.out.println(Locale.getDefault(Locale.Category.DISPLAY));
    System.out.println(Locale.getDefault(Locale.Category.FORMAT));

I ran this snippet with -Duser.language=en -Duser.country=US on the command line, and the output was:

en_US
en_US
en_US

Also other language and country settings came through in all three locales. So no, this alone doesn’t explain why your SimpleDateFormat in one case did not seem to pick up the locale from the command line.

Does this observation provide a solution?

I still haven’t understood what your real end goal is. The first recommendation is: Your code should not rely on the default locale of the JVM. Use explicit locale in your locale sensitive operations.
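For code you control, a minimal java.time sketch of an explicit-locale week calculation could look like the following. The class name ExplicitWeekFields and the helper weekOf are hypothetical. WeekFields.ISO never depends on a locale, which is why the WeekFields row in the question is 2020-53 in every run, while WeekFields.of(Locale) applies the week rules of the locale you pass in.

```java
import java.time.LocalDate;
import java.time.temporal.WeekFields;
import java.util.Locale;

// Hypothetical helper class, only for illustration.
class ExplicitWeekFields {

    // Returns "week-based-year-week number" for the date under the given week definition.
    static String weekOf(LocalDate date, WeekFields weekFields) {
        return String.format(Locale.ROOT, "%04d-%02d",
                date.get(weekFields.weekBasedYear()),
                date.get(weekFields.weekOfWeekBasedYear()));
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2020, 12, 31);

        // ISO-8601 rules: Monday first, 4 minimal days, no locale involved.
        System.out.println("ISO  : " + weekOf(date, WeekFields.ISO));
        // Week rules taken from an explicitly named locale.
        System.out.println("fr_FR: " + weekOf(date, WeekFields.of(Locale.FRANCE)));
        System.out.println("en_US: " + weekOf(date, WeekFields.of(Locale.US)));
    }
}
```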

If you do need to set the default FORMAT locale for Talend ETL to work the way you require it to, Locale.setDefault(Locale.Category.FORMAT, Locale.US); should do it.
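A minimal sketch of that effect, assuming nothing else in the JVM changes the defaults at the same time: it switches only the default FORMAT locale, formats with a SimpleDateFormat created without an explicit locale, and then restores the previous value. The class name DefaultFormatLocale and the helper weekWithFormatDefault are hypothetical names for this illustration.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

// Hypothetical helper class, only for illustration.
class DefaultFormatLocale {

    // Temporarily sets the default FORMAT locale, formats the date with a
    // SimpleDateFormat that was given no locale, then restores the old default.
    static String weekWithFormatDefault(Locale formatLocale, Date date) {
        Locale previous = Locale.getDefault(Locale.Category.FORMAT);
        try {
            Locale.setDefault(Locale.Category.FORMAT, formatLocale);
            return new SimpleDateFormat("YYYY-ww").format(date);
        } finally {
            Locale.setDefault(Locale.Category.FORMAT, previous);
        }
    }

    public static void main(String[] args) {
        Date date = new GregorianCalendar(2020, Calendar.DECEMBER, 31).getTime();

        // Locale.getDefault() is not touched by the category-specific setter.
        System.out.println("Locale.getDefault() = " + Locale.getDefault());
        System.out.println("FORMAT = fr_FR -> " + weekWithFormatDefault(Locale.FRANCE, date));
        System.out.println("FORMAT = en_US -> " + weekWithFormatDefault(Locale.US, date));
        System.out.println("Locale.getDefault() = " + Locale.getDefault());
    }
}
```

Note that the two-argument Locale.setDefault changes only the named category, so printing Locale.getDefault() alone would not reveal the switch.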

Link

Related question: Which "default Locale" is which?

MC Emperor
Ole V.V.
  • Actually, all three Locales are correctly forced by the JVM parameters. My real goal was to understand what the correct settings were to force the Locale values; my test code was wrong as I only looked at `Locale.getDefault()`, which is not the one used by `SimpleDateFormat`. – lennelei Sep 27 '21 at 08:16
  • Talend ETL provides internal tools for date calculation and for displaying dates; as those tools use `SimpleDateFormat`, I had to find a way to force the Locale in order to have consistent results. – lennelei Sep 27 '21 at 08:20