DateTimeFormatter month pattern letter "L" fails

Question

I noticed that java.time.format.DateTimeFormatter does not parse month text the way I expected. See below:

import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class Play {
  public static void tryParse(String d,String f) {
    try { 
      LocalDate.parse(d, DateTimeFormatter.ofPattern(f)); 
      System.out.println("Pass");
    } catch (Exception x) {System.out.println("Fail");}
  }
  public static void main(String[] args) {
    tryParse("26-may-2015","dd-L-yyyy");
    tryParse("26-May-2015","dd-L-yyyy");
    tryParse("26-may-2015","dd-LLL-yyyy");
    tryParse("26-May-2015","dd-LLL-yyyy");
    tryParse("26-may-2015","dd-M-yyyy");
    tryParse("26-May-2015","dd-M-yyyy");
    tryParse("26-may-2015","dd-MMM-yyyy");
    tryParse("26-May-2015","dd-MMM-yyyy");
  }
}

Only the last attempt, tryParse("26-May-2015","dd-MMM-yyyy"), will "Pass". According to the documentation, LLL should be able to parse the textual form. Also note the subtle difference between the uppercase 'M' and the lowercase 'm' in the input strings.

This is really annoying, because I cannot parse strings in the default format produced by Oracle DB:

SELECT TO_CHAR(SYSDATE,'DD-MON-YYYY') AS dt FROM DUAL;
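One way to get the lowercase (or all-uppercase) input to parse is to build the formatter with DateTimeFormatterBuilder.parseCaseInsensitive() instead of DateTimeFormatter.ofPattern. The following is a minimal sketch, assuming English month abbreviations are wanted; the class name LenientMonthParse is made up for illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.util.Locale;

public class LenientMonthParse {
  // Same "dd-MMM-yyyy" pattern as in the question, but case-insensitive
  // and pinned to English so the month text does not depend on the default locale.
  static final DateTimeFormatter FORMAT = new DateTimeFormatterBuilder()
      .parseCaseInsensitive()
      .appendPattern("dd-MMM-yyyy")
      .toFormatter(Locale.ENGLISH);

  static LocalDate parse(String text) {
    return LocalDate.parse(text, FORMAT);
  }

  public static void main(String[] args) {
    System.out.println(parse("26-may-2015"));
    System.out.println(parse("26-MAY-2015"));
    System.out.println(parse("26-May-2015"));
  }
}
```

This only changes how the 'M' pattern matches text; it does not alter what the 'L' pattern does.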

Similarly, for the following program:

import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class Play {
  public static void output(String f) {
    LocalDate d = LocalDate.now();
    Locale l = Locale.US;
    // Locale l = Locale.forLanguageTag("ru");
    System.out.println(d.format(DateTimeFormatter.ofPattern(f,l)));
  }
  public static void main(String[] args) {
    output("dd-L-yyyy");
    output("dd-LLL-yyyy");
    output("dd-M-yyyy");
    output("dd-MMM-yyyy");
  }
}

I get below output:

28-5-2015
28-5-2015
28-5-2015
28-May-2015

Clearly the L format specifier doesn't produce anything textual; it seems numeric to me ...

However, if I change the Locale to Locale.forLanguageTag("ru"), I get the following output:

28-5-2015
28-Май-2015
28-5-2015
28-мая-2015

All really interesting, wouldn't you agree?

The questions I have are:

Is it reasonable for me to expect that each of these should work?
Should we at least submit some of these as a bug?
Do I misunderstand the usage of the L pattern specifier?

Quoting the part of the documentation that I perceived as relevant:

Text: The text style is determined based on the number of pattern letters used. Less than 4 pattern letters will use the short form. Exactly 4 pattern letters will use the full form. Exactly 5 pattern letters will use the narrow form. Pattern letters 'L', 'c', and 'q' specify the stand-alone form of the text styles.

Number: If the count of letters is one, then the value is output using the minimum number of digits and without padding. Otherwise, the count of digits is used as the width of the output field, with the value zero-padded as necessary. The following pattern letters have constraints on the count of letters. Only one letter of 'c' and 'F' can be specified. Up to two letters of 'd', 'H', 'h', 'K', 'k', 'm', and 's' can be specified. Up to three letters of 'D' can be specified.

Number/Text: If the count of pattern letters is 3 or greater, use the Text rules above. Otherwise use the Number rules above.

UPDATE

I have made two submissions to Oracle:

Request for Bugfix for the LLL (Long Form Text) issue: JDK-8114833 (original oracle Review ID: JI-9021661)
Request for enhancement for the lowercase month parsing issue: Review ID: 0 (is that also a bug??)

From my (limited) testing `L` stands for `5` or `05` (for May), whereas `M` can stand for `5` (M) or `05` (MM) or `May` (MMM). I think the `DateTimeFormatter` is being very strict in its parsing. Is that a bug or is that the way it was designed? Hard to say right now, but I would say it's a design choice — MadProgrammer, May 29 '15 at 00:24
@MadProgrammer The documentation states: "Pattern letters 'L', 'c', and 'q' specify the stand-alone form of the text styles". — YoYo, May 29 '15 at 00:27
Sure, but from your test and my testing, `L` is for numbers, but `M`, based on how many you have can mean both numbers and text, try `System.out.println(DateTimeFormatter.ofPattern("dd-LLL-yyyy").format(LocalDate.now()));` and see ;) — MadProgrammer, May 29 '15 at 00:30
In the JavaDoc: M and L are handled with presentation "number/text". Both letters are even noted as "M/L". So the pattern LLL must be textual like MMM (an abbreviation) not numerical hence the observed behaviour is a bug. About the patterns with single letters M and L, the numerical output is okay. — Meno Hochschild, May 29 '15 at 09:56
Surprising for me: Even the low-level-approach using `new DateTimeFormatterBuilder().appendText(ChronoField.MONTH_OF_YEAR, TextStyle.SHORT_STANDALONE).toFormatter(Locale.US)` still produces a number not a text (in English "3" instead of "Mar" for March). So I fear no workaround within the context of JSR-310 (java.time-package) is available for you, only with an external library. I am excited to know if Oracle qualifies this as a bug or as a feature or wait for long time until the bug becomes a feature. — Meno Hochschild, May 29 '15 at 10:32
By the way, may I ask why you use Oracle DB's formatting capabilities? Why not just use a locale-neutral numerical form in Oracle? And what is your current workaround? — Meno Hochschild, May 29 '15 at 19:38
@MenoHochschild submitted 2 items with Oracle. I have a layer of Unix between my DB and my Java programs, hence the dependence on textual representations rather than internal binary date formats. My workaround was just to agree on a format that did not depend on the textual representation of a month; I chose dd-MM-yyyy. — YoYo, Jun 16 '15 at 03:19
@JoD. Or you choose an ISO-8601 format like yyyy-MM-dd, which is designed exactly for technical data exchange. Extra advantage: its lexicographical order is also the chronological one. Hm, I cannot find JI-9021661. Do you have a public link to this report? — Meno Hochschild, Jun 16 '15 at 11:31
@MenoHochschild This looks to be like a temporary ID as it waits internal review. I just got a reply from an Oracle engineer for more information, which gave me a new reference # Incident Report 9059262. No links. ISO-8601 was my first choice, but wasn't supported by a 3rd party tool also in use. — YoYo, Jun 16 '15 at 16:03
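The ISO-8601 suggestion from the comments above can be sketched as follows: dates are written with DateTimeFormatter.ISO_LOCAL_DATE, read back with LocalDate.parse, and the resulting strings sort chronologically as plain text. This is only an illustration; the class and method names are hypothetical, and the ordering property is assumed to be needed only for four-digit years.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class IsoExchange {
  // Writes a date as yyyy-MM-dd, with no month names involved.
  static String toIso(LocalDate date) {
    return date.format(DateTimeFormatter.ISO_LOCAL_DATE);
  }

  // LocalDate.parse(CharSequence) uses ISO_LOCAL_DATE by default.
  static LocalDate fromIso(String text) {
    return LocalDate.parse(text);
  }

  // Plain string sorting; for four-digit years this is also date order.
  static List<String> sortedCopy(List<String> isoDates) {
    List<String> copy = new ArrayList<>(isoDates);
    Collections.sort(copy);
    return copy;
  }

  public static void main(String[] args) {
    System.out.println(toIso(LocalDate.of(2015, 5, 26)));
    System.out.println(fromIso("2015-05-26").getMonth());
    System.out.println(sortedCopy(Arrays.asList("2015-12-01", "2015-05-26", "2014-01-31")));
  }
}
```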

score 27 · Accepted Answer · edited Jan 23 '18 at 22:27

“stand-alone” month name

I believe 'L' is meant for languages that use a different word for the month itself versus the way it is used in a date. For example:

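Locale russian = Locale.forLanguageTag("ru");

// Assumes static imports of java.util.Arrays.asList and
// java.time.format.DateTimeFormatter.ofPattern.
asList("MMMM", "LLLL").forEach(ptrn -> 
    System.out.println(ptrn + ": " + ofPattern(ptrn, russian).format(Month.MARCH))
);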

Output:

MMMM: марта
LLLL: Март

There shouldn't be any reason to use 'L' instead of 'M' when parsing a date.

I tried the following to see which locales support stand-alone month name formatting:

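// Assumes static imports of the java.util.stream.Collectors methods used here
// and of java.time.format.TextStyle.FULL_STANDALONE.
// Key true: languages whose stand-alone name for March comes back as "3";
// key false: languages that get a real stand-alone month name.
Arrays.stream(Locale.getAvailableLocales())
    .collect(partitioningBy(
                loc -> "3".equals(Month.MARCH.getDisplayName(FULL_STANDALONE, loc)),
                mapping(Locale::getDisplayLanguage, toCollection(TreeSet::new))
    )).entrySet().forEach(System.out::println);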

The following languages get a locale-specific stand-alone month name from 'LLLL':

Catalan, Chinese, Croatian, Czech, Finnish, Greek, Hungarian, Italian, Lithuanian, Norwegian, Polish, Romanian, Russian, Slovak, Turkish, Ukrainian

All other languages get "3" as a stand-alone name for March.
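If a stand-alone name is still wanted on a runtime that behaves as described above, one option is to fall back to the regular full name whenever the stand-alone lookup returns only digits. The following is a minimal sketch: the class and method names are hypothetical, and the all-digits check is an assumption based on the "3" result shown above rather than on documented behaviour.

```java
import java.time.Month;
import java.time.format.TextStyle;
import java.util.Locale;

public class StandaloneMonth {
  // Returns the stand-alone month name, or the regular full name when the
  // stand-alone lookup yields only digits (assumed to mean "no data").
  static String name(Month month, Locale locale) {
    String standalone = month.getDisplayName(TextStyle.FULL_STANDALONE, locale);
    boolean numeric = standalone.chars().allMatch(Character::isDigit);
    return numeric ? month.getDisplayName(TextStyle.FULL, locale) : standalone;
  }

  public static void main(String[] args) {
    System.out.println(name(Month.MARCH, Locale.ENGLISH));
    System.out.println(name(Month.MARCH, Locale.forLanguageTag("ru")));
  }
}
```

For languages with a real stand-alone form the helper returns it unchanged, so the Russian case keeps the behaviour shown earlier.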

edited Jan 23 '18 at 22:27

Basil Bourque


answered May 29 '15 at 00:57

Misha


Your output for марта - is that considered 4 or 5 characters in Russian? Why is it that MMMM would output 5 vs LLLL 4 ... ? – YoYo May 29 '15 at 01:05
From DateTimeFormatter javadoc: "Exactly 4 pattern letters will use the full form". For example, `.ofPattern("MMMM").format(Month.DECEMBER)` will produce "December" – Misha May 29 '15 at 01:07
Ok ... I have output todays date (month may) in the "ru" locale, I get really interesting results ... updated the Question. – YoYo May 29 '15 at 01:21
The result is as expected. "28-мая-2015" is "28-of may-2015" . "28-Май-2015" means "28-the month may-2015". 'M' should be used to format the month as a part of a date. 'L' should be used to format the month by itself. Although since 'L' fails for English and a number of other languages I tried, perhaps 'L' shouldn't be used at all. – Misha May 29 '15 at 01:25
@Misha What do you mean with *'L' fails for English and a number of other languages*? This sounds like a bug which could be filed... – Puce May 29 '15 at 07:39
As the question shows, "LLLL" fails to produce full name of the month in English but rather gives you a number. So if you were making a calendar application with a month view and tried to use "LLLL" to get locale-specific stand-alone name of the month, it would correctly give you the name in Russian, but would just give you a number in English. – Misha May 29 '15 at 07:56
@Misha My upvote for your observations. I have also tested it with the low-level builder (see my comment above) and got the same result, so the root cause must be far down the stack, not just in the layer where the pattern is translated to builder calls. By the way, your list of languages supporting stand-alone mode is probably not complete. At least in the CLDR data (although not necessarily in the JDK resources), Spanish ("es") also has special stand-alone forms (using different capitalization of month names). – Meno Hochschild May 29 '15 at 10:52
@MenoHochschild Right, that experiment was just to see what jdk does, not to determine which languages are supposed to have a standalone form for the month. – Misha May 29 '15 at 11:18
@Misha but even if there is no 'standalone' form (finally I figured what you mean with that), it should revert back to the Long MMM or MMMM form, not to the numeric form, wouldn't you agree? – YoYo May 29 '15 at 18:49

neuronaut · Answer 2 · 2015-05-29T00:52:57.167

8

According to the javadocs:

Pattern letters 'L', 'c', and 'q' specify the stand-alone form of the text styles.

However, I couldn't find much about what the "stand-alone" form is supposed to be. In looking at the code I see that using 'L' selects TextStyle.SHORT_STANDALONE and according to that javadoc:

Short text for stand-alone use, typically an abbreviation. For example, day-of-week Monday might output "Mon".

However, that isn't how it seems to work. Even with three letters I get numerical output from this code:

DateTimeFormatter pattern = DateTimeFormatter.ofPattern ("dd-LLL-yyyy");
System.out.println (pattern.format (LocalDate.now ()));

Edit

After further investigation it seems (as near as I can tell) that the "stand-alone" versions of these codes are for when you want to load your own locale-independent data, presumably using DateTimeFormatterBuilder. As such, by default DateTimeFormatter has no entries loaded for TextStyle.SHORT_STANDALONE.
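The idea of supplying your own month text can be sketched with DateTimeFormatterBuilder.appendText(TemporalField, Map<Long,String>), which takes an explicit value-to-text map instead of relying on locale data. This is only an illustration under assumptions: the class name CustomMonthText and the English abbreviations are made up for the example, and it does not involve the 'L' pattern letter at all.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.HashMap;
import java.util.Map;

public class CustomMonthText {
  static final DateTimeFormatter FORMAT = build();

  static DateTimeFormatter build() {
    // Hand-written month text, independent of any locale data.
    String[] names = {"Jan", "Feb", "Mar", "Apr", "May", "Jun",
                      "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"};
    Map<Long, String> months = new HashMap<>();
    for (int i = 0; i < names.length; i++) {
      months.put((long) (i + 1), names[i]);
    }
    return new DateTimeFormatterBuilder()
        .appendValue(ChronoField.DAY_OF_MONTH, 2)
        .appendLiteral('-')
        .appendText(ChronoField.MONTH_OF_YEAR, months)
        .appendLiteral('-')
        .appendValue(ChronoField.YEAR, 4)
        .toFormatter();
  }

  public static void main(String[] args) {
    System.out.println(FORMAT.format(LocalDate.of(2015, 5, 26)));
    System.out.println(LocalDate.parse("26-May-2015", FORMAT));
  }
}
```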

edited May 29 '15 at 00:52

answered May 29 '15 at 00:30

neuronaut


Thanks, I have updated my question with some of your feedback. I just want to slowly get a complete documentation of the issue so I can easily submit it as a bug (dont feel bad I am bluntly copying it plse). – YoYo May 29 '15 at 00:43
@JoD. I've updated my answer with a little extra info. Hopefully it helps a bit, but it seems that you may just want to rely on the non-stand-alone codes unless you have need of locale-independence and are prepared to do the extra work needed to get things working the way you want. – neuronaut May 29 '15 at 00:54
Not sure what that all means (stand-alone etc). However - I tried to force locale-independence by hardcoding to Locale.US ... still same result. – YoYo May 29 '15 at 00:57
@JoD. picking a particular locale is the very opposite of locale-independence -- you still end up loading the data for the selected locale (US in your case) which won't have the stand-alone entries. The point is that if you don't provide them manually you won't get them. – neuronaut May 29 '15 at 17:39

score 1 · Answer 3 · answered Jun 02 '19 at 08:17

While the other answers give excellent information on pattern letter L and date parsing, I should like to add that you should really avoid the problem altogether. Don’t get date (and time) as string from your database. Instead use an appropriate datetime object.

    String sql = "select sysdate as dt from dual";
    PreparedStatement stmt = yourDatabaseConnection.prepareStatement(sql);
    ResultSet rs = stmt.executeQuery();
    if (rs.next()) {
        LocalDateTime dateTime = rs.getObject("dt", LocalDateTime.class);
        // do something with dateTime
    }

(Not tested since I haven’t got an Oracle database at hand. Please forgive any typo.)

answered Jun 02 '19 at 08:17

Ole V.V.


The Java code does not access the database directly: the dates are transferred either as data in files, as parameters through environment variables, or as command line parameters. So although this is generally good advice, it is not always possible because of an extra layer of separation (another tool in between). – YoYo Jun 02 '19 at 13:58
That I can understand, @YoYo. That layer in between ought not pass date and time as a string either. Sometimes such things are outside our control, I know. Thanks for your comment. – Ole V.V. Jun 02 '19 at 14:39
