I tried to parse the same string with both SimpleDateFormat and DateTimeFormatter:

val text = "Tue, 07 Mar 2023 15:32:23 +0800"
val pattern = "EEE, dd MMM yyyy HH:mm:ss zzz"
val date = SimpleDateFormat(pattern, Locale.ENGLISH).parse(text)
println(date)
val dateTime = LocalDateTime.parse(text, DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH))
println(dateTime)

SimpleDateFormat parses the string, but DateTimeFormatter throws an exception:

java.time.format.DateTimeParseException:
Text 'Tue, 07 Mar 2023 15:32:23 +0800' could not be parsed at index 26

I know the string can be parsed successfully with the built-in DateTimeFormatter.RFC_1123_DATE_TIME, but I want to know what is wrong with this pattern. Why does DateTimeFormatter fail?

I found a related question, but it is not the same as mine: its time zone is GMT, while mine is +0800.

deHaar
SageJustus
  • Try using `x` instead of `z` for the time zone (offset) – MadProgrammer Mar 07 '23 at 08:45
  • Yes, I agree, you should use `DateTimeFormatter.RFC_1123_DATE_TIME`. And you should never, ever use `SimpleDateFormat`. And yes, I understand your curiosity. I think the main point is in the answer, so there is no reason for me to repeat it. I may add that one of the many problems with `SimpleDateFormat` was that it was overly tolerant of incorrect input, so I am not surprised when you tell me it parsed your string with the given format string. That tolerance makes proper validation all but impossible. – Ole V.V. Mar 07 '23 at 18:58
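
For reference, the built-in formatter mentioned in the question and in the comment above can be used as in the minimal sketch below. The helper name parseRfc1123 is made up for illustration, and OffsetDateTime is chosen here only so that the offset from the input is kept.

```kotlin
import java.time.OffsetDateTime
import java.time.format.DateTimeFormatter

// Hypothetical helper: parses an RFC 1123 style string with the built-in formatter.
fun parseRfc1123(text: String): OffsetDateTime =
    OffsetDateTime.parse(text, DateTimeFormatter.RFC_1123_DATE_TIME)

fun main() {
    val parsed = parseRfc1123("Tue, 07 Mar 2023 15:32:23 +0800")
    println(parsed)             // 2023-03-07T15:32:23+08:00
    println(parsed.toInstant()) // 2023-03-07T07:32:23Z
}
```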

1 Answer

+0800 is not a time zone, it is an offset from UTC.

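A DateTimeFormatter distinguishes between a zone (ZoneId) and an offset (ZoneOffset) and supports different pattern characters for each. The letter z stands for a time-zone name, while your text ends with a numeric offset, and that is why it refuses to parse your text.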

I would parse a String that contains an offset from UTC to an OffsetDateTime instead of a LocalDateTime, because the latter would strip off the information about the offset and simply keep a date and a time of day. But that obviously depends on the specific requirements…

That means you will have to introduce a different pattern if you want to parse this input String with a DateTimeFormatter.

You could use xxxx instead of zzz in that pattern (the rest stays the same, but better use uuuu instead of yyyy).

import java.text.SimpleDateFormat
import java.time.LocalDateTime
import java.time.OffsetDateTime
import java.time.format.DateTimeFormatter
import java.util.Locale

fun main() {
    // input String
    val text = "Tue, 07 Mar 2023 15:32:23 +0800"
    // pattern for the SimpleDateFormat
    val patternSdf = "EEE, dd MMM yyyy HH:mm:ss zzz"
    // pattern for the DateTimeFormatter
    val patternDtf = "EEE, dd MMM yyyy HH:mm:ss xxxx"
    // get a java.util.Date by parsing the String with the SimpleDateFormat
    val date = SimpleDateFormat(patternSdf, Locale.ENGLISH).parse(text)
    // print the Date (rendered in the JVM's default time zone)
    println(date)
    // get a LocalDateTime by parsing the String with the DateTimeFormatter
    val dateTime = LocalDateTime.parse(
        text,
        DateTimeFormatter.ofPattern(patternDtf, Locale.ENGLISH)
    )
    // print it (the offset is dropped)
    println(dateTime)
    // get an OffsetDateTime using the same approach
    val offsetDateTime = OffsetDateTime.parse(
        text,
        DateTimeFormatter.ofPattern(patternDtf, Locale.ENGLISH)
    )
    // print it (the offset is kept)
    println(offsetDateTime)
}

Output:

Tue Mar 07 07:32:23 UTC 2023
2023-03-07T15:32:23
2023-03-07T15:32:23+08:00

From JavaDocs of DateTimeFormatter:

Offset X and x:
This formats the offset based on the number of pattern letters. One letter outputs just the hour, such as '+01', unless the minute is non-zero in which case the minute is also output, such as '+0130'. Two letters outputs the hour and minute, without a colon, such as '+0130'. Three letters outputs the hour and minute, with a colon, such as '+01:30'. Four letters outputs the hour and minute and optional second, without a colon, such as '+013015'. Five letters outputs the hour and minute and optional second, with a colon, such as '+01:30:15'. Six or more letters throws IllegalArgumentException. Pattern letter 'X' (upper case) will output 'Z' when the offset to be output would be zero, whereas pattern letter 'x' (lower case) will output '+00', '+0000', or '+00:00'.

Offset Z:
This formats the offset based on the number of pattern letters. One, two or three letters outputs the hour and minute, without a colon, such as '+0130'. The output will be '+0000' when the offset is zero. Four letters outputs the full form of localized offset, equivalent to four letters of Offset-O. The output will be the corresponding localized offset text if the offset is zero. Five letters outputs the hour, minute, with optional second if non-zero, with colon. It outputs 'Z' if the offset is zero. Six or more letters throws IllegalArgumentException.
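
To see the quoted rules in action, the sketch below formats a fixed offset with several pattern letter counts. The helper formatOffset and the sample date-time are assumptions made purely for illustration; the results in the comments follow from the JavaDoc text above.

```kotlin
import java.time.OffsetDateTime
import java.time.ZoneOffset
import java.time.format.DateTimeFormatter
import java.util.Locale

// Hypothetical helper: formats only the offset part of a sample date-time.
fun formatOffset(offset: ZoneOffset, pattern: String): String {
    val sample = OffsetDateTime.of(2023, 3, 7, 15, 32, 23, 0, offset)
    return DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH).format(sample)
}

fun main() {
    val plus8 = ZoneOffset.ofHours(8)
    println(formatOffset(plus8, "x"))    // +08
    println(formatOffset(plus8, "xx"))   // +0800
    println(formatOffset(plus8, "xxx"))  // +08:00
    println(formatOffset(plus8, "xxxx")) // +0800
    println(formatOffset(plus8, "Z"))    // +0800
    // For a zero offset, upper-case X prints 'Z', lower-case x prints digits.
    println(formatOffset(ZoneOffset.UTC, "X"))  // Z
    println(formatOffset(ZoneOffset.UTC, "xx")) // +0000
}
```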

deHaar
  • I read your answer and checked the documentation and did find that `z` does not apply to `+0800`. But I found that in the documentation `Z` and `x` have the same example `+0000`, but in my case `Z` doesn't apply. Can you tell me the difference between `offset-x` and `offset-Z`? – SageJustus Mar 07 '23 at 10:16
  • You can change it to `val patternDtf = "EEE, dd MMM yyyy HH:mm:ss Z"` and parsing will work. It depends on the number of pattern letters (`ZZZZ` will not work). See my edit for the explanation from the JavaDocs. – deHaar Mar 07 '23 at 10:34
  • Using a `Z` works fine. I've read the documentation before, I just didn't really understand what it meant, I don't use `DateTimeFormatter` very often. In my case, there doesn't seem to be any functional difference between `Z` and `x`, maybe just a difference in the way they are written? – SageJustus Mar 07 '23 at 14:13
  • The difference is the range of offset formats they can parse: a single `x` can parse offsets without a colon and without seconds, like `+0800` and `+08`, but a single `Z` can only parse `+0800`. Their count matters too, e.g. `xx` seems to be equivalent to `Z`, parsing only `+0800`, while `xxx` can only parse `+08:00`, and `xxxx` handles `+0800` and `+080015`, but not `+08`… There's also `z`, which can handle the ones with the colons. – deHaar Mar 07 '23 at 15:01
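
The observations in the last comment can be checked with a small probe that tries to parse a bare offset using a pattern made only of offset letters. parsesOffset is a hypothetical helper, not part of java.time, and the sketch only reports whether the formatter accepts the text with its default settings.

```kotlin
import java.time.format.DateTimeFormatter
import java.time.format.DateTimeParseException

// Hypothetical helper: true if the pattern accepts the whole text, false otherwise.
fun parsesOffset(pattern: String, text: String): Boolean =
    try {
        DateTimeFormatter.ofPattern(pattern).parse(text)
        true
    } catch (e: DateTimeParseException) {
        false
    }

fun main() {
    println(parsesOffset("x", "+08"))        // true
    println(parsesOffset("x", "+0800"))      // true
    println(parsesOffset("Z", "+0800"))      // true
    println(parsesOffset("Z", "+08"))        // false
    println(parsesOffset("xxx", "+08:00"))   // true
    println(parsesOffset("xxx", "+0800"))    // false
    println(parsesOffset("xxxx", "+080015")) // true
    println(parsesOffset("xxxx", "+08"))     // false
    println(parsesOffset("zzz", "+0800"))    // false, as in the question
}
```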