
I have an XML SOAP response with date/time fields, which are represented as follows:

<BusStopTime>
    <BusStopId>1023</BusStopId>
    <Order>1</Order>
    <PassingTime>1899-12-30T07:20:00</PassingTime>
</BusStopTime>

I'm not interested in the date (it is a legacy representation that I have no control over), only in the time. The WS tooling transforms the field into an XMLGregorianCalendar, and I want to convert that to a java.time type:

Instant date = DatatypeFactory.newInstance()
    .newXMLGregorianCalendar("1899-12-30T07:20:00")
    .toGregorianCalendar().toInstant();

Converting to LocalDateTime is simple. I'm setting the time zone explicitly to avoid any locale confusion:

LocalDateTime.ofInstant(date, ZoneId.of("Europe/Warsaw"))

which results in 1899-12-30T07:44

LocalDateTime.ofInstant(date, ZoneId.of("Europe/Berlin"))

gives me a different output 1899-12-30T07:20

For dates in the modern era (1900 and later), everything works fine. So the question is: what exactly happened between Berlin and Warsaw at the end of the 19th century? Or, to put it more clearly: why is the time shifted by such an odd amount?

I'm running this on both JDK 8 and JDK 11 and observe the same behavior:

{ ~ }  » java -version
openjdk version "11.0.1" 2018-10-16
OpenJDK Runtime Environment 18.9 (build 11.0.1+13)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.1+13, mixed mode)

java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
Ole V.V.
Jakub Marchwicki
  • What is your concern? – AsthaUndefined Jan 23 '19 at 11:33
  • Warsaw was at offset +01:24 from GMT all the time up to 1915 ([source](https://www.timeanddate.com/time/zone/poland/warsaw)). Berlin was at +01:00 from 1894 to 1915 ([source](https://www.timeanddate.com/time/zone/germany/berlin)). This seems to be part of the explanation. But nothing special happened at the turn of the century in either place, so it doesn’t fully explain it. – Ole V.V. Jan 23 '19 at 11:54
  • When I run your code, I get exactly the same difference for dates after 1900. For example `1914-12-30T07:20:00` too becomes 7:44 in Warsaw and 7:20 in Berlin. – Ole V.V. Jan 23 '19 at 12:53
  • When I try to create an `XMLGregorianCalendar` from your example string, `1899-12-30T07:20:00`, in different time zones, I do get a couple of surprising results, though. If you can get the string out of the XML, use `LocalDateTime.parse("1899-12-30T07:20:00")` for a predictable result. – Ole V.V. Jan 23 '19 at 13:02
  • Thank you Ole! I was thinking about time zones but couldn't find any reference. Your sources are great. I didn't have problems with the conversion: `LocalDateTime.parse(xmlGregorianCalendar.toXMLFormat())` worked like a charm. But I was curious about the algorithm and the rules themselves. If you add this as an answer, I will happily accept it. – Jakub Marchwicki Jan 23 '19 at 14:16
  • Somehow related: [A bug in Java XMLGregorianCalendar conversion to a Java util.Date?](https://stackoverflow.com/questions/4282650/a-bug-in-java-xmlgregoriancalendar-conversion-to-a-java-util-date) – Ole V.V. Jan 23 '19 at 19:19

1 Answer


LocalDateTime.parse()

If you can get the string out of the XML without too much trouble, for a predictable result use

    LocalDateTime.parse("1899-12-30T07:20:00")

Edit: Without direct access to the string I suggest that the solution is to set an offset from GMT/UTC on your XMLGregorianCalendar to avoid any dependency on the JVM’s default time zone:

    XMLGregorianCalendar xgc = DatatypeFactory.newInstance()
            .newXMLGregorianCalendar("1899-12-30T07:20:00");
    xgc.setTimezone(0);
    LocalTime time = xgc.toGregorianCalendar()
            .toZonedDateTime()
            .toLocalTime();
    System.out.println(time);

Since the so-called “timezone” of an XMLGregorianCalendar is really nothing but a fixed offset, it doesn’t matter which value we set. The output from this snippet consistently is:

07:20

I have tested with nine different default time zones including Europe/Warsaw.

Since you said you were only interested in the time of day, not the date, I have converted to LocalTime. If you want a LocalDateTime as in your question, just use toLocalDateTime instead of toLocalTime.
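To check that the result does not depend on the JVM's default time zone, a minimal sketch along these lines can be used. The class name `FixedOffsetDemo`, the helper `timeOfDay` and the particular list of zones are my own choices for illustration, not part of any library; the sketch only assumes the input string carries no offset of its own.

```java
import java.time.LocalTime;
import java.util.TimeZone;
import javax.xml.datatype.DatatypeConfigurationException;
import javax.xml.datatype.DatatypeFactory;
import javax.xml.datatype.XMLGregorianCalendar;

class FixedOffsetDemo {

    // Hypothetical helper: pins the offset to 0 minutes (UTC) before converting,
    // so the JVM's default time zone is never consulted.
    static LocalTime timeOfDay(String lexical) {
        try {
            XMLGregorianCalendar xgc = DatatypeFactory.newInstance()
                    .newXMLGregorianCalendar(lexical);
            xgc.setTimezone(0); // offset in minutes
            return xgc.toGregorianCalendar().toZonedDateTime().toLocalTime();
        } catch (DatatypeConfigurationException e) {
            throw new IllegalStateException("No DatatypeFactory available", e);
        }
    }

    public static void main(String[] args) {
        TimeZone original = TimeZone.getDefault();
        try {
            // Arbitrary sample of default time zones, chosen for illustration.
            for (String id : new String[] {"Europe/Warsaw", "Europe/Berlin", "Asia/Kolkata", "UTC"}) {
                TimeZone.setDefault(TimeZone.getTimeZone(id));
                System.out.println(id + " -> " + timeOfDay("1899-12-30T07:20:00"));
            }
        } finally {
            TimeZone.setDefault(original); // restore the JVM default
        }
    }
}
```

Each line printed should show the same time of day, because the calendar carries its own fixed offset.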

Alternatively, this was your easy solution from your comment:

    LocalDateTime.parse(xmlGregorianCalendar.toXMLFormat())

toXMLFormat() recreates the string from the XML that the XMLGregorianCalendar object was created from (the docs guarantee that you get the same string back). So this way too evades all time zone problems.
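As a self-contained sketch of this string-based route, plus a variant that reads the time fields directly, something like the following should work. The class and method names (`LexicalRoundTrip`, `viaXmlFormat`, `viaFields`) are hypothetical. Both variants assume the value has no offset and that hour, minute and second are all set; `LocalDateTime.parse` would reject a string with a trailing `Z` or offset.

```java
import java.time.LocalDateTime;
import java.time.LocalTime;
import javax.xml.datatype.DatatypeConfigurationException;
import javax.xml.datatype.DatatypeFactory;
import javax.xml.datatype.XMLGregorianCalendar;

class LexicalRoundTrip {

    // Builds the XMLGregorianCalendar the WS tooling would normally hand over.
    static XMLGregorianCalendar fromLexical(String lexical) {
        try {
            return DatatypeFactory.newInstance().newXMLGregorianCalendar(lexical);
        } catch (DatatypeConfigurationException e) {
            throw new IllegalStateException("No DatatypeFactory available", e);
        }
    }

    // Goes through the lexical form; no TimeZone or ZoneId is involved.
    static LocalDateTime viaXmlFormat(XMLGregorianCalendar xgc) {
        return LocalDateTime.parse(xgc.toXMLFormat());
    }

    // Reads the fields directly; assumes hour, minute and second are defined.
    static LocalTime viaFields(XMLGregorianCalendar xgc) {
        return LocalTime.of(xgc.getHour(), xgc.getMinute(), xgc.getSecond());
    }

    public static void main(String[] args) {
        XMLGregorianCalendar xgc = fromLexical("1899-12-30T07:20:00");
        System.out.println(viaXmlFormat(xgc));
        System.out.println(viaFields(xgc));
    }
}
```

Neither method touches a time zone, so neither can be affected by historic offset data.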

Edit: disagreement between old and new classes

It seems to me that the heart of the problem lies in the old and outdated TimeZone class and the modern ZoneId class not agreeing about historic offsets from GMT/UTC.

I did a couple of experiments. Let’s first try the time zone that seems to work correctly, Berlin. Berlin was at offset +01:00 from 1894 to 1915. Java knows that:

    LocalDate baseDate = LocalDate.of(1899, Month.DECEMBER, 30);

    ZoneId berlin = ZoneId.of("Europe/Berlin");
    TimeZone tzb = TimeZone.getTimeZone(berlin);
    GregorianCalendar gcb = new GregorianCalendar(tzb);
    gcb.set(1899, Calendar.DECEMBER, 30);
    ZonedDateTime zdtb = baseDate.atStartOfDay(berlin);
    System.out.println("" + berlin + ' ' + tzb.getOffset(gcb.getTimeInMillis())
            + ' ' + berlin.getRules().getOffset(zdtb.toInstant())
            + ' ' + berlin.getRules().getOffset(zdtb.toInstant()).getTotalSeconds());

Output from this snippet is:

Europe/Berlin 3600000 +01:00 3600

The offset for 30 December 1899 is given correctly as +01:00. The TimeZone class says 3 600 000 milliseconds, ZoneId says 3600 seconds, so they agree.

The trouble is with Warsaw. Warsaw was at GMT offset +01:24 all the time up to 1915. Let’s see if Java can find out:

    ZoneId warsaw = ZoneId.of("Europe/Warsaw");
    TimeZone tzw = TimeZone.getTimeZone(warsaw);
    GregorianCalendar gcw = new GregorianCalendar(tzw);
    gcw.set(1899, Calendar.DECEMBER, 30);
    ZonedDateTime zdtw = baseDate.atStartOfDay(warsaw);
    System.out.println("" + warsaw + ' ' + tzw.getOffset(gcw.getTimeInMillis())
            + ' ' + warsaw.getRules().getOffset(zdtw.toInstant())
            + ' ' + warsaw.getRules().getOffset(zdtw.toInstant()).getTotalSeconds());

Output:

Europe/Warsaw 3600000 +01:24 5040

ZoneId correctly says +01:24 or 5040 seconds, but here TimeZone says 3 600 000 milliseconds, the same as in the Berlin case. This is incorrect.
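The same comparison can be run over several zones at once. This is only a sketch: the class `OffsetComparison` and its two helper methods are names I made up, and unlike the snippets above it samples the offset at midnight UTC on 30 December 1899 rather than at local midnight. The output depends on the time zone data shipped with your JDK, so I am not listing expected values.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.Month;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.util.TimeZone;

class OffsetComparison {

    // Offset according to the old java.util.TimeZone, in seconds.
    static int legacyOffsetSeconds(String zoneId, Instant instant) {
        return TimeZone.getTimeZone(zoneId).getOffset(instant.toEpochMilli()) / 1000;
    }

    // Offset according to the modern java.time.ZoneId rules, in seconds.
    static int modernOffsetSeconds(String zoneId, Instant instant) {
        return ZoneId.of(zoneId).getRules().getOffset(instant).getTotalSeconds();
    }

    public static void main(String[] args) {
        Instant instant = LocalDate.of(1899, Month.DECEMBER, 30)
                .atStartOfDay(ZoneOffset.UTC)
                .toInstant();
        String[] ids = {"Europe/Berlin", "Europe/Warsaw", "Europe/Dublin",
                "Europe/Paris", "Europe/Moscow", "Asia/Kolkata"};
        for (String id : ids) {
            int legacy = legacyOffsetSeconds(id, instant);
            int modern = modernOffsetSeconds(id, instant);
            // Flag the zones where the two APIs disagree.
            System.out.println(id + " legacy=" + legacy + "s modern=" + modern + "s"
                    + (legacy == modern ? "" : "  <-- differ"));
        }
    }
}
```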

The old GregorianCalendar class relies on the old TimeZone class and therefore produces wrong results when using Europe/Warsaw time zone (either explicitly or as default). In particular you get the wrong Instant from Calendar.toInstant(). And exactly because LocalDateTime.ofInstant uses a modern ZoneId, the error is carried on into your LocalDateTime.
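To watch the error travel from the old classes into the new ones, here is a rough sketch that builds the wall-clock time with a plain GregorianCalendar (a simplification; your code gets its calendar from XMLGregorianCalendar) and reads it back through ZoneId. The class `LegacyRoundTrip` and the method `roundTrip` are hypothetical names. Whenever the two APIs agree on the offset, 07:20 comes back unchanged; whenever they disagree, the difference shows up in the result.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

class LegacyRoundTrip {

    // Sets 1899-12-30 07:20 as wall-clock time using the old TimeZone,
    // then interprets the resulting Instant using the modern ZoneId.
    static LocalDateTime roundTrip(String zoneId) {
        GregorianCalendar calendar = new GregorianCalendar(TimeZone.getTimeZone(zoneId));
        calendar.clear(); // drop the current date and time, keep the zone
        calendar.set(1899, Calendar.DECEMBER, 30, 7, 20, 0);
        return LocalDateTime.ofInstant(calendar.toInstant(), ZoneId.of(zoneId));
    }

    public static void main(String[] args) {
        for (String id : new String[] {"Europe/Berlin", "Europe/Warsaw", "UTC"}) {
            System.out.println(id + " -> " + roundTrip(id));
        }
    }
}
```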

I also get contradictory results for the Europe/Dublin, Europe/Paris, Europe/Moscow and Asia/Kolkata time zones.

I have run my snippets on Java 1.8.0_131, Java 9.0.4 and Java 11. The results were the same on all versions.


Ole V.V.
  • I was thinking about a bug as well, but `GregorianCalendar` just holds the time zone information correctly. Converting the `GregorianCalendar` to an `Instant` results in a proper UTC date. Things get weird the moment time zones (`ZonedDateTime.ofInstant(date, ZoneId.of("Europe/London"))`) are applied. The result is consistent with the time zone spec in the links you've provided (the offset seems to be calculated according to the year, which seems legit), so at this moment I'm more inclined to blame my initial lack of understanding of time zones than an actual bug. – Jakub Marchwicki Jan 23 '19 at 23:35