In Java there is javax.xml.datatype.DatatypeFactory, which can be used to import and export XML dates as follows.

  String xmlDateIn = "1900-01-01T12:00:00";
  DatatypeFactory df = DatatypeFactory.newInstance();
  XMLGregorianCalendar xmlCalendar = df.newXMLGregorianCalendar(xmlDateIn);
  String xmlDateOut = xmlCalendar.toXMLFormat();

In this simple case xmlDateIn equals xmlDateOut, as expected. But if I want it as a java.util.Date, things get interesting.

  GregorianCalendar gregorianCalendar = xmlCalendar.toGregorianCalendar();
  Date dts = gregorianCalendar.getTime();
  System.out.println(dts);  // prints Mon Jan 01 12:00:00 CET 1900

At first sight it still works fine, but internally something seems to be broken. With my IDE I can see what is going on inside the Date object. (In case you wonder, I live in the CET time zone.) Look at this strange time zone.

[Screenshot: Timezone is broken]

And when I try to convert this back to XML, the 9-minute time zone offset actually gets printed as well. So it's not just an internal thing.

  DatatypeFactory df2 = DatatypeFactory.newInstance();
  GregorianCalendar gc2 = new GregorianCalendar();
  gc2.setTime(dts);
  XMLGregorianCalendar xc2 = df2.newXMLGregorianCalendar(gc2);
  System.out.println(xc2.toXMLFormat()); // prints 1900-01-01T12:00:00.000+00:09 
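To reproduce this independently of the machine's default time zone, the zone can be passed to the calendar explicitly. The following is only a minimal sketch: the class and method names are made up for illustration, and the zone ID Europe/Paris is an assumption (it is one zone whose offset in 1900 is not a whole number of minutes).

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;
import javax.xml.datatype.DatatypeConfigurationException;
import javax.xml.datatype.DatatypeFactory;

class ExplicitZoneRepro {

    // Builds 1900-01-01T12:00:00 as local time in the given zone.
    static GregorianCalendar noon1900(String zoneId) {
        GregorianCalendar gc = new GregorianCalendar(TimeZone.getTimeZone(zoneId));
        gc.clear();
        gc.set(1900, Calendar.JANUARY, 1, 12, 0, 0);
        return gc;
    }

    // Total UTC offset (standard + DST) the calendar applies, in milliseconds.
    static int offsetMillis(GregorianCalendar gc) {
        return gc.get(Calendar.ZONE_OFFSET) + gc.get(Calendar.DST_OFFSET);
    }

    // Same conversion as in the question, but with an explicitly zoned calendar.
    static String toXml(GregorianCalendar gc) {
        try {
            return DatatypeFactory.newInstance().newXMLGregorianCalendar(gc).toXMLFormat();
        } catch (DatatypeConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        GregorianCalendar gc = noon1900("Europe/Paris"); // assumed zone ID
        System.out.println(offsetMillis(gc)); // offset in milliseconds, including any seconds
        System.out.println(toXml(gc));        // XML form, where the offset has minute precision
    }
}
```

Comparing the two printed values shows whether the calendar's offset carries seconds that the XML representation cannot hold.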

In an attempt to fix it, if I set a timezone manually, things get really bad. Look at this magic hour:

  String xmlDateIn = "1900-01-01T12:00:00";
  DatatypeFactory df = DatatypeFactory.newInstance();
  XMLGregorianCalendar xmlCalendar = df.newXMLGregorianCalendar(xmlDateIn);
  xmlCalendar.setTimezone(0);  // <--- ONLY CHANGE
  GregorianCalendar gregorianCalendar = xmlCalendar.toGregorianCalendar();
  Date dts = gregorianCalendar.getTime();

[Screenshot: Unable to fix.]

I do have a workaround for my specific program: I don't set a time zone when I import the XML, so the Date carries the wrong offset internally (the 9 minutes). When I finally export the Date back to XML, I set the time zone to 0 on the XML Gregorian calendar. That magically fixes it and produces the correct XML format again.
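A different way to avoid depending on the JVM default zone at all is to pin both directions to UTC. This is a hedged sketch, not the workaround described above: the class and method names are invented, and treating a zone-less XML value as UTC is an assumption that may not suit every program.

```java
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;
import javax.xml.datatype.DatatypeConfigurationException;
import javax.xml.datatype.DatatypeConstants;
import javax.xml.datatype.DatatypeFactory;
import javax.xml.datatype.XMLGregorianCalendar;

class UtcRoundTrip {
    private static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    private static DatatypeFactory factory() {
        try {
            return DatatypeFactory.newInstance();
        } catch (DatatypeConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Import: a value without a zone is read as UTC (an assumption of this sketch);
    // a value that carries its own offset keeps that offset.
    static Date fromXml(String xml) {
        XMLGregorianCalendar xc = factory().newXMLGregorianCalendar(xml);
        TimeZone zone = xc.getTimezone() == DatatypeConstants.FIELD_UNDEFINED ? UTC : null;
        return xc.toGregorianCalendar(zone, null, null).getTime();
    }

    // Export: compute the calendar fields in UTC, so the offset is always zero.
    static String toXml(Date date) {
        GregorianCalendar gc = new GregorianCalendar(UTC);
        gc.setTime(date);
        return factory().newXMLGregorianCalendar(gc).toXMLFormat();
    }

    public static void main(String[] args) {
        Date dts = fromXml("1900-01-01T12:00:00");
        System.out.println(toXml(dts));
    }
}
```

Because UTC has no historical sub-minute offsets, nothing is lost when the offset is reduced to minutes on export.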

But really, I was wondering if there is any good explanation for this crazy behavior.

bvdb
  • I cannot reproduce your results using Java 10.0.2. What version of Java are you using? – VGR Sep 17 '18 at 14:55
  • @VGR running version 1.8.0_101 – bvdb Sep 17 '18 at 15:27
  • It appears to be particular to that timezone. I am on EST/EDT, but when I change the default timezone to CET, I see the behavior you describe, both in Java 1.8.0_161 and in Java 10.0.2. – VGR Sep 17 '18 at 15:38
  • One thing that bothers me for more debugging, is that even if I put a breakpoint in the `Date(long)` constructor of `Date`, already at that point, the `cdate` field is magically instantiated. I cannot find the point of instantiation. Could it be some kind of JVM optimization ? – bvdb Sep 17 '18 at 15:53

2 Answers

I don’t know a lot about calendar history and official timekeeping, so I tested this by first making sure I was using your timezone:

int offset = (int) TimeUnit.HOURS.toMillis(1);
String[] ids = TimeZone.getAvailableIDs(offset);
TimeZone cet = Arrays.stream(ids).map(TimeZone::getTimeZone)
    .filter(tz -> tz.getDisplayName(false, TimeZone.SHORT).equals("CET"))
    .findFirst().orElseThrow(
        () -> new RuntimeException("No CET timezone found"));

TimeZone.setDefault(cet);

Then I examined some of the inner workings of that timezone. In particular, I printed out its historical time transitions:

System.out.println("Transitions:");
cet.toZoneId().getRules().getTransitions().forEach(
    t -> System.out.println("  " + t));

The first two such transitions print out as:

Transition[Overlap at 1891-03-15T00:01+00:12:12 to +00:09:21]
Transition[Overlap at 1911-03-11T00:00+00:09:21 to Z]

And they are followed by various “manual” transitions between Zulu (Z) and UTC+01:00.

So in 1900 the local clock in this zone ran 9 minutes and 21 seconds ahead of GMT, while by 1912 the offset had dropped to zero.

Indeed, if you change your year to 1912, you won’t see the 9-minute discrepancy:

String xmlDateIn = "1912-01-01T12:00:00";
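The same comparison can be made directly against the zone rules, without going through XML at all. A small sketch, assuming the zone ID Europe/Paris (my test above selected a zone by its "CET" display name, so the exact zone may differ); the class and method names are illustrative only.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

class HistoricalOffsets {

    // UTC offset that the zone's rules give for a local date-time.
    static ZoneOffset offsetAt(String zoneId, LocalDateTime local) {
        return ZoneId.of(zoneId).getRules().getOffset(local);
    }

    public static void main(String[] args) {
        String zone = "Europe/Paris"; // assumed zone ID
        System.out.println(offsetAt(zone, LocalDateTime.of(1900, 1, 1, 12, 0)));
        System.out.println(offsetAt(zone, LocalDateTime.of(1912, 1, 1, 12, 0)));
    }
}
```

ZoneOffset keeps the seconds, so a historical offset such as +00:09:21 is reported in full here.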

I haven’t been able to find a historical reason for the 12:12 or 9:21 transitions. I assume it was just a matter of science catching up, as astronomical measurements got more accurate.

VGR
  • Nice analysis (+1). Be aware that CET is really many time zones sharing a common abbreviation, roughly one per central European country. Which one shows the behaviour you demonstrate I don’t know, but certainly not all of them. I’d prefer to use the modern `ZoneId` and `ZoneRules` classes rather than the outdated `TimeZone` (even though it has a catchier name). The modern ones also tend to be more reliable. – Ole V.V. Sep 18 '18 at 07:11
  • Your analysis is outstanding. - It certainly explains where the magical numbers come from. – bvdb Sep 18 '18 at 08:16
  • @OleV.V. I think CET matches GMT+2 this time of year. But in winter it becomes GMT+1. Europeans then change their watches one hour. So, in a way, you could say that it can be 1 of 2 time zones, but only 1 at a time. - That does mean that multiple European countries follow it. e.g. France, Netherlands, Germany, Italy, ..., and Belgium (where I live) for a map: https://www.timetemperature.com/time-zone-maps/europe-time-zone-map.shtml – bvdb Sep 18 '18 at 08:21
  • @bvdb Offsets on 1900-01-01T12:00: Paris +00:09:21, Amsterdam +00:19:32, Berlin +01:00, Rome the same, Brussels +00:00. They are not the same time zone. – Ole V.V. Sep 18 '18 at 08:26

Thanks to the answer of VGR I managed to pin down what went wrong internally inside the XMLGregorianCalendarImpl.

Take a look at the constructor:

public XMLGregorianCalendarImpl(GregorianCalendar cal) {

    int year = cal.get(Calendar.YEAR);
    if (cal.get(Calendar.ERA) == GregorianCalendar.BC) {
        year = -year;
    }
    this.setYear(year);

    // Calendar.MONTH is zero based, XSD Date datatype's month field starts
    // with JANUARY as 1.
    this.setMonth(cal.get(Calendar.MONTH) + 1);
    this.setDay(cal.get(Calendar.DAY_OF_MONTH));
    this.setTime(
            cal.get(Calendar.HOUR_OF_DAY),
            cal.get(Calendar.MINUTE),
            cal.get(Calendar.SECOND),
            cal.get(Calendar.MILLISECOND));

    // Calendar ZONE_OFFSET and DST_OFFSET fields are in milliseconds.
    int offsetInMinutes = (cal.get(Calendar.ZONE_OFFSET) + cal.get(Calendar.DST_OFFSET)) / (60 * 1000);
    this.setTimezone(offsetInMinutes);
}

Basically, it copies the fields of the GregorianCalendar (year, month, day, hour, minute, second, millisecond) into the corresponding fields of the XMLGregorianCalendarImpl.

Finally it maps the time zone. Notice that the GregorianCalendar object stores its zone offset in milliseconds, while the XMLGregorianCalendarImpl stores it in minutes. The integer division in this conversion simply drops any remaining seconds and milliseconds. That is the issue: the calendar can actually have a zone offset that includes seconds.

Take the date "1900-01-01T12:00:00" in the ECT timezone. It has a zone offset of 561000 milliseconds, i.e. 9 minutes and 21 seconds, but the XML Gregorian calendar just ignores the 21 seconds.
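The arithmetic can be checked in isolation. This is just a worked example of the integer division from the constructor; the class and method names are made up.

```java
class OffsetTruncation {
    static final int MILLIS_PER_MINUTE = 60 * 1000;

    // Same integer division as in the constructor: anything below a minute is discarded.
    static int wholeMinutes(int offsetMillis) {
        return offsetMillis / MILLIS_PER_MINUTE;
    }

    // The part of the offset that the division throws away.
    static int lostMillis(int offsetMillis) {
        return offsetMillis % MILLIS_PER_MINUTE;
    }

    public static void main(String[] args) {
        int offset = 561_000; // 9 minutes 21 seconds
        System.out.println(wholeMinutes(offset)); // prints 9
        System.out.println(lostMillis(offset));   // prints 21000
    }
}
```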

If the time zone has seconds, it's better to output the date without time zone.

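private String toXml(Date dts) throws DatatypeConfigurationException {

  DatatypeFactory df = DatatypeFactory.newInstance();
  GregorianCalendar gc = new GregorianCalendar();  // uses the default time zone
  gc.setTime(dts);
  XMLGregorianCalendar xc = df.newXMLGregorianCalendar(gc);

  // An offset that is not a whole number of minutes cannot be represented in XML.
  int zoneOffsetInMillis = gc.get(Calendar.ZONE_OFFSET);
  boolean zoneHasMillis = zoneOffsetInMillis % (60 * 1000) != 0;
  // Note: 0 means UTC and is printed as "Z";
  // DatatypeConstants.FIELD_UNDEFINED would omit the zone entirely.
  if (zoneHasMillis) xc.setTimezone(0);

  return xc.toXMLFormat();
}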
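If the goal is really to print no zone at all, the timezone field can be unset with DatatypeConstants.FIELD_UNDEFINED. The following is a sketch of that variant rather than the code I actually run: the class name, the extra TimeZone parameter and the zone ID Europe/Paris in main are assumptions made so that the example does not depend on the default time zone.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;
import javax.xml.datatype.DatatypeConfigurationException;
import javax.xml.datatype.DatatypeConstants;
import javax.xml.datatype.DatatypeFactory;
import javax.xml.datatype.XMLGregorianCalendar;

class ZonelessXml {

    // Formats the date in the given zone; drops the zone suffix when the
    // offset has a sub-minute part that XML cannot express.
    static String toXml(Date date, TimeZone zone) {
        GregorianCalendar gc = new GregorianCalendar(zone);
        gc.setTime(date);
        XMLGregorianCalendar xc;
        try {
            xc = DatatypeFactory.newInstance().newXMLGregorianCalendar(gc);
        } catch (DatatypeConfigurationException e) {
            throw new IllegalStateException(e);
        }
        int offsetMillis = gc.get(Calendar.ZONE_OFFSET) + gc.get(Calendar.DST_OFFSET);
        if (offsetMillis % (60 * 1000) != 0) {
            xc.setTimezone(DatatypeConstants.FIELD_UNDEFINED); // unset the optional field
        }
        return xc.toXMLFormat();
    }

    public static void main(String[] args) {
        ZoneId paris = ZoneId.of("Europe/Paris"); // assumed zone ID
        Date dts = Date.from(LocalDateTime.of(1900, 1, 1, 12, 0).atZone(paris).toInstant());
        System.out.println(toXml(dts, TimeZone.getTimeZone(paris)));
    }
}
```

The trade-off is that a zone-less value is only meaningful to a reader who knows which zone the fields were computed in.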

Example:

1900-01-01T12:00:00+02:00 becomes 1900-01-01T10:09:21.000

EDIT:

I actually found the historical reason for this 9 minutes and 21 seconds time change:

In Britain, ‘railway time’ was introduced in the 1840s to synchronise local clocks with railway timetables, which was replaced by a single time GMT in 1880. In 1891, France adopted Paris Mean Time as its standard national time. Clocks inside railway stations and train timetables were set five minutes late to prevent passengers missing their trains.

In 1911, Paris Mean Time was altered by 9 minutes 21 seconds to synchronise with Greenwich Mean Time. It was still called Paris Mean Time, which avoided having to use the word “Greenwich”.

source: https://vanessafrance.wordpress.com/2012/03/25/a-brief-history-of-french-time/

bvdb