When using java.time.Period.between() across months of varying lengths, why does the code below report different results depending on the direction of the operation?

import java.time.LocalDate;
import java.time.Period;
class Main {
  public static void main(String[] args) {
    LocalDate d1 = LocalDate.of(2019, 1, 30);
    LocalDate d2 = LocalDate.of(2019, 3, 29); 
    Period period = Period.between(d1, d2);
    System.out.println("diff: " + period.toString());
    // => P1M29D
    Period period2 = Period.between(d2, d1);
    System.out.println("diff: " + period2.toString());
    // => P-1M-30D
  }
}

Live repl: https://repl.it/@JustinGrant/BigScornfulQueryplan#Main.java

Here's how I'd expect it to work:

2019-01-30 => 2019-03-29

  1. Add one month to 2019-01-30 => 2019-02-30, which is constrained to 2019-02-28
  2. Add 29 days to get to 2019-03-29

This matches Java's result: P1M29D

(reversed) 2019-03-29 => 2019-01-30

  1. Subtract one month from 2019-03-29 => 2019-02-29, which is constrained to 2019-02-28
  2. Subtract 29 days to get to 2019-01-30

But Java returns P-1M-30D here. I expected P-1M-29D.
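
The two walk-throughs above can be replayed with plusMonths/plusDays as a quick check. This is only a sketch; the class and method names (ExpectedSteps, forward, backward) are made up for illustration:

```java
import java.time.LocalDate;

class ExpectedSteps {
    // Hypothetical helper: one month forward, then 29 days forward.
    static LocalDate forward(LocalDate start) {
        return start.plusMonths(1).plusDays(29);
    }

    // Hypothetical helper: one month back, then 29 days back.
    static LocalDate backward(LocalDate start) {
        return start.minusMonths(1).minusDays(29);
    }

    public static void main(String[] args) {
        LocalDate d1 = LocalDate.of(2019, 1, 30);
        LocalDate d2 = LocalDate.of(2019, 3, 29);
        // 2019-01-30 + 1 month is clipped to 2019-02-28, then + 29 days.
        System.out.println(forward(d1).equals(d2));
        // 2019-03-29 - 1 month is clipped to 2019-02-28, then - 29 days.
        System.out.println(backward(d2).equals(d1));
    }
}
```

Both lines print true, which is why P-1M-29D looked like the natural answer for the reversed case.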

The reference docs say:

The period is calculated by removing complete months, then calculating the remaining number of days, adjusting to ensure that both have the same sign. The number of months is then split into years and months based on a 12 month year. A month is considered if the end day-of-month is greater than or equal to the start day-of-month. For example, from 2010-01-15 to 2011-03-18 is one year, two months and three days.

Maybe I'm not reading this carefully enough, but I don't think this text fully explains the divergent behavior that I'm seeing.

What am I misunderstanding about how java.time.Period.between is supposed to work? Specifically, what is expected to happen when the intermediate result of "removing complete months" is an invalid date?

Is the algorithm documented in more detail elsewhere?

Justin Grant

1 Answer

TL;DR

The algorithm I see in the source (copied below) does not appear to guarantee that a Period between two dates is as accurate as the number of days between those same dates. (I even suspect Period is not meant to be used in calculations on continuous time values.)

It computes the difference in months and days, then adjusts to make sure both have the same sign. The resulting period is built on the grounds of these two values.

The main challenge is that adding two months to LocalDate.of(2019, 1, 28) is a calendar operation, not the addition of (31 + 28) days or (28 + 31) days to that date. It simply moves two calendar months forward from LocalDate.of(2019, 1, 28), which gives LocalDate.of(2019, 3, 28).

In other words, in the context of LocalDate, a Period represents an exact number of months (and derived years), but its day count depends on the lengths of the months it is computed against.
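
To make the months-versus-days distinction concrete, here is a small sketch. The class name MonthsVersusDays and the second start date (2018-12-31) are my own additions for illustration. For 2019-01-28 the two operations happen to land on the same date; the difference shows up once a month-end has to be clipped:

```java
import java.time.LocalDate;

class MonthsVersusDays {
    public static void main(String[] args) {
        LocalDate jan28 = LocalDate.of(2019, 1, 28);
        // Calendar arithmetic: same day-of-month, two months later.
        System.out.println(jan28.plusMonths(2));      // 2019-03-28
        // Fixed-length arithmetic: 59 days later (coincides here).
        System.out.println(jan28.plusDays(31 + 28));  // 2019-03-28

        LocalDate dec31 = LocalDate.of(2018, 12, 31);
        // February has no day 31, so the result is clipped to the 28th.
        System.out.println(dec31.plusMonths(2));      // 2019-02-28
        // Adding the lengths of December and January as plain days.
        System.out.println(dec31.plusDays(31 + 31));  // 2019-03-03
    }
}
```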


This is the source I'm seeing (java.time.LocalDate.until(ChronoLocalDate) is ultimately doing the job):

public Period until(ChronoLocalDate endDateExclusive) {
    LocalDate end = LocalDate.from(endDateExclusive);
    long totalMonths = end.getProlepticMonth() - this.getProlepticMonth();  // safe
    int days = end.day - this.day;
    if (totalMonths > 0 && days < 0) {
        totalMonths--;
        LocalDate calcDate = this.plusMonths(totalMonths);
        days = (int) (end.toEpochDay() - calcDate.toEpochDay());  // safe
    } else if (totalMonths < 0 && days > 0) {
        totalMonths++;
        days -= end.lengthOfMonth();
    }
    long years = totalMonths / 12;  // safe
    int months = (int) (totalMonths % 12);  // safe
    return Period.of(Math.toIntExact(years), months, days);
}

As can be seen, the sign adjustment is made when the month difference has a different sign from the day difference (and yes, they're computed separately). Both totalMonths > 0 && days < 0 and totalMonths < 0 && days > 0 are applicable in your examples (one to each calculation).
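
To see which branch each of your two calls takes, here is a rough re-creation of the same logic using only public API. It is a sketch, not the JDK code: UntilTrace, between and prolepticMonth are hypothetical names, and prolepticMonth is my stand-in for the private getProlepticMonth().

```java
import java.time.LocalDate;
import java.time.Period;

class UntilTrace {
    // Stand-in for the private LocalDate.getProlepticMonth() (assumption).
    static long prolepticMonth(LocalDate d) {
        return d.getYear() * 12L + d.getMonthValue() - 1;
    }

    // Mirrors the structure of LocalDate.until shown above.
    static Period between(LocalDate start, LocalDate end) {
        long totalMonths = prolepticMonth(end) - prolepticMonth(start);
        int days = end.getDayOfMonth() - start.getDayOfMonth();
        if (totalMonths > 0 && days < 0) {
            // Forward case: days are measured from the clipped intermediate date.
            totalMonths--;
            LocalDate calcDate = start.plusMonths(totalMonths);
            days = (int) (end.toEpochDay() - calcDate.toEpochDay());
        } else if (totalMonths < 0 && days > 0) {
            // Backward case: days are adjusted by the length of the END month.
            totalMonths++;
            days -= end.lengthOfMonth();
        }
        return Period.of((int) (totalMonths / 12), (int) (totalMonths % 12), days);
    }

    public static void main(String[] args) {
        LocalDate d1 = LocalDate.of(2019, 1, 30);
        LocalDate d2 = LocalDate.of(2019, 3, 29);
        // Forward: 2 months, -1 day => 1 month, then Mar 29 minus Feb 28 = 29 days.
        System.out.println(between(d1, d2)); // P1M29D
        // Backward: -2 months, +1 day => -1 month, then 1 - 31 = -30 days.
        System.out.println(between(d2, d1)); // P-1M-30D
    }
}
```

The forward branch measures real elapsed days from an intermediate date, while the backward branch only subtracts a month length, which is where the two directions part ways.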

It just happens that when the difference in months is positive, the period's days are computed from epoch days, which produces an accurate result. That result could still be affected whenever the new end date has to be clipped to fit the length of the month, such as in:

jshell> LocalDate.of(2019, 1, 31).plusMonths(1)
$42 ==> 2019-02-28

But this can't happen in your example: for the number of days in the resulting period to be clipped, you would have to pass an invalid end date to the method, which simply isn't possible, as in

// Period.between(LocalDate.of(2019, 1, 31), LocalDate.of(2019, 2, 30))
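
To illustrate that such an end date can't even be constructed, here is a minimal sketch (InvalidEndDate and isConstructible are hypothetical names of mine). LocalDate.of rejects an out-of-range day with a DateTimeException instead of clipping it:

```java
import java.time.DateTimeException;
import java.time.LocalDate;

class InvalidEndDate {
    // Returns true if LocalDate.of accepts the given fields.
    static boolean isConstructible(int year, int month, int day) {
        try {
            LocalDate.of(year, month, day);
            return true;
        } catch (DateTimeException e) {
            // Thrown for a day-of-month that does not exist in that month.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isConstructible(2019, 2, 28)); // true
        System.out.println(isConstructible(2019, 2, 30)); // false
    }
}
```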

When the time difference in months is negative, however, it happens:

//task: account for the 1-day difference
jshell> LocalDate.of(2019, 5, 30).plusMonths(-1)
$50 ==> 2019-04-30

jshell> LocalDate.of(2019, 5, 31).plusMonths(-1)
$51 ==> 2019-04-30

And, using periods and local dates:

jshell> Period.between(LocalDate.of(2019, 3, 31), LocalDate.of(2019, 2, 28))
$39 ==> P-1M-3D //3 days? It didn't look at February's length (2<3 && 28<31)

jshell> Period.between(LocalDate.of(2019, 3, 31), LocalDate.of(2019, 1, 31))
$40 ==> P-2M

In your case (second call), -30 is the result of (30 - 29) - 31, where 31 is the number of days in January.

I think the short story here is not to use Period for time value calculations. In the context of time, I suppose month is a notional unit. Periods will work well when a month is defined as an abstract period (such as in calculations of monthly rent payments), but they'll usually fail when it comes to continuous time.
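
If what you need is an exact, direction-independent distance, one option is to count days instead of months. This is a sketch of that alternative, not something the Period docs prescribe; ExactDays and daysBetween are names I made up:

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

class ExactDays {
    // Signed number of whole days from start to end.
    static long daysBetween(LocalDate start, LocalDate end) {
        return ChronoUnit.DAYS.between(start, end);
    }

    public static void main(String[] args) {
        LocalDate d1 = LocalDate.of(2019, 1, 30);
        LocalDate d2 = LocalDate.of(2019, 3, 29);
        System.out.println(daysBetween(d1, d2)); // 58
        System.out.println(daysBetween(d2, d1)); // -58
    }
}
```

The day count is symmetric because it never has to decide how long a month is.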

ernest_k
  • This is very helpful info, thanks! Do you know if there is a formal specification that defines behavior of this case? (I'm not very familiar with Java.) I'm curious about why asymmetrical behavior was chosen here. BTW, I'm asking because I'm helping to define how the next generation of JavaScript will handle this calculation and I'm unconvinced that different behavior in positive vs. negative cases is wise. But before committing the entire JS ecosystem to a symmetrical solution I'd like to understand why Java chose to do it differently. – Justin Grant Oct 17 '20 at 04:20
  • @JustinGrant I unfortunately don't see it clearly documented, and would love to see other/better answers by those closer to the design of this API. The problem isn't with `Period` as much as it is with `Period` when used in operations on dates; but there should be documentation at this junction. Before looking in depth, it's reasonable to expect the behavior to be symmetrical. But I wouldn't say they "chose" it to be asymmetrical, as the effect of adding a number of months to a date varies with the caprices of the calendar months. – ernest_k Oct 17 '20 at 05:18
  • During the development of JSR-310, a thread was started pointing out this issue, but the thread was never resolved. I'm struggling to see any justification for the discrepancy ATM, so it's probably a bug, but one that may be unfixable at this point. As such, I think JS should adopt a symmetric approach, ensuring that `date.plus(period)` works equivalently to `date.plus(period.totalMonths).plus(period.days)`, i.e. two separate steps. – JodaStephen Oct 17 '20 at 23:15
  • @JodaStephen thanks for the insight. This is possibly just my ignorance, but I'm still not seeing how that will solve it, as `date.plus(x.toTotalMonths())` already cannot be *reliably* inverted with `date.plus(x.toTotalMonths()).plus(-x.toTotalMonths())` - and this is not `Period`'s fault. – ernest_k Oct 18 '20 at 07:20
  • As you say, there is no perfect answer to the problem because months are variable length. The question is which edge cases you are willing to throw under the bus. It is reasonable to aim for `start.plus(Period.between(start, end)) == end`. Beyond that, all bets are off. Try lots of examples, with leap/non-leap year and position of Feb to get all sorts of strange results. For example, try 2020-01-30 to 2020-03-29 and 2019-12-30 to 2020-03-29 and you'll find JSR-310's algorithm is correct in both directions. – JodaStephen Oct 19 '20 at 09:07
  • As such, I suggest that this should always be true: `start.plus(Period.between(start, end)) == end` and `start.plusMonths(period.months).plusDays(period.days) == end`. This is not symmetric, and JSR-310 does not comply either. – JodaStephen Oct 19 '20 at 09:08
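
The round-trip property discussed in these comments can be checked against the dates mentioned in this thread. This is a sketch under a couple of assumptions: RoundTrip, viaPeriod and viaSteps are hypothetical names, and viaSteps uses toTotalMonths() so that any years in the period are included in the month step.

```java
import java.time.LocalDate;
import java.time.Period;

class RoundTrip {
    // start.plus(Period.between(start, end)) equals end?
    static boolean viaPeriod(LocalDate start, LocalDate end) {
        return start.plus(Period.between(start, end)).equals(end);
    }

    // Same check as two explicit steps: months first, then days.
    static boolean viaSteps(LocalDate start, LocalDate end) {
        Period p = Period.between(start, end);
        return start.plusMonths(p.toTotalMonths()).plusDays(p.getDays()).equals(end);
    }

    public static void main(String[] args) {
        LocalDate a = LocalDate.of(2019, 1, 30);
        LocalDate b = LocalDate.of(2019, 3, 29);
        System.out.println(viaPeriod(a, b)); // true
        // P-1M-30D applied to 2019-03-29 lands on 2019-01-29, not 2019-01-30.
        System.out.println(viaPeriod(b, a)); // false

        LocalDate c = LocalDate.of(2020, 1, 30);
        LocalDate d = LocalDate.of(2020, 3, 29);
        // In the leap year the clipped intermediate date is Feb 29.
        System.out.println(viaPeriod(c, d)); // true
        System.out.println(viaPeriod(d, c)); // true

        LocalDate e = LocalDate.of(2019, 12, 30);
        System.out.println(viaSteps(e, d)); // true
        System.out.println(viaSteps(d, e)); // true
    }
}
```

Only the reversed 2019 case from the question fails the round trip; the leap-year and cross-year examples hold in both directions.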