13

Assuming I have this schema:

create table rental (
    id           integer,
    rental_date  timestamp,
    customer_id  smallint,
    return_date  timestamp,
);

Running this query returns strange results:

select customer_id, avg(return_date - rental_date) as "avg"
from rental
group by customer_id
order by "avg" DESC

It displays:

customer_id|avg_rent_duration     |
-----------|----------------------|
        315|     6 days 14:13:22.5|
        187|5 days 34:58:38.571428|
        321|5 days 32:56:32.727273|
        539|5 days 31:39:57.272727|
        436|       5 days 31:09:46|
        532|5 days 30:59:34.838709|
        427|       5 days 29:27:05|
        555|5 days 26:48:35.294118|
...

599 rows

Why is there values like 5 days 34:58:38, 5 days 32:56:32 and so on? I thought there where only 24 hours in a day, maybe I'm wrong.

EDIT

Demo here: http://sqlfiddle.com/#!17/caa7a/1/0

Sample data:

insert into rental (rental_date, customer_id, return_date)
values
('2007-01-02 13:10:06', 1, '2007-01-03 01:01:01'),
('2007-01-02 01:01:01', 1, '2007-01-09 15:10:06'),
('2007-01-10 22:10:06', 1, '2007-01-11 01:01:01'),
('2007-01-30 01:01:01', 1, '2007-02-03 22:10:06');
rap-2-h
  • 30,204
  • 37
  • 167
  • 263
  • 9
    you can use `justify_interval()` to normalize the interval –  Sep 06 '19 at 12:48
  • 5
    @a_horse_with_no_name Thank you, it seems to work! Could you please elaborate (how/why) and post your comment as an answer? – rap-2-h Sep 06 '19 at 12:52
  • @LaurenzAlbe The issue isn't that the difference isn't a round number of days. It's that the number of hours often exceeds 24, as in `5 days 34:58:38`; the question is why that's not `6 days 10:58:38`. – elixenide Sep 06 '19 at 15:00
  • I strongly suspect this has to do with the use of `avg`; the subtraction may be correct, but `avg` causes problems. I'm not certain of this, though. – elixenide Sep 06 '19 at 15:01
  • Oh, I see. Thanks. – Laurenz Albe Sep 06 '19 at 15:07

2 Answers2

8

You have to adjust interval with justify_interval() function:

select customer_id, justify_interval(avg(return_date - rental_date)) as "avg"
from rental
group by customer_id
order by "avg" DESC;

See official doc:

Adjust interval using justify_days and justify_hours, with additional sign adjustments

Still, it does not explains why the result of the operation is that weird without using justify_interval() (in other words, why we have to apply this function)

Note: thanks to @a_horse_with_no_name for their comment

rap-2-h
  • 30,204
  • 37
  • 167
  • 263
5

This is an attempt to explain the behavior.

PostgreSQL intervals don't get “justified” more than necessary during interval arithmetic. I'd say that has two reasons:

  • speed
  • loss of exactness (e.g. when days are being converted by months, assuming a month to have 30 days

So you get results like this:

SELECT INTERVAL '1 day 20 hours' + INTERVAL '5 days 30 hours';

    ?column?     
-----------------
 6 days 50:00:00
(1 row)

The same holds for division:

SELECT INTERVAL '6 days 50 hours' / 2;

    ?column?     
-----------------
 3 days 25:00:00
(1 row)

Always adjusting the hours to be less than 24 would make a long calculation like the avg you are calculating unnecessarily complicated, and as you have found, there are functions to adjust the result.

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263