23

Assume this table:

id    date
----------------
1     2010-12-12
2     2010-12-13
3     2010-12-18
4     2010-12-22
5     2010-12-23

How do I find the average intervals between these dates, using MySQL queries only?

For instance, the calculation on this table will be

  (
    ( 2010-12-13 - 2010-12-12 )
  + ( 2010-12-18 - 2010-12-13 )
  + ( 2010-12-22 - 2010-12-18 )
  + ( 2010-12-23 - 2010-12-22 )
  ) / 4
----------------------------------
= ( 1 DAY + 5 DAY + 4 DAY + 1 DAY ) / 4
= 2.75 DAY
bluish
  • 26,356
  • 27
  • 122
  • 180
HyderA
  • 20,651
  • 42
  • 112
  • 180
  • 2
    This'll be _much_ easier without doing it in SQL. Why do you need it to be done in SQL? – jwueller Nov 30 '10 at 09:07
  • 1
    Because it sounds like a fun thing to do. I've already implemented this with my PHP function. But I'm curious to see if it can be done with MySQL without a significant performance hit. – HyderA Nov 30 '10 at 09:10
  • first of all you need to calculate diff between 2 rows look on : http://stackoverflow.com/questions/3017468 – Haim Evgi Nov 30 '10 at 09:14
  • 2
    Of course you do realize that you are calculating the average wrongly? If you have 4 intervals in the example, then dividing by 5 gives you something that is not an average length of an interval. – Unreason Nov 30 '10 at 09:43
  • I meant dividing by the number of intervals, in this case, 5. C'mon dude, we learnt averaging back in what, the 5th grade. – HyderA Nov 30 '10 at 09:50
  • 1
    You only have 4 intervals, "dude". – Dmitri Nov 30 '10 at 10:05
  • 1
    In this case it's 4, not 5. Unreason was right :) – mike Nov 30 '10 at 10:05
  • =D - My bad! I was looking at the number of dates, not intervals. – HyderA Nov 30 '10 at 10:17

4 Answers4

46

Intuitively, what you are asking should be equivalent to the interval between the first and last dates, divided by the number of dates minus 1.

Let me explain more thoroughly. Imagine the dates are points on a line (+ are dates present, - are dates missing, the first date is the 12th, and I changed the last date to Dec 24th for illustration purposes):

++----+---+-+

Now, what you really want to do, is evenly space your dates out between these lines, and find how long it is between each of them:

+--+--+--+--+

To do that, you simply take the number of days between the last and first days, in this case 24 - 12 = 12, and divide it by the number of intervals you have to space out, in this case 4: 12 / 4 = 3.

With a MySQL query

SELECT DATEDIFF(MAX(dt), MIN(dt)) / (COUNT(dt) - 1) FROM a;

This works on this table (with your values it returns 2.75):

CREATE TABLE IF NOT EXISTS `a` (
  `dt` date NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

INSERT INTO `a` (`dt`) VALUES
('2010-12-12'),
('2010-12-13'),
('2010-12-18'),
('2010-12-22'),
('2010-12-24');
Vegard Larsen
  • 12,827
  • 14
  • 59
  • 102
  • 2
    D'oh, you're right! If you only have timestamps and not start/end pairs, there is no need for any aggregation shenanigans. – Dmitri Nov 30 '10 at 10:27
  • 4
    +1 for not following a pattern and seeing the problem for what it really is. :D – Unreason Nov 30 '10 at 10:39
  • 1
    You sir, are a genius! This solution is the perfect example of achieving simplicity by thinking outside the box. – HyderA Nov 30 '10 at 10:40
  • 1
    Nice! And what if the difference is only a couple of seconds between the entry? My result currently is `0,0000` so there seems to be a precision problem. EDIT: found `TIMEDIFF` for getting the number of seconds. – Rvanlaak Sep 29 '15 at 12:16
3

If the ids are uniformly incremented without gaps, join the table to itself on id+1:

SELECT d.id, d.date, n.date, datediff(d.date, n.date)
FROM dates d
JOIN dates n ON(n.id = d.id + 1)

Then GROUP BY and average as needed.

If the ids are not uniform, do an inner query to assign ordered ids first.

I guess you'll also need to add a subquery to get the total number of rows.

Alternatively

Create an aggregate function that keeps track of the previous date, and a running sum and count. You'll still need to select from a subquery to force the ordering by date (actually, I'm not sure if that's guaranteed in MySQL).

Come to think of it, this is a much better way of doing it.

And Even Simpler

Just noting that Vegard's solution is much better.

Dmitri
  • 8,999
  • 5
  • 36
  • 43
1

The following query returns correct result

SELECT AVG(
        DATEDIFF(i.date, (SELECT MAX(date) 
                          FROM intervals WHERE date < i.date)
                 )
           )
FROM intervals i

but it runs a dependent subquery which might be really inefficient with no index and on a larger number of rows.

Unreason
  • 12,556
  • 2
  • 34
  • 50
0

You need to do self join and get differences using DATEDIFF function and get average.