It only makes mathematical sense to take the AVG() of a numeric value, not datetime values or durations. Since you want your answer to be in minutes precision, you want to get your difference in minutes, then convert back to days, hours, minutes. (There are 24*60=1440 minutes in a standard day.)
with q as
(select avg(
timestampdiff(4, char(endDate - startDate) )
) as avgmns
from yourChosenData
)
select int(avgmns / 1440) as avg_days,
int( mod(avgmns,1440) / 60) as avg_mins,
mod(avgmns, 60) as avg_secs
from q
As mentioned below, timestampdiff() is an estimate. To avoid this issue, one could use a more accurate calculation.
with q as
(select avg(
( days(endDate) - days(startDate) ) * 1440
+ ( midnight_seconds(endDate) - midnight_seconds(startDate) ) / 60
) as avgmns
from yourChosenData
)
select int(avgmns / 1440) as avg_days,
int( mod(avgmns,1440) / 60) as avg_mins,
mod(avgmns, 60) as avg_secs
from q
In order to address the DST issue, if needed, one might choose either of:
- include a UTC offset column corresponding to each timestamp field. This would also be useful if timstamps were being recorded in more than one timezone. The diference in offsets could then be fed into the calculation along with the timestamps.
- provide a deterministic UDF which could return a UTC or DST adjustment offset for a given timestamp. If multiple timezones are involved, then the zone should also be a parameter to the function. Depending on the geographic areas involved, the logic may also need to consider areas which observe alternative DST rules.