Why do sum and mean (average) not work on hours extracted with lubridate, when simple addition with + and division each work on their own?

Dataset

  Category   Time1    Time2       hr1        hr2
1        A 0:30:00 24:00:00    30M 0S  24H 0M 0S
2        B 1:00:00 23:23:00  1H 0M 0S 23H 23M 0S
3        C 2:30:00 23:00:59 2H 30M 0S 23H 0M 59S
4        D 3:00:00 45:00:00  3H 0M 0S  45H 0M 0S

> dput(t1)
structure(list(Category = c("A", "B", "C", "D"), Time1 = c("0:30:00", 
"1:00:00", "2:30:00", "3:00:00"), Time2 = c("24:00:00", "23:23:00", 
"23:00:59", "45:00:00"), hr1 = structure(c(0, 0, 0, 0), year = c(0, 
0, 0, 0), month = c(0, 0, 0, 0), day = c(0, 0, 0, 0), hour = c(0, 
1, 2, 3), minute = c(30, 0, 30, 0), class = structure("Period", package = "lubridate")), 
    hr2 = structure(c(0, 0, 59, 0), year = c(0, 0, 0, 0), month = c(0, 
    0, 0, 0), day = c(0, 0, 0, 0), hour = c(24, 23, 23, 45), minute = c(0, 
    23, 0, 0), class = structure("Period", package = "lubridate"))), .Names = c("Category", 
"Time1", "Time2", "hr1", "hr2"), row.names = c(NA, -4L), class = "data.frame")

R code

t1<-read.csv("time1.csv", header=TRUE, sep=",",stringsAsFactors=FALSE)
library(lubridate)
t1$hr1<-hms(t1$Time1)
t1$hr2<-hms(t1$Time2)

#This WORKS

> t1$hr1[4]+t1$hr2[3]
[1] "26H 0M 59S"

> (t1$hr1[4]+t1$hr2[3])/2
[1] "13H 0M 29.5S"

But this doesn't

> sum(t1$hr1[4]+t1$hr2[3])
[1] 59

> mean(t1$hr1[4]+t1$hr2[3])
[1] 59
Shoaibkhanz
    The actual question is "why on earth does `+` work?". `sum` returns `59` because it is a primitive function that couldn't care less about the fancy classes given to it and just converts everything unfamiliar to a numeric vector. Try `as.numeric(t1$hr1[4])` and `as.numeric(t1$hr2[3])`. `mean`, on the other hand, is a generic function which happens not to have a `mean.lubridate` method, so it also coerces its input to a numeric vector by default. The correct way to use it is `mean(c(t1$hr1[4],t1$hr2[3]))` though. The reason `+` works is probably that it preserves attributes (such as `class`) – David Arenburg Mar 29 '15 at 21:16
    Perhaps a workaround would be to use as.duration, which converts it to seconds: t1<-as.duration(Time1); t2<-as.duration(Time2); t3<-as.duration(sum(t1,t2)) – Ruthger Righart Mar 29 '15 at 21:35
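
The `as.duration` idea from the comment above could be sketched roughly as follows. This is only an illustration under assumptions: the names `p1`, `p2`, `secs`, `total` and `avg` are made up here, and the periods are converted to plain seconds before calling `sum` and `mean`, so the result does not depend on how those functions treat lubridate classes.

```r
library(lubridate)

# Two of the periods from the question (hr1[4] and hr2[3]).
p1 <- hms("3:00:00")
p2 <- hms("23:00:59")

# A Duration is an exact number of seconds, so take the numeric value
# of each one and let sum()/mean() work on ordinary numbers.
secs <- c(as.numeric(as.duration(p1)), as.numeric(as.duration(p2)))

# Wrap the plain-number results back into Duration objects for printing.
total <- as.duration(sum(secs))
avg   <- as.duration(mean(secs))

total
avg
```

Durations print as a number of seconds (with an approximate larger unit), so this form suits further arithmetic better than display as hours and minutes.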

1 Answer

You could use period_to_seconds to convert to seconds, do sum or mean, and then seconds_to_period to convert back to period.

library(lubridate)
t1$hr1 = hms(t1$Time1)
t1$hr2 = hms(t1$Time2)

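# Sum any number of periods: combine them, convert to seconds, add up,
# then convert the total back to a period.
fsum = function(...){
    seconds_to_period(sum(period_to_seconds(c(...))))
}
# Same round trip through seconds, but averaging instead of summing.
fmean = function(...){
    seconds_to_period(mean(period_to_seconds(c(...))))
}
# Fold the day component back into hours, e.g. "1d 2H" becomes "26H".
tohms = function(...){
    ans = c(...)
    ans@hour = ans@day*24 + ans@hour
    ans@day = 0
    ans
}

t2 = fsum(t1$hr1[4], t1$hr2[3])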
t2
#> [1] "1d 2H 0M 59S"

tohms(t2)
#> [1] "26H 0M 59S"

If you anticipate the sum or mean to exceed a year (roughly 365.25 days), you might have to make further adjustments.
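
For whole columns, one way to sidestep the day and year rollover entirely is to stay in seconds and format the result by hand. The following is a minimal sketch, not part of the original answer: `fmt_hms` is a hypothetical helper (it is not a lubridate function), and it simply truncates fractional seconds.

```r
library(lubridate)

# Same sample data as in the DATA section of this answer.
t1 <- data.frame(
  Category = c("A", "B", "C", "D"),
  Time1 = c("0:30:00", "1:00:00", "2:30:00", "3:00:00"),
  Time2 = c("24:00:00", "23:23:00", "23:00:59", "45:00:00"),
  stringsAsFactors = FALSE
)

# Work in plain seconds so that sum() and mean() see ordinary numbers.
secs1 <- period_to_seconds(hms(t1$Time1))
secs2 <- period_to_seconds(hms(t1$Time2))

# Hypothetical helper: format seconds as H:MM:SS, letting the hour
# count grow without bound instead of rolling over into days or years.
# Fractional seconds are truncated, not rounded.
fmt_hms <- function(secs) {
  h <- secs %/% 3600
  m <- (secs %% 3600) %/% 60
  s <- secs %% 60
  sprintf("%d:%02d:%02d", as.integer(h), as.integer(m), as.integer(s))
}

total1 <- sum(secs1)
total2 <- sum(secs2)
mean1  <- mean(secs1)

fmt_hms(total1)
#> [1] "7:00:00"
fmt_hms(total2)
#> [1] "115:23:59"
fmt_hms(mean1)
#> [1] "1:45:00"
```

The result is a character string, so it is meant for display only; keep the numeric seconds around if more arithmetic is needed.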


DATA

t1 = structure(list(Category = c("A", "B", "C", "D"),
                    Time1 = c("0:30:00", "1:00:00", "2:30:00", "3:00:00"),
                    Time2 = c("24:00:00", "23:23:00", "23:00:59", "45:00:00")),
               class = "data.frame", row.names = c("1", "2", "3", "4"))
d.b