Why is there no apply.hourly in R with xts/zoo?

Question

I want to aggregate data by hourly mean. Daily is very easy:

apply.daily(X2,mean)

Why is there no function for hourly? I tried

hr.means <- aggregate(X2, format(X2["timestamp"],"%Y-%m-%d %H"))

and got always error with trim argument. Is there an easier function similar to apply.daily? What if I want to aggregate the mean of 5 minutes. Data are values per minute:

"timestamp", value 
"2012-04-09 05:03:00",2
"2012-04-09 05:04:00",4
"2012-04-09 05:05:00",5
"2012-04-09 05:06:00",0
"2012-04-09 05:07:00",0
"2012-04-09 05:08:00",3
"2012-04-09 05:09:00",0
"2012-04-09 05:10:00",1

I am using xts and zoo.

In my eyes it might be useful to add e.g. apply.hourly and apply.minutely to packages! — Herr Student, Apr 15 '13 at 16:06

score 17 · Accepted Answer · answered Apr 15 '13 at 16:02

17

try

period.apply(X2, endpoints(X2, "hours"), mean)

apply.daily is simply a wrapper for the above:

> apply.daily
function (x, FUN, ...)
{
    ep <- endpoints(x, "days")
    period.apply(x, ep, FUN, ...)
}

answered Apr 15 '13 at 16:02

eddi

49,088
6
104
155

Thank you so far, any suggestions for the periodicity of 5 minutes? – Herr Student Apr 15 '13 at 16:09
1

I get: 05:04:00 - 4; 05:09:00 - 5,... but maybe it is possible to set the first value to 05:00:00 and go on with 05:05:00 might be easier, if I am merging files later to have the same start and timestep. Thanks – Herr Student Apr 15 '13 at 17:14
I see. You can probably fiddle around with the output of `endpoints`, or just look at what `endpoints` prints out (it's just a list of indices) and then construct a similar list with whatever points you like instead. – eddi Apr 15 '13 at 18:06

score 2 · Answer 2 · edited Jul 07 '15 at 14:53

2

hr.means <- aggregate(X2, format(time(X2),"%y-%m-%d %H"), mean)

This should work fine.

edited Jul 07 '15 at 14:53

rink.attendant.6

44,500
61
101
156

answered Jul 07 '15 at 14:35

athlonshi

1,711
1
19
23

score 1 · Answer 3 · edited Apr 13 '17 at 12:44

Answering part 2:

What if I want to aggregate the mean of 5 minutes?

As @eddit already mentioned in a comment above:

df <- read.table(header=TRUE, sep=",", stringsAsFactors=FALSE, text="
timestamp, value 
2012-04-09 05:03:00,2
2012-04-09 05:04:00,4
2012-04-09 05:05:00,5
2012-04-09 05:06:00,0
2012-04-09 05:07:00,0
2012-04-09 05:08:00,3
2012-04-09 05:09:00,0
2012-04-09 05:10:00,1")
X2 <- xts(df$value, as.POSIXct(df$timestamp))

X2.5min <- period.apply(X2, endpoints(X2, "minutes", 5), mean)

I get: 05:04:00 - 4; 05:09:00 - 5,... but maybe it is possible to set the first value to 05:00:00 and go on with 05:05:00 might be easier, if I am merging files later to have the same start and timestep.

Indeed:

> X2.5min
                    [,1]
2012-04-09 05:04:00  3.0
2012-04-09 05:09:00  1.6
2012-04-09 05:10:00  1.0

Darren Cook over at Cross Validated faced the same issue and wrote function align.time.down:

align.time.down=function(x,n){index(x)=index(x)-n;align.time(x,n)}

That can be used to adjust the times down:

X2.5mindown <- align.time.down(X2.5min, 5 * 60)
X2.5mindown
                    [,1]
2012-04-09 05:00:00  3.0
2012-04-09 05:05:00  1.6
2012-04-09 05:10:00  1.0

As I commented to Darren, it seems like a bad idea to align observations with a time *prior* to their occurrence, especially if you're going to merge the output of the aggregation with another time series. — Joshua Ulrich, Aug 21 '15 at 23:39

Why is there no apply.hourly in R with xts/zoo?

3 Answers3

Linked