
I want to apply a function to 20 trading days' worth of hourly FX data (as one example among many).

I started off with rollapply(data,width=20*24,FUN=FUN,by=24). That seemed to work well, and I could even assert that I always got 480 bars passed in, until I realized that wasn't what I wanted. The start and end times of those 480 bars drifted over the years, due to daylight saving time changes and market holidays.

So, what I want is a function that treats a day as running from 22:00 to 22:00 for each day we have data for. (That is 21:00 to 21:00 during New York summer time: my data timezone is UTC, and the day start is defined as 5pm ET.)

So, I made my own rollapply function with this at its core:

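    ep <- endpoints(data, on=on, k=k)       # ep[1] is 0; later entries are the last row of each period
    sp <- ep[1:(length(ep)-width)] + 1      # first row of each window
    ep <- ep[(width+1):length(ep)]          # last row of each window
    xx <- lapply(seq_along(ep), function(ix) FUN(.subset_xts(data, sp[ix]:ep[ix]), ...))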

I then called this with on="days", k=1 and width=20.
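To make the index arithmetic concrete, here is a minimal sketch in base R. The `ep` vector is hypothetical, hand-written to have the same shape as the output of endpoints() (a leading 0, then the row number of each period's last bar), and the names `window_start` and `window_end` are only for illustration.

```r
# Hypothetical endpoints vector: five "days" of 24 hourly bars each.
ep <- c(0, 24, 48, 72, 96, 120)
width <- 3   # window length, counted in periods rather than bars

n <- length(ep)
window_start <- ep[1:(n - width)] + 1   # first row of each window
window_end   <- ep[(width + 1):n]       # last row of each window

# One row per window; every window here spans 3 * 24 = 72 bars.
windows <- data.frame(start = window_start, end = window_end)
print(windows)
```

Because the windows are defined by period boundaries rather than by a fixed bar count, a short day simply yields a window with fewer rows.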

This has two problems:

  1. on="days" counts calendar days, not trading days! So, instead of typically 4 weeks of data, I get just under 3 weeks of data.
  2. The cutoff is midnight UTC. I cannot work out how to change it to use the 22:00 (or 21:00) cutoff.

UPDATE: Problem 1 above is wrong! The xts endpoints function does work in trading days, not calendar days, because it only sees days that actually contain data. The reason I thought otherwise is that the timezone issue made it look like a 6-day trading week: Sun to Fri. Once the timezone problem was fixed (see my self-answer), using width=20 and on="days" does indeed give me 4 weeks of data.

(The word "typically" in problem 1 is important: when there is a trading holiday during those 4 weeks I expect to receive 4 weeks and 1 day's worth of data, i.e. always exactly 20 trading days.)
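The 6-day week can be reproduced without xts. The sketch below is base R only and assumes one winter trading week of hourly bars running from Sunday 22:00 UTC to Friday 21:00 UTC; the variable names are made up for the illustration. It counts distinct dates with a midnight-UTC cutoff and again after the 7-hour shift described in the self-answer.

```r
# One winter trading week of hourly bars, Sunday 22:00 UTC to Friday 21:00 UTC.
bars <- seq(as.POSIXct("2012-01-08 22:00:00", tz = "UTC"),
            as.POSIXct("2012-01-13 21:00:00", tz = "UTC"),
            by = "hour")

# Distinct dates when the day boundary is midnight UTC (Sunday counts as a day).
utc_days <- unique(format(bars, "%Y-%m-%d", tz = "UTC"))

# Distinct dates after adding 7 hours and reading the clock in New York.
ny_days <- unique(format(bars + 7 * 3600, "%Y-%m-%d", tz = "America/New_York"))

length(utc_days)   # number of "days" with a midnight-UTC cutoff
length(ny_days)    # number of "days" with the 5pm ET cutoff
```

The two Sunday evening bars are what create the extra day in the first count.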

I started working on a function to cut the data into weeks, thinking I could then cut them into five 24hr chunks, but this feels like the wrong approach, and surely someone has invented this wheel before me?

Darren Cook
  • Not sure how much this helps, but you can subset by time-of-day before `split`ting on days. Let's say end of day is `EOD <- "22:00:00"`. A (not terribly efficient) way to get the endpoints of the days is `dep <- index(do.call(rbind, lapply(split(dat[paste0("T00:00:00/T", EOD)], 'days'), 'last')))` – GSee Aug 23 '12 at 14:25

1 Answer

Here is how to get the daybreak right:

    x2 <- x
    index(x2) <- index(x2) + (7*3600)   # shift so that 5pm ET becomes midnight
    tzone(x2) <- 'America/New_York'     # indexTZ(x2) <- ... in older xts releases

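I.e. just setting the timezone to New York puts the daybreak at 17:00 local time all year round, which takes care of daylight saving time. But on="days" splits at midnight, so we want the daybreak to be at 24:00: add 7 hours first.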
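A quick sanity check of that reasoning, as a sketch in base R (no xts needed). The two timestamps are illustrative: 5pm New York time written in UTC, once in winter (22:00 UTC) and once in summer (21:00 UTC).

```r
# 5pm New York time expressed in UTC: winter (EST, UTC-5) and summer (EDT, UTC-4).
day_start_utc <- as.POSIXct(c("2012-01-10 22:00:00", "2012-07-10 21:00:00"),
                            tz = "UTC")

# Add 7 hours first, then read the clock in New York.
shifted <- day_start_utc + 7 * 3600
ny_clock <- format(shifted, "%Y-%m-%d %H:%M", tz = "America/New_York")
print(ny_clock)
```

Both shifted timestamps should read 00:00 on the following calendar date, which is why a single fixed offset of 7 hours is enough even though the UTC cutoff moves between 22:00 and 21:00.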

With help from: time zones in POSIXct and xts, converting from GMT in R

Here is the full function:

rollapply_chunks.FX.xts <- function(data, width, FUN, ..., on="days", k=1){
    data <- try.xts(data)

    # Work on a copy whose index is shifted so that the 5pm ET daybreak lands on midnight.
    x2 <- data
    index(x2) <- index(x2) + (7*3600)
    tzone(x2) <- 'America/New_York'   # indexTZ(x2) <- ... in older xts releases

    ep <- endpoints(x2, on=on, k=k)   #The end point of each day (when on="days").
        #Each entry points to the final bar of the day. ep[1]==0.

    if(length(ep) < 2){
        stop("Cannot divide data up")
    }else if(length(ep) == 2){  #Can only fit one chunk in.
        sp <- 1; ep <- ep[-1]
    }else if(length(ep) <= width){
        #Without this check the index arithmetic below fails with a cryptic subscript error.
        stop("Fewer than 'width' periods of data")
    }else{
        sp <- ep[1:(length(ep)-width)] + 1
        ep <- ep[(width+1):length(ep)]
    }

    #sp and ep were computed from x2, but rows are taken from the original data.
    xx <- lapply(seq_along(ep), function(ix) FUN(.subset_xts(data, sp[ix]:ep[ix]), ...))
    xx <- do.call(rbind, xx)   #Join them up as one big matrix/data.frame.

    tt <- index(data)[ep]  #Implicit align="right". Use sp for align="left"
    res <- xts(xx, tt)
    return(res)
}

You can see we use the modified index to split up the original data. (If R uses copy-on-write under the covers, then the only extra memory requirement should be for a copy of the index, not of the data.)
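The idea of labelling rows with a shifted copy of the index while leaving the data itself untouched can be sketched in base R. Everything here is illustrative: the two synthetic weeks of hourly timestamps (one winter, one summer) and the names `utc_index`, `day_label` and `day_bounds` are assumptions, not part of the function above.

```r
# Hourly UTC timestamps for one winter and one summer trading week (120 bars each).
winter <- seq(as.POSIXct("2012-01-08 22:00:00", tz = "UTC"), by = "hour", length.out = 120)
summer <- seq(as.POSIXct("2012-07-08 21:00:00", tz = "UTC"), by = "hour", length.out = 120)
utc_index <- c(winter, summer)

# Day labels come from the shifted copy only; utc_index itself is never modified.
day_label <- format(utc_index + 7 * 3600, "%Y-%m-%d", tz = "America/New_York")

# First and last bar of each labelled day, looked up in the original UTC index.
first_bar <- utc_index[!duplicated(day_label)]
last_bar  <- utc_index[!duplicated(day_label, fromLast = TRUE)]

day_bounds <- data.frame(
    day       = unique(day_label),
    first_utc = format(first_bar, "%Y-%m-%d %H:%M", tz = "UTC"),
    last_utc  = format(last_bar,  "%Y-%m-%d %H:%M", tz = "UTC")
)
print(day_bounds)
```

In this sketch each winter day starts at 22:00 UTC and each summer day at 21:00 UTC, while the reported timestamps remain the original, unshifted ones.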

(Legal bit: please consider it licensed under MIT, but explicit permission given to use in the GPL-2 XTS package if that is desired.)

Darren Cook