I'm trying to read some CSV-format financial tick data (source: HISTDATA_COM_ASCII_EURUSD_T_201209.zip) into a zoo series. The data is indexed by a time column containing timestamps formatted like 20120902 170010767. This is almost %Y%m%d %H%M%OS3, except that the milliseconds are not separated by a decimal point as %OS3 requires.

I have attempted to force the required decimal point by dividing the latter (right) half of the timestamp by 1000 and pasting back together again:

FUN <- function(i, format)  {
    sapply(strsplit(i, " "), function(j) strptime(paste(j[1], as.numeric(j[2])/1000), format = format))
}
read.zoo(file, format = "%Y%m%d %H%M%OS3", FUN = FUN, sep = ",")

However, this has not worked - could someone please shed some light on how best to do this properly?

Many thanks

mchen

1 Answer

You could obviously make this shorter but this gives the idea well:

> tm <- "20120902 170010767"    
> gsub("(^........\\s......)(.+$)", "\\1.\\2", tm)
[1] "20120902 170010.767"
> strptime( gsub("(^........\\s......)(.+$)", "\\1.\\2", tm), "%Y%m%d %H%M%OS")
[1] "2012-09-02 17:00:10.767"
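
To use this on a whole index column, the substitution can be wrapped in a small function. The following is a minimal sketch and has not been run against the actual HistData file; the name `parse_tick_time` and the choice of UTC as the time zone are assumptions made for illustration.

```r
# Hypothetical helper: insert a decimal point after HHMMSS, then parse.
# tz = "UTC" is an assumption; use the zone the data was recorded in.
parse_tick_time <- function(x, tz = "UTC") {
  x <- as.character(x)
  # "\\1.\\2" puts a "." between the 6-digit time and the trailing digits
  fixed <- sub("^(\\d{8}\\s\\d{6})(\\d+)$", "\\1.\\2", x)
  as.POSIXct(strptime(fixed, format = "%Y%m%d %H%M%OS", tz = tz))
}

tm <- c("20120902 170010767", "20120902 170011267")
idx <- parse_tick_time(tm)

# The sub-second part survives the conversion: the gap between the
# two timestamps is roughly half a second.
as.numeric(diff(idx))
```

If that behaves as expected, passing the helper as `FUN` (for example `read.zoo(file, FUN = parse_tick_time, sep = ",")`) should let `read.zoo` build the index from the first column, though that call is untested here.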
IRTFM
  • Beat me to the punch! Here's one shorter expression: `gsub("(\\d{3}$)", ".\\1", tm)` – Josh O'Brien Sep 19 '12 at 00:07
  • I wasn't sure that he would be getting all of the cases with three characters in the 'millisecond' columns, or I might have used a similar "(...$)" (just as few characters and not as many pesky shift keys.) – IRTFM Sep 19 '12 at 00:10
  • Awesome, DWin - works great. But what's the risk with using `gsub("(\\d{3}$)", ".\\1", tm)`? – mchen Sep 19 '12 at 01:02
  • I was not sure that your trailing decimal digits would all be in the right place so I decided to use the space between the date and time as the "hinge point" rather than the end of each character element. – IRTFM Sep 19 '12 at 01:40
  • Thanks. Finally, why `%OS` rather than `%OS3`? – mchen Sep 19 '12 at 09:14
  • Because that's what worked on my OS. Each OS is different in how it handles the milliseconds spec and mine wasn't working properly with "%OS3", yours on the other hand might work well with it. Notice that the `strptime` help page documents using "OSn" for output and only mentions using "OS" with `strptime`. – IRTFM Sep 19 '12 at 14:49
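
The difference between the two patterns discussed in these comments shows up when the fractional part is not exactly three digits. A small sketch, using a made-up timestamp with only two trailing digits (a hypothetical input, not taken from the data file):

```r
# Hypothetical malformed input: two trailing digits instead of three
tm_short <- "20120902 17001076"

# Anchored on the space between date and time (the "hinge point")
hinge <- gsub("(^........\\s......)(.+$)", "\\1.\\2", tm_short)

# Anchored on the last three digits of the string
tail3 <- gsub("(\\d{3}$)", ".\\1", tm_short)

hinge  # "20120902 170010.76"
tail3  # "20120902 17001.076"
```

In the second result the decimal point lands inside the seconds field, so the string no longer represents the intended time. If every timestamp is known to carry exactly three millisecond digits, the two patterns give the same result.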