0

I have this file that contains three .csv files, EURAUD_201501, EURAUD_201502, EURAUD_201503. The files contain forex tick-data for from January to March 2015. The first step of my exercise is to manipulate the data-set into a functional form. The (working) code is:

#Entering and editing data by hand#

data1<-read.csv("EURAUD_201501.csv", header = FALSE, col.names = c("TIMESTAMP", "BID", "OFR", "VOL"), stringsAsFactors = FALSE)
data1$VOL<- NULL  #drops the VOL column
data1$TIMESTAMP = sub( '(?<=.{11})', ':', data1$TIMESTAMP, perl=TRUE ) #manipulate the strings to create clear timestamps
data1$TIMESTAMP = sub( '(?<=.{14})', ':', data1$TIMESTAMP, perl=TRUE )
data1$TIMESTAMP = sub( '(?<=.{17})', '.', data1$TIMESTAMP, perl=TRUE )
xts_data1 = xts(data1[,c(2,3)], order.by = as.POSIXct(data1$TIMESTAMP, tz = "EST", format = "%Y%m%d %H:%M:%OS")) #Convert file to an xts object
rm(data1) #remove data1 object in order to save space

data2<-read.csv("EURAUD_201502.csv", header = FALSE, col.names = c("TIMESTAMP", "BID", "OFR", "VOL"), stringsAsFactors = FALSE)
data2$VOL<- NULL  #drops the VOL column
data2$TIMESTAMP = sub( '(?<=.{11})', ':', data2$TIMESTAMP, perl=TRUE ) #manipulate the strings to create clear timestamps
data2$TIMESTAMP = sub( '(?<=.{14})', ':', data2$TIMESTAMP, perl=TRUE )
data2$TIMESTAMP = sub( '(?<=.{17})', '.', data2$TIMESTAMP, perl=TRUE )
xts_data2 = xts(data2[,c(2,3)], order.by = as.POSIXct(data2$TIMESTAMP, tz = "EST", format = "%Y%m%d %H:%M:%OS")) #Convert file to an xts object
rm(data2) #remove data2 object in order to save space

data3<-read.csv("EURAUD_201503.csv", header = FALSE, col.names = c("TIMESTAMP", "BID", "OFR", "VOL"), stringsAsFactors = FALSE)
data3$VOL<- NULL  #drops the VOL column
data3$TIMESTAMP = sub( '(?<=.{11})', ':', data3$TIMESTAMP, perl=TRUE ) #manipulate the strings to create clear timestamps
data3$TIMESTAMP = sub( '(?<=.{14})', ':', data3$TIMESTAMP, perl=TRUE )
data3$TIMESTAMP = sub( '(?<=.{17})', '.', data3$TIMESTAMP, perl=TRUE )
xts_data3 = xts(data3[,c(2,3)], order.by = as.POSIXct(data3$TIMESTAMP, tz = "EST", format = "%Y%m%d %H:%M:%OS")) #Convert file to an xts object
rm(data3) #remove data3 object in order to save space

#Create 5-minute intervals
final_xts  = rbind.xts(xts_data1, xts_data2, xts_data3)
rm(data1_xts, data2_xts, data3_xts)
final_fivemin = aggregatets(final_xts, FUN = "previoustick", on = "minutes", k = 5)

How can I create a functional loop without having to repeat the same procedure for each data-set?

Greconomist
  • 386
  • 2
  • 15

1 Answers1

2

Seems like you might want to try lapply. You could replace the for loop with

xts_data <- lapply(real_data, function(x){
    data <- read.csv(x, header = FALSE, col.names = c("TIMESTAMP", "BID", "OFR", "VOL"), 
        stringsAsFactors = FALSE)
    data$VOL<- NULL  #drops the VOL column
    data$TIMESTAMP = sub( '(?<=.{11})', ':', data[i]$TIMESTAMP, perl=TRUE ) #manipulate the strings to create clear timestamps
    data$TIMESTAMP = sub( '(?<=.{14})', ':', data[i]$TIMESTAMP, perl=TRUE )
    data$TIMESTAMP = sub( '(?<=.{17})', '.', data[i]$TIMESTAMP, perl=TRUE )
    return(xts(data[,c(2,3)], 
        order.by = as.POSIXct(data$TIMESTAMP, tz = "EST", format = "%Y%m%d %H:%M:%OS"))) 
        #Convert file to an xts object
})

And then finish up:

#Create 5-minute intervals
final_xts  = do.call(rbind, xts_data)
final_fivemin = aggregatets(final_xts, FUN = "previoustick", on = "minutes", k = 5)
Weihuang Wong
  • 12,868
  • 2
  • 27
  • 48
  • Thank you for your answer. I think you are on the right direction but we need a little bit more effort to achieve the desirable outcome. If your run my first functional piece of code, it creates matrices [ data_frames is the appropriate expression I reckon]. I run your code but It does not return what I want unfortunately. I'll edit my initial question to make it more clear. – Greconomist Jul 28 '16 at 23:43
  • I don't have your data, so I can't run your code. But `lapply` returns a list; to get a dataframe, you can just do `final_xts <- do.call(rbind, xts_data)` after the above code chunk. – Weihuang Wong Jul 29 '16 at 12:57
  • If you click the 'I have **this file**' at the top of my post you can download the data. – Greconomist Jul 29 '16 at 12:59