
The highfrequency package is designed to convert .txt and .csv files from NYSE TAQ and WRDS TAQ, respectively, into .RData files of xts objects, which can then be easily manipulated with the package.

The problem is that I have limited access to the WRDS database: I can only download tick data from the CRSP (Center for Research in Security Prices) database, not from the TAQ (Trades and Quotes) database. So my data look like this. The downloadable file contains tick data for the REIT index from 2014-01-01 to 2014-01-05. I manually renamed the ticker column header to PRICE, as proposed by Kris Boudt, one of the package's main authors.

The code that I use is the following:

 from="2014-03-01"
 to="2014-04-31"
 datasource="C:/Users/aris/Desktop/raw_data"
 datadestination="C:/Users/aris/Desktop/xts_data"
 convert(from = from,to=to,datasource = datasource,datadestination = datadestination,
 trades=TRUE,quotes=FALSE,ticker="REIT",dir=FALSE,extension="csv",header = TRUE,
 tradecolnames = NULL, quotecolnames = NULL,format = "%Y%m%d %H:%M:%S",onefile=TRUE)

I suspect that the problem lies in the argument format = "%Y%m%d %H:%M:%S", because in the .csv file the date and the time are comma-separated. I tried putting a comma between %d and %H, as in format = "%Y%m%d,%H:%M:%S", but that did not help.

The error reads

 Error in `$<-.data.frame`(`*tmp*`, "COND", value = numeric(0)) :   
 replacement has 0 rows, data has 1048575

All suggestions are welcome.

Joshua Ulrich
Greconomist

2 Answers


Thanks to Joshua Ulrich, I was able to gain some additional intuition and solve the problem(s). There is actually no need to manipulate the .csv file itself and add extra columns. Instead of setting tradecolnames = NULL, you tell the function which columns your file contains by setting tradecolnames = c("DATE","TIME","PRICE"). The problem with the non-existent directories is fixed by setting dir=TRUE. The final code looks like this:

 from="2014-03-01"
 to="2014-04-30"
 datasource="C:/Users/aris/Desktop/raw_data"
 datadestination="C:/Users/aris/Desktop/xts_data"
 convert(from,to,datasource,datadestination,trades=TRUE,quotes=FALSE,ticker="REIT",
 dir=TRUE,extension="csv",header=TRUE,tradecolnames=c("DATE","TIME","PRICE"),
 format = "%Y%m%d %H:%M:%S",onefile=TRUE)

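On the format argument raised in the question: the sketch below uses base R only and assumes (I have not verified this against the package source) that the DATE and TIME columns are pasted together with a space before being parsed. Under that assumption the space-separated format string is the one that matches, even though the two fields are comma-separated in the .csv file. The two sample rows are taken from the data shown in this thread.

```r
# Two sample rows, with DATE and TIME already read as separate columns.
trades <- data.frame(
  DATE = c("20140102", "20140102"),
  TIME = c("9:30:00", "9:30:01"),
  stringsAsFactors = FALSE
)

# Assumption: the two columns are joined with a space before parsing.
stamp <- paste(trades$DATE, trades$TIME)

# The space-separated format parses every row ...
ts_space <- as.POSIXct(stamp, format = "%Y%m%d %H:%M:%S", tz = "UTC")

# ... while the comma variant matches nothing and yields NA.
ts_comma <- as.POSIXct(stamp, format = "%Y%m%d,%H:%M:%S", tz = "UTC")

print(ts_space)
print(ts_comma)
```

If the timestamps in your own output come back as NA, comparing the format string against paste(DATE, TIME) in this way is a quick check.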
Greconomist

The highfrequency::convert function calls highfrequency:::makeXtsTrades, which expects the following columns in your text file: DATE,TIME,PRICE,SIZE,SYMBOL,EX,COND,CORR,G127.
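The error in the question is what base R reports when a zero-length vector is assigned to a data frame column. The sketch below reproduces the same message without the package. It assumes (this is a guess at the mechanism, not taken from the package source) that the function converts a COND column that does not exist in the data, which gives numeric(0).

```r
# A data frame that, like the question's file, has no COND column.
trades <- data.frame(PRICE = c(1123.77, 1122.81, 1122.77))

# Referencing a missing column returns NULL; converting NULL gives numeric(0).
cond <- as.numeric(trades$COND)

# Assigning a zero-length vector to a column of a non-empty data frame fails.
msg <- tryCatch(
  {
    trades$COND <- cond
    "no error"
  },
  error = function(e) conditionMessage(e)
)

print(msg)
```

The row count in the message is simply the number of rows in the data frame, which is why the question's error reports 1048575.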

I added empty columns to your text file, and did not get the error in your question. The edited text file looks like:

DATE,TIME,PRICE,SIZE,SYMBOL,EX,COND,CORR,G127
20140102,9:30:00,1123.77,,,,,,
20140102,9:30:01,1122.81,,,,,,
20140102,9:30:02,1122.77,,,,,,
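If you would rather not edit the file by hand, the empty columns can be added with a few lines of base R. This is a minimal sketch: the temporary input file stands in for your raw .csv, and the variable names (infile, outfile, expected) are only illustrative.

```r
# Columns named above as expected by makeXtsTrades.
expected <- c("DATE", "TIME", "PRICE", "SIZE", "SYMBOL",
              "EX", "COND", "CORR", "G127")

# Stand-in for the raw file: only DATE, TIME and PRICE are present.
infile <- tempfile(fileext = ".csv")
writeLines(c("DATE,TIME,PRICE",
             "20140102,9:30:00,1123.77",
             "20140102,9:30:01,1122.81"), infile)

# Read everything as character so values are written back unchanged.
trades <- read.csv(infile, colClasses = "character")

# Add each missing column as an empty string, then put columns in order.
for (col in setdiff(expected, names(trades))) {
  trades[[col]] <- ""
}
trades <- trades[expected]

outfile <- tempfile(fileext = ".csv")
write.csv(trades, outfile, row.names = FALSE, quote = FALSE)

result <- readLines(outfile)
print(result)
```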

I got another error though.

Error in gzfile(file, "wb") : cannot open the connection
In addition: Warning message:
In gzfile(file, "wb") :
  cannot open compressed file '/home/josh/Desktop/z_xts/2014-01-02/REIT_trades.RData', probable reason 'No such file or directory'

So it looks like the convert function expects all the daily output directories to exist before you run it. The function runs and creates the output after I create those directories.
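Creating those directories can be scripted. The sketch below is base R only and assumes one folder per calendar day, named in the YYYY-MM-DD form seen in the path of the error message above. The destination under tempdir() and the two-day range are placeholders for your own datadestination, from and to.

```r
# Placeholder values; substitute your own range and destination.
from <- "2014-01-02"
to <- "2014-01-03"
datadestination <- file.path(tempdir(), "xts_data")

# One folder name per calendar day, formatted as YYYY-MM-DD.
days <- format(seq(as.Date(from), as.Date(to), by = "day"))
daily_dirs <- file.path(datadestination, days)

# recursive = TRUE also creates the destination folder itself if needed.
for (d in daily_dirs) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
}

print(daily_dirs)
```

This creates folders for weekends and holidays too, which is harmless if they stay empty.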

Joshua Ulrich
  • Thank you very much for your time. Excellent suggestions. – Greconomist Jul 14 '16 at 21:54
  • Hi Joshua, sorry, I am being cheeky here :P I've seen that you've raised issues on the `highfrequency` GitHub and you've got experience with the package as well. I've got a [very peculiar one](https://stackoverflow.com/questions/73190283/highfrequency-r-package-issue-with-timezone-in-stock-quotes-data-clean-up) with this package, and given you are based in the US it may well be very easy for you to diagnose. When you are bored, it would be much appreciated if you could take a look. Thanks! – stucash Aug 01 '22 at 07:40