1

I am reading Excel Workbooks (.xlsx format) with read.xlsx() from the xlsx package. My Workbooks have dates stored as functions (e.g., =DATE(1900, 1, 1)) rather than strings or fractional days from 1899-12-31 (e.g., 1/1/1900 or 1).

This is not a problem for read.xlsx if the Workbook has ever been opened/evaluated. But many of these Workbooks have never been opened/evaluated (they're generated by my data provider, not by Excel). Because the Workbook has not been opened/evaluated it contains -1 rather than the actual fractional number of days since 1899-12-31 (e.g., 37036 for May 25, 2001). I learned this here. So if I read a date in one of these (never opened) Workbooks, I get a column of -1, which read.xlsx interprets as 1899-12-30 (i.e., one day before 1899-12-31).

Is there a fix here? Can I touch a Workbook to force evaluation? Is there a bit that I can flip in read.xlsx to force evaluation first? I see only a colClasses argument, which doesn't seem to be the problem? Can I avoid using an Excel VBA script to write these Workbooks to text? Sorry I can't provide any reproducible code, but if you open an Excel Workbook to enter =Date(), the problem won't exist.

Cœur
  • 37,241
  • 25
  • 195
  • 267
Richard Herron
  • 9,760
  • 12
  • 69
  • 116
  • In case you cannot find a way to do this using `read.xlsx` the brute force way might be to open and save the files one by one programmatically using the `rcom` or `RDCOMClient` package. – Mark Heckmann Jan 09 '14 at 12:04

1 Answers1

1

Using XLConnect package , You can use :

getForceFormulaRecalculation(object,sheet)

to compute for all sheets :

getForceFormulaRecalculation(wb, sheet = "*")
agstudy
  • 119,832
  • 17
  • 199
  • 261
  • Thanks for the pointer to the `XLConnect` package. `readWorksheet` got the dates correct (i.e., forced evaluation first) and was faster for my use case. – Richard Herron Jan 09 '14 at 15:46