0

We have a csv file with Dates in Excel format and Nav for Manager A and Manager B as follows:

Date,Manager A,Date,Manager B
41346.6666666667,100,40932.6666666667,100
41347.6666666667,100,40942.6666666667,99.9999936329992
41348.6666666667,100,40945.6666666667,99.9999936397787
41351.6666666667,100,40946.6666666667,99.9999936714362
41352.6666666667,100,40947.6666666667,100.051441180137
41353.6666666667,100,40948.6666666667,100.04877283951
41354.6666666667,100.000077579585,40949.6666666667,100.068400298752
41355.6666666667,100.00007861475,40952.6666666667,100.070263374822
41358.6666666667,100.000047950872,40953.6666666667,99.9661095940006
41359.6666666667,99.9945012295984,40954.6666666667,99.8578245935173
41360.6666666667,99.9944609274138,40955.6666666667,99.7798031949116
41361.6666666667,99.9944817907402,40956.6666666667,100.029523604978
41366.6666666667,100,40960.6666666667,100.14859511024
41367.6666666667,99.4729804387476,40961.6666666667,99.7956029017769
41368.6666666667,99.4729804387476,40962.6666666667,99.7023420799123
41369.6666666667,99.185046151864,40963.6666666667,99.6124531927299
41372.6666666667,99.1766469096966,40966.6666666667,99.5689030038018
41373.6666666667,98.920738006398,40967.6666666667,99.5701493637685
,,40968.6666666667,99.4543885041996
,,40969.6666666667,99.3424528379521

We want to create a zoo object with the following structure [Dates, Manager A Nav, Manager B Nav].

After reading the csv file with:

data = read.csv("...", header=TRUE, sep=",")

we set an index for splitting the object and use lapply to split

INDEX <- seq(1, by = 2, length = ncol(data) / 2)

data.zoo <- lapply(INDEX, function(i, data) data[i:(i+1)], data = zoo(data))

I'm stuck with the fact that Dates are in Excel format and don't know how to fix that stuff. Is the problem set in a correct way?

Lorenzo Rigamonti
  • 1,705
  • 8
  • 25
  • 36
  • I don't have the answer, but I have two thoughts. It's possible you've already thought of these ideas. When I save a date from Excel into a csv, the csv contains a string representing the date, rather a number. The format of the string depends on how I format the cells to display the dates in Excel. Maybe you can get csv output that is more readily readable as dates by R. My second thought is that Excel's numeric format has to follow a pretty simple set of rules, which are probably publicly available, and you could write a custom function to convert to a respresentation R understands. – Mars Apr 10 '13 at 15:45
  • The problem is that the csv file is generated internally, it comes from other sources. – Lorenzo Rigamonti Apr 10 '13 at 15:47
  • I thought that might be the case. – Mars Apr 10 '13 at 15:58

1 Answers1

1

If all you want to do is to convert the dates to proper dates you can do this easily enough. The thing you need to know is the origin date. Your numbers represent the integer and fractional number of days that have passed since the origin date. Usually this is Jan 0 1990!!! Go figure, but be careful as I don't think this is always the case. You can try this...

# Excel origin is day 0 on Jan 0 1900, but treats 1900 as leap year so...
data$Date <- as.Date( data$Date , origin = "1899/12/30")
data$Date.1 <- as.Date( data$Date.1 , origin = "1899/12/30")
# For more info see ?as.Date

If you are interested in keeping the times as well, you can use as.POSIXct, but you must also specify the timezone (UTC by default);

data$Date <- as.POSIXct(data$Date, origin = "1899/12/30" )
head(data)
#                    Date Manager.A     Date.1 Manager.B
#   1 2013-03-13 16:00:00       100 2012-01-24 100.00000
#   2 2013-03-14 16:00:00       100 2012-02-03  99.99999
#   3 2013-03-15 16:00:00       100 2012-02-06  99.99999
#   4 2013-03-18 16:00:00       100 2012-02-07  99.99999
#   5 2013-03-19 16:00:00       100 2012-02-08 100.05144
#   6 2013-03-20 16:00:00       100 2012-02-09 100.04877
Simon O'Hanlon
  • 58,647
  • 14
  • 142
  • 184