I have a data.table with characters in two columns like so:
01/01/2014 | 00:30
02/01/2014 | 01:00
03/01/2014 | 01:30 etc
The length of this data set varies but is easily over 300,000 rows each time the script is run. Eventually I know this script will need to deal with a data set of 30,000,000 rows plus.
I currently paste them together in the following form:
DT[, DateTime := paste(Date, Time)]
Which leads to:
01/01/2014 00:30
02/01/2014 01:00
03/01/2014 01:30 etc
I then use as.POSIXct to convert that into a POSIX date:
DT[, DateTime:= as.POSIXct(x = DateTime, format = "%d/%m/%Y %H:%M")]
This works fine and converts the characters correctly, largely, I believe, because I set the format argument to match the structure of the character string it is fed.
However, I'd like to use the fasttime package, but there is an inherent problem: its fastPOSIXct function does not accept a format argument. Therefore, when I run:
DT[, DateTime := fastPOSIXct(x = DateTime)]
fasttime has to interpret my data according to its documented rule that the "order of interpretation is fixed: year, month, day, hour, minute, second." The output comes out like:
2006/07/07 00:30
2007/07/07 01:00
2008/07/07 01:30 etc
Therefore, it seems I must either use as.POSIXct or find a way to manipulate the string into the right order.
What would be the most efficient way to allow me to use fasttime? How should I reorder the character string to match? Would you expect it to be worth reordering the character strings in order to use fasttime, or would the added work of correcting the strings make the fasttime savings negligible?