
I am trying to import date fields that are stored as character strings into a POSIXlt class while reading the data itself. Rather than importing the dates as character and converting them to POSIXlt afterwards, I want to do this in one step, since my data is quite large (4+ GB). The code is below.

In this code, I define a new 'myDate' class and pass it in the colClasses argument. This method works with read.delim, but not with fread from the data.table package.

It would be great if this could be made to work with fread, because I would save at least 20 minutes. Can you make it work with fread, or suggest a better alternative?

library(data.table)
library(lubridate)
setClass("myDate")

# Create a custom 'myDate' class. Having "Date" or "POSIXlt" in the colClasses argument of read.table does not work
setAs("character", "myDate", function(from) as.POSIXlt(fast_strptime(from, "%Y-%m-%d")))

ccs <- c("logical", "myDate", "myDate", "character")

# fread() does NOT work
rawdata <- fread("filepath/filename.txt", colClasses = ccs, sep = "|", header = FALSE)

# WORKS!
rawdata <- read.delim("filepath/filename.txt", colClasses = ccs, sep = "|", header = FALSE)
Selva
  • word of caution, regardless of how you read the data, `data.table` will not play with `POSIXlt` – eddi Oct 20 '14 at 16:19
  • Can you explain more, when you say `data.table` will not play with `POSIXlt` ? – Selva Oct 22 '14 at 11:09
  • `POSIXlt` is a needlessly large data type (it uses 40 bytes iirc) and `data.table` will not store it as a simple column. You should use `POSIXct` instead if you want to store it in a `data.table`. – eddi Oct 22 '14 at 15:17
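
Following the comment above, one possible alternative is to let fread read the date columns as plain character and then convert them by reference to POSIXct with `:=`, which avoids copying the table. This is only a minimal sketch, not a confirmed fix for the custom-class approach: the temporary sample file is hypothetical and stands in for the real input, the column names `V2` and `V3` are the defaults fread assigns when `header = FALSE`, and the UTC time zone is an assumption.

```r
library(data.table)
library(lubridate)

# Hypothetical sample file standing in for the real pipe-delimited input
tmp <- tempfile(fileext = ".txt")
writeLines(c("TRUE|2014-10-20|2014-10-22|a",
             "FALSE|2014-01-05|2014-02-06|b"), tmp)

# Read the two date columns as character instead of a custom class
rawdata <- fread(tmp, sep = "|", header = FALSE,
                 colClasses = c("logical", "character", "character", "character"))

# Convert the date columns in place to POSIXct (time zone assumed to be UTC)
date_cols <- c("V2", "V3")
rawdata[, (date_cols) := lapply(.SD, function(x)
  as.POSIXct(fast_strptime(x, "%Y-%m-%d", tz = "UTC"))),
  .SDcols = date_cols]

str(rawdata)
```

The conversion is still a second step, but it is vectorised and happens by reference, so the read itself stays on fread's fast path.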

0 Answers