I am currently attempting to use the read_table() function from the readr package on a few large data files. I only want the second column, so I skip all the other columns with this argument in the function:
col_types = c(paste("_", "c", paste(rep("_", 20000), sep = "", collapse = ""), sep = "", collapse = ""))
EDIT: The first and third pairs of quotes in the code above should each contain a single underscore ("_").
However, read_table seems to insist on reading in the entire data file (using up excessive memory and causing a crash) instead of just reading in column 2.
With read.table(), I have tried a similar argument: colClasses = c("NULL", "character", rep("NULL", 20000)), which works perfectly without taking up excess memory, but I would like to use read_table since it is supposedly faster. Any ideas on why read_table is taking up so much memory even though I am including an argument to only keep one column?
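For reference, here is a minimal sketch of the two calls on a tiny made-up file, so the column-skipping arguments can be compared side by side. The file contents, the temporary path, and the names n_extra, spec, second_readr and second_base are assumptions for illustration only; the real files have about 20000 columns after the second one. It only shows that both specifications keep the second column, not how much memory either call uses.

```r
library(readr)

# Hypothetical stand-in for the real data: 5 whitespace-separated columns.
tmp <- tempfile(fileext = ".txt")
writeLines(c("a b c d e", "1 2 3 4 5", "6 7 8 9 10"), tmp)

# Number of columns after the second one (20000 in the real files).
n_extra <- 3

# readr: in the compact col_types string, "_" skips a column and
# "c" reads it as character. This builds "_c___" for the toy file.
spec <- paste0("_c", strrep("_", n_extra))
second_readr <- read_table(tmp, col_types = spec)

# base R: a colClasses entry of "NULL" drops the corresponding column.
second_base <- read.table(
  tmp,
  header = TRUE,
  colClasses = c("NULL", "character", rep("NULL", n_extra))
)
```

paste0("_c", strrep("_", 20000)) builds the same string as the nested paste() call above, just more compactly.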