
I'm reading a large CSV file (>15 GB) line by line in R. I'm using

con  <- file("datafile.csv", open = "r")
while (length(oneLine <- readLines(con, n = 1, warn = FALSE)) > 0) {
    # code to be written
}

In the "code to be written" section, I need to be able to refer to individual elements in each row and save them to an array. The file has no header row, in case that matters.

Thanks!

2 Answers


You could use read.table with its text argument to parse the oneLine string as if it were a one-line CSV file:

# Set the arguments to match your file: field separator, decimal mark, etc.
x <- read.table(text = oneLine, sep = ",", dec = ".", header = FALSE)

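The returned x is a data.frame with a single row. You can access individual elements by column position (for example x[[2]]) or turn the whole row into a vector, for example with unlist(x).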
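
Putting this together with the loop from the question, a minimal sketch could look like the following. The temporary file and its three sample rows are made up for illustration and stand in for datafile.csv; the names rows and first_field are hypothetical too.

```r
# Hypothetical sample data standing in for datafile.csv (3 rows, no header)
path <- tempfile(fileext = ".csv")
writeLines(c("1,2.5,a", "2,3.5,b", "3,4.5,c"), path)

con <- file(path, open = "r")
rows <- list()
while (length(oneLine <- readLines(con, n = 1, warn = FALSE)) > 0) {
  # Parse the single line as a one-row CSV
  x <- read.table(text = oneLine, sep = ",", dec = ".", header = FALSE,
                  stringsAsFactors = FALSE)
  # Individual elements are available by column position
  first_field <- x[[1]]
  # unlist() flattens the row into a vector; mixed column types
  # are coerced to character
  rows[[length(rows) + 1]] <- unlist(x)
}
close(con)
```

Keeping every row in a list is only reasonable for small inputs. For a file of 15 GB you would more likely keep just the fields or running summaries you actually need.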

digEmAll

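You could read the file in chunks rather than one line at a time, something like this:

CHUNK_SIZE <- 5000
con <- file('datafile.csv', 'rt')
res <- NULL
repeat {
  # read.csv raises an error when no lines are left (this happens when the
  # row count is an exact multiple of CHUNK_SIZE), so treat that as end of file
  chunk <- tryCatch(
    read.csv(con, nrows = CHUNK_SIZE, header = FALSE, stringsAsFactors = FALSE),
    error = function(e) NULL
  )
  if (is.null(chunk)) break
  res <- rbind(res, chunk)
  if (nrow(chunk) < CHUNK_SIZE) break
}
close(con)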
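
Note that rbind-ing every chunk still ends up holding the whole file in memory, which may not be feasible at 15 GB. A variation, sketched here under the assumption that you only need a summary of the data, is to process each chunk and discard it. The sample file, the small CHUNK_SIZE, and the names total and n_rows are made up for illustration.

```r
# Hypothetical sample data: 10 rows, 2 integer columns, no header
path <- tempfile(fileext = ".csv")
writeLines(sprintf("%d,%d", 1:10, (1:10) * 2L), path)

CHUNK_SIZE <- 4
con <- file(path, "rt")
total <- 0    # running sum of the second column
n_rows <- 0   # number of rows seen so far
repeat {
  # read.csv raises an error once the connection is exhausted,
  # so an error is treated here as end of file
  chunk <- tryCatch(
    read.csv(con, nrows = CHUNK_SIZE, header = FALSE, stringsAsFactors = FALSE),
    error = function(e) NULL
  )
  if (is.null(chunk)) break
  # Work on the chunk, then let it go instead of accumulating it
  total <- total + sum(chunk[[2]])
  n_rows <- n_rows + nrow(chunk)
  if (nrow(chunk) < CHUNK_SIZE) break
}
close(con)
```

Treating any error as end of file keeps the sketch short, but it would also hide genuine parse errors, so real code may want to inspect the error message first.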
Karl Forner