I'm trying to read a large dataset (3.7 million rows, 180 columns) into R using the ff
package. The dataset contains several data types: factor, logical, and numeric.
The problem occurs when reading numeric variables. For example, one of my columns is:
TotalBeforeTax
126.9
88.0
124.5
90.9
...
When I try reading the data in, the following error is thrown:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
scan() expected 'a real', got '"126.90000"'
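The quotes inside the message suggest the values are wrapped in double quotes in the file itself. A minimal sketch in base R (not my actual ff call) that should reproduce this kind of failure, assuming ff hands the column off to the usual read.csv/scan machinery; the three-row sample text is made up for illustration:

```r
# Hypothetical sample: numeric values wrapped in double quotes, plus one NA.
csv_text <- 'TotalBeforeTax\n"126.90000"\n"88.00000"\nNA\n'

# Forcing the column to numeric makes scan() parse the quoted field as a number.
res <- tryCatch(
  read.csv(text = csv_text, colClasses = "numeric"),
  error = function(e) e
)
print(res)  # a condition object carrying the scan() error, not a data frame

# Reading the same text as character strips the quotes, so conversion works.
raw  <- read.csv(text = csv_text, colClasses = "character")
vals <- as.numeric(raw$TotalBeforeTax)
print(vals)
```

As far as I can tell from the read.table documentation, quoting is only interpreted for columns read as character, which would explain why a numeric colClasses entry trips over the quote characters.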
I tried declaring the class as integer (it's already declared as numeric) using the colClasses argument, but to no avail. I also tried changing it to a real (whatever that is supposed to mean); it starts reading in the data, but at some point throws:
Error in methods::as(data[[i]], colClasses[i]) :
no method or default for coercing “character” to “a real”
(My guess is that it comes across an NA and doesn't know what to do with it.)
The funny thing is, if I declare the column as a factor, everything reads in nicely.
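For comparison, a small base R sketch of the factor route, again on made-up sample text rather than my real file, and assuming ff behaves like read.csv here. It also shows how the factor could be turned back into numbers afterwards:

```r
# Hypothetical sample: same shape as the problem column, values quoted in the file.
csv_text <- 'TotalBeforeTax\n"126.90000"\n"88.00000"\nNA\n'

# Declared as factor, the column is read as text first, so the quotes are stripped.
dat <- read.csv(text = csv_text, colClasses = "factor")
print(levels(dat$TotalBeforeTax))

# Convert via as.character(): as.numeric() on a factor alone returns level codes.
total <- as.numeric(as.character(dat$TotalBeforeTax))
print(total)
```

This works on the toy data, but converting a 3.7-million-row factor after the fact feels like a workaround rather than an explanation.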
What gives?