I am using read.csv() to make a data.table. When importing the columns, I need them to be imported as either 'character' or 'numeric'.
I'm using the following code (simplified for brevity):
dataCols <- c(a="character", b="character", c="numeric", d="character")
data <- data.table(read.csv(file="data.csv", row.names=1, stringsAsFactors=FALSE, colClasses=dataCols))
For ease, I would like the dataCols vector to list all possible columns, because I'm reading a number of CSV files that represent the data at various stages of a process (my code is meant to check them for equality).
If I use the above code to read a CSV file that has all of the columns a, b, c and d, it reads fine. If, however, I try to read a CSV that only has columns a-c, I get the following error:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
scan() expected 'a real', got '"abc"'
where "abc" is the contents of row 1 in column b.
I'm telling it to read column b as character, and it's getting a character value, but it still gives me an error. Why is this? Frustratingly, when I did something similar the other day, putting extra entries in colClasses just gave me a warning along the lines of 'there are more colClasses listed than exist in your csv'.
I'm completely at a loss as to a) why these two outcomes differ and b) why the error described above appears in the first place.