
I am using lapply to read a list of files. The files have multiple rows and columns, and I am interested in the first row of the first column. The code I am using is:

lapply(file_list, read.csv,sep=',', header = F, col.names=F, nrow=1, colClasses = c('character', 'NULL', 'NULL'))

The first row has three columns but I am only reading the first one. From other posts on Stack Overflow I found that the way to do this is to use colClasses = c('character', 'NULL', 'NULL'). While this approach works, I would like to know the underlying issue that causes the following warning message, and hopefully prevent it from popping up:

"In read.table(file = file, header = header, sep = sep, quote = quote, : cols = 1 != length(data) = 3"

A. Suliman
Tanjil

1 Answer


The warning is letting you know that the number of columns read.table expected (cols = 1) does not match the number of fields it actually scanned (length(data) = 3); you are keeping just one column of the data out of three. Note that your NULL is in quotation marks.

An example:

write.csv(data.frame(fi = letters[1:3],
                     fy = rnorm(3, 500, 1),
                     fo = rnorm(3, 50, 2)),
          file = "a.csv", row.names = FALSE)

write.csv(data.frame(fib = letters[2:4],
                     fyb = rnorm(3, 5, 1),
                     fob = rnorm(3, 50, 2)),
          file = "b.csv", row.names = FALSE)

file_list <- list("a.csv", "b.csv")

lapply(file_list, read.csv,sep=',', header = F, col.names=F, nrow=1, colClasses = c('character', 'NULL', 'NULL'))

Which results in:

[[1]]
  FALSE.
1     fi

[[2]]
  FALSE.
1    fib

Warning messages:
1: In read.table(file = file, header = header, sep = sep, quote = quote,  :
  cols = 1 != length(data) = 3

Which is the same as if you used:

lapply(file_list, read.csv,sep=',', header = F, col.names=F,
 nrow=1, colClasses = c('character', 'asdasd', 'asdasd'))

But the warning goes away (and you get the rest of the row as a result) if you do:

lapply(file_list, read.csv,sep=',', header = F, col.names=F,
  nrow=1, colClasses = c( 'character',NULL, NULL))
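
One detail that may help when comparing the two calls: in base R, c() silently drops bare NULL values, so c('character', NULL, NULL) is simply a length-one character vector, whereas the quoted version has three elements. A minimal sketch (the variable names `quoted` and `bare` are only for illustration):

```r
# Quoted "NULL" entries are ordinary strings, so the vector keeps three elements.
quoted <- c("character", "NULL", "NULL")

# Bare NULL values are dropped by c(), leaving a single element.
bare <- c("character", NULL, NULL)

print(length(quoted))
print(length(bare))
print(identical(bare, "character"))
```

So the last call above passes colClasses = "character" rather than a three-element specification.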

You can see where a function's errors and warnings come from by printing its source code: enter the function name on its own, for example read.table, without parentheses or arguments, then search the output for your particular warning.
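
As a rough sketch of that search done programmatically rather than by eye, one could deparse the function and grep for part of the warning text. The search string "length(data)" is taken from the warning message itself; the exact lines returned may differ between R versions.

```r
# Deparse the function into a character vector of source lines.
src <- deparse(utils::read.table)

# Keep only the lines that mention the text seen in the warning.
hits <- grep("length(data)", src, fixed = TRUE, value = TRUE)
print(hits)
```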

CrunchyTopping
  • Hi thanks for your help. Is there a better solution than the one I have already used? I do not want to suppress all warning messages just yet. – Tanjil Aug 14 '19 at 12:51
  • `col.names` is not binary (T/F) like you have above: it is "a vector of optional names for the variables" according to the help. Get rid of `col.names=F,`, and continue using `"NULL"` in quotation marks for the columns you don't want. – CrunchyTopping Aug 14 '19 at 13:16
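
The suggestion in the last comment can be sketched as follows: drop `col.names=F` and keep the quoted `"NULL"` entries. The temporary file and the `demo_file`, `file_list`, and `res` names are assumptions made for a self-contained illustration, not part of the original question.

```r
# Build a small three-column CSV in a temporary location (illustrative data).
demo_file <- tempfile(fileext = ".csv")
write.csv(data.frame(fi = letters[1:3],
                     fy = c(500.1, 499.8, 500.3),
                     fo = c(50.2, 49.1, 51.7)),
          file = demo_file, row.names = FALSE)

file_list <- list(demo_file)

# No col.names argument: read.csv works out the column count from the file,
# and the quoted "NULL" entries tell it to skip the second and third columns.
res <- lapply(file_list, read.csv, header = FALSE, nrows = 1,
              colClasses = c("character", "NULL", "NULL"))

print(res[[1]])
```

Each list element is then a one-row, one-column data frame whose column gets the default name V1.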