row.names in read.csv vs read.csv.sql (package sqldf)

Question

The description of the row.names argument in read.csv.sql simply says "As in ‘read.csv’".

However, when I try to read a simple CSV file whose first column holds the row names, read.csv.sql does not behave as I expect.

 d <- data.frame("a"=c(1:10), "b"=c(15:24), "c"=c(21:30), row.names=paste("r", c(1:10), sep=""))
 write.csv(d, "foo.txt", quote=TRUE)
 head(read.csv("foo.txt", row.names=1), 3)
   a  b  c
r1 1 15 21
r2 2 16 22
r3 3 17 23

read.csv gives what I hoped for. With read.csv.sql, however:

 head(read.csv.sql("foo.txt", row.names=1), 3)
 Error in try({ :
 RS-DBI driver: (RS_sqlite_import: ./foo.txt line 2 expected 5 columns of data but found      4)
 Error in sqliteExecStatement(con, statement, bind.data) :
 RS-DBI driver: (error in statement: no such table: file)

I've tried small variations, such as writing the original CSV file with or without quotes, or calling read.csv.sql with just header=T as an argument. At best I get one of two results. Either the row names come back as the first column of the data frame, which then needs further processing to turn that column into the row names and drop it, or I get a data frame whose row names are just numbers, with the desired row names lost completely.

Is there something I am missing, either in my call to the function or in the file format, that would let read.csv.sql (which is much faster on large datasets) read column 1 as the row names without further processing of the data frame?

The "problem" is that in your txt file, the first column (the one with the row names) has an "empty" column name (e.g., `row 1 = "", "a", "b", "c"`). This confuses `read.csv.sql(...)`, but not `read.csv(...)` apparently. If you go into a text editor and remove that `"",` at the very beginning, your code works. — jlhoward, Feb 07 '14 at 20:58
I did try that and with my version of R (3.0.2) and my OS (Ubuntu 12.04.4) read.csv.sql("foo.txt", header=T, row.names=1) reads row names as numbers (i.e. "r1" is changed to 1, etc). This is certainly better than giving an error message and not reading anything, but still defeats the purpose of row names. — user3059448, Feb 07 '14 at 21:11
You're right; it does that for me too. You might consider waiting a day and then sending an email to the maintainer [identified here](http://cran.r-project.org/web/packages/sqldf/index.html) — jlhoward, Feb 07 '14 at 22:24
You might also try [R Help](https://stat.ethz.ch/mailman/listinfo/r-help), if you haven't already. — jlhoward, Feb 08 '14 at 02:29
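
For reference, a minimal base-R sketch of the two points raised above: the empty header field that write.csv emits for the row-names column, and the "further processing" step that turns a first column into row names. It assumes the data frame you get back has the row names sitting in column 1, as described in the question. The helper `first_col_to_rownames` is a hypothetical name, not part of sqldf, and `read.csv` stands in for `read.csv.sql` here only so the example runs without sqldf installed. It is a workaround, not a fix for the row.names argument itself.

```r
# Recreate the example file from the question (written to a temp path here).
d <- data.frame(a = 1:10, b = 15:24, c = 21:30,
                row.names = paste("r", 1:10, sep = ""))
path <- tempfile(fileext = ".txt")
write.csv(d, path, quote = TRUE)

# The header line starts with an empty name for the row-names column.
header <- readLines(path, n = 1)
print(header)

# Hypothetical helper (not part of sqldf): promote column 1 to row names
# and drop it from the data frame.
first_col_to_rownames <- function(df) {
  rownames(df) <- as.character(df[[1]])
  df[-1]
}

# Assumption: read.csv() without row.names stands in for read.csv.sql(),
# returning the row names as an ordinary first column.
raw <- read.csv(path)
fixed <- first_col_to_rownames(raw)
print(head(fixed, 3))
```

On a large data frame this post-processing step costs one extra copy of the row-name column, which is usually small next to the time spent reading the file.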

