So I ended up doing some benchmarking for this, comparing the accepted answer with reading the table into R and then saving it to CSV, since my goal was to cut down on compute time. I thought others might find this helpful, and the results are fairly interesting. I tested it on my Windows laptop as well as a supercomputer that I have access to, using three different table sizes in each environment.
These were the commands on my laptop:
library(microbenchmark)
library(DBI)
library(RSQLite)

sql <- "select * from iris;"

microbenchmark(
  sys = {shell('sqlite3 -header -csv C:/Users/jmc6538/Documents/testing/test.sqlite "select * from iris;" > C:/Users/jmc6538/Documents/testing/test.csv')},
  RSQLite = {
    db <- dbConnect(RSQLite::SQLite(), "test.sqlite")
    result <- dbSendQuery(db, sql)
    write.csv(dbFetch(result), "./test2.csv")
    dbClearResult(result)
    dbDisconnect(db)
  })
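The benchmark assumes test.sqlite already contains an iris table of the desired size; that setup step isn't shown here. A minimal sketch of how it could be done, assuming the larger tables are built by stacking repeated copies of iris in R and writing them with DBI::dbWriteTable() (my reconstruction, not part of the original timing code):

library(DBI)
library(RSQLite)

# Assumed setup: stack repeated copies of iris and write the result to the
# SQLite file used above (4,502 copies x 150 rows = 675,300 rows).
n_copies <- 4502
big_iris <- do.call(rbind, replicate(n_copies, iris, simplify = FALSE))

db <- dbConnect(RSQLite::SQLite(), "test.sqlite")
dbWriteTable(db, "iris", big_iris, overwrite = TRUE)
dbDisconnect(db)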
The different sized tables were a single copy of the iris data (150 rows), the iris data set repeated 4,502 times (675,300 rows), and the iris data set repeated 45,020 times (6,753,000 rows). Here are the results, in the same order.
Unit: milliseconds
    expr        min         lq      mean     median        uq       max neval cld
     sys 213.758862 216.612667 223.37484 217.858642 218.88561 342.42205   100   b
 RSQLite   6.981669   7.663255  10.08729   9.377483  10.90106  25.52816   100   a

Unit: seconds
    expr       min        lq      mean    median        uq       max neval cld
     sys 10.633965 11.788077 12.844148 12.967207 13.757598 16.142357   100   b
 RSQLite  5.164632  6.791199  7.172043  7.351825  7.728878  8.486433   100   a

Unit: seconds
    expr      min        lq      mean    median        uq       max neval cld
     sys 94.38379 101.36282 103.49785 102.98158 104.52018 114.95047   100   b
 RSQLite 48.48948  50.28782  53.18634  53.64562  55.80938  63.87987   100   a
So on my laptop, using RSQLite and write.csv was always faster by a significant amount.
On the supercomputer the row counts were 150, 900,000, and 9,000,000, and the commands were:
# Same setup as above (microbenchmark, DBI, and RSQLite loaded, sql defined),
# but using system() instead of shell().
microbenchmark(
  sys = {system('sqlite3 -header -csv C:/Users/jmc6538/Documents/testing/test.sqlite "select * from iris;" > C:/Users/jmc6538/Documents/testing/test.csv')},
  RSQLite = {
    db <- dbConnect(RSQLite::SQLite(), "test.sqlite")
    result <- dbSendQuery(db, sql)
    write.csv(dbFetch(result), "./test2.csv")
    dbClearResult(result)
    dbDisconnect(db)
  })
And the results.

On the supercomputer, using the system call was faster for the larger tables, but not for the small table.
Edit: Using data.table's fwrite to write the CSVs made the RSQLite method faster than the shell method on the supercomputer (a sketch of that change is below). I only did 10 repeats here for the 900,000- and 9,000,000-row tables because I didn't want to wait for 100. I think this is the better method if people are looking for speed.
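The fwrite change only swaps out the write.csv() call in the RSQLite branch; a minimal sketch of that variant, assuming the same test.sqlite file and sql query as above (my reconstruction, not the exact code from the timing runs):

library(DBI)
library(RSQLite)
library(data.table)

sql <- "select * from iris;"
db <- dbConnect(RSQLite::SQLite(), "test.sqlite")
result <- dbSendQuery(db, sql)
fwrite(dbFetch(result), "./test2.csv")  # data.table::fwrite instead of write.csv
dbClearResult(result)
dbDisconnect(db)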
