
I'm using R to insert a data.frame into a MySQL database. The code below successfully inserts 1,000 rows at a time, but it's not practical for a data.frame with tens of thousands of rows. How would you do a bulk insert in R? Is it even possible?

## R and MySQL
library(RMySQL)

### create sql connection object 
mydb = dbConnect(MySQL(), dbname="db", user='xxx', password='yyy', host='localhost', unix.sock="/Applications/MAMP/mysql/mysql.sock")

# get data ready for mysql 
df = data.format

# chunks 
df1 <- df[1:1000,]
df2 <- df[1001:2000,]
df3 <- df[2001:nrow(df),]

## SQL insert for data.frame, limit 1000 rows 
dbWriteTable(mydb, "table_name", df1, append=TRUE, row.names=FALSE)
dbWriteTable(mydb, "table_name", df2, append=TRUE, row.names=FALSE)
dbWriteTable(mydb, "table_name", df3, append=TRUE, row.names=FALSE)
– Nodedeveloper101
    What happens if you try to write it all at once with `dbWriteTable(mydb, "table_name", df)`? – Gregor Thomas Jan 21 '16 at 19:46
  • Take a look at my answer http://stackoverflow.com/questions/34591444/how-to-insert-data-to-sql-server-table-using-r/34591753#34591753. Although it speaks about SQL Server the same applies in your situation. – danas.zuokas Jan 21 '16 at 20:29

3 Answers


For completeness, as the linked answer suggests, write the data.frame to a temporary table and then insert into the destination table:

# stage the data.frame in a temporary table, replacing any previous copy
dbWriteTable(mydb, name = 'temp_table', value = df, row.names = FALSE, overwrite = TRUE)
# copy everything into the destination table in a single statement
dbGetQuery(mydb, "INSERT INTO table_name SELECT * FROM temp_table")
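If the staging table should not linger, the same idea can be wrapped with an explicit cleanup (a sketch; `temp_table` and `table_name` follow the example above, and `dbExecute()` is the generic DBI verb for statements that return no result set):

```r
library(DBI)

# stage, copy, then drop the staging table
dbWriteTable(mydb, name = "temp_table", value = df, row.names = FALSE, overwrite = TRUE)
dbExecute(mydb, "INSERT INTO table_name SELECT * FROM temp_table")
dbExecute(mydb, "DROP TABLE temp_table")
```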
– PuzzledVacuum

Fast bulk insert is now supported by the DBI-based odbc package; see this example posted by Jim Hester (https://github.com/r-dbi/odbc/issues/34):

library(DBI)
con <- dbConnect(odbc::odbc(), "MySQL")
dbWriteTable(con, "iris", head(iris), append = TRUE, row.names=FALSE)
dbDisconnect(con)

Since RMySQL is also DBI-based, you just have to switch the connection to the odbc package; thanks to R's standardized DBI interface, the rest of the code stays the same.

Since the RMySQL package

... is being phased out in favor of the new RMariaDB package.

according to its website (https://github.com/r-dbi/RMySQL), you could also try switching the driver package to RMariaDB, which may already implement a bulk insert feature.

For details see: https://github.com/r-dbi/RMariaDB
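If you want to try that route, the swap is essentially a one-line change to the connection (a sketch, untested; the connection arguments mirror the RMySQL call in the question):

```r
library(DBI)

# RMariaDB::MariaDB() replaces MySQL() from the question; everything
# downstream of dbConnect() is unchanged thanks to the DBI interface
con <- dbConnect(RMariaDB::MariaDB(), dbname = "db", user = "xxx",
                 password = "yyy", host = "localhost")
dbWriteTable(con, "table_name", df, append = TRUE, row.names = FALSE)
dbDisconnect(con)
```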

– R Yoda

If all else fails, you could put it in a loop:

for (i in 0:((nrow(df) - 1) %/% 1000)) {
  # clamp the upper bound so the last chunk doesn't run past nrow(df)
  # (the naive (i+1)*1000 over-indexes when nrow(df) is a multiple of 1000)
  first <- i * 1000 + 1
  last  <- min((i + 1) * 1000, nrow(df))
  dbWriteTable(mydb, "table_name", df[first:last, ], append = TRUE, row.names = FALSE)
}
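A `split()`-based variant avoids the index arithmetic entirely (a sketch; the chunk size of 1,000 and the table name match the question):

```r
# group rows into chunks of 1000 by integer-dividing the row index,
# then append each chunk to the destination table
chunks <- split(df, (seq_len(nrow(df)) - 1) %/% 1000)
for (chunk in chunks) {
  dbWriteTable(mydb, "table_name", chunk, append = TRUE, row.names = FALSE)
}
```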
– Brian D