
Is there a way to do a bulk insert using MonetDB.R (not via a for loop and dbSendUpdate)?

Does dbWriteTable allow for updates (append=TRUE)?

About "INSERT INTO" the MonetDB documentation states: "The benefit is clear: this is very straightforward. However, this is a seriously inefficient way of doing things in MonetDB."

Thanks.

  • Hannes just sent 0.9.4 to CRAN -- `append=TRUE` works now :) huzzah! – Anthony Damico Jul 18 '14 at 11:25
  • Brilliant!! COPY INTO (via csvdump=TRUE) is consistently faster than INSERT INTO. Thank you very much for this added feature! I had one issue, though: MonetDB doesn't seem to accept the BOOLEANs from the CSV file generated by "write.table" (with csvdump set to TRUE); is there a way to have "write.table" use 1s and 0s instead of TRUE and FALSE? – Bart Lenoir Jul 21 '14 at 18:52
  • could you open a separate SO question and provide a small reproducible example? :) – Anthony Damico Jul 21 '14 at 19:48
  • MonetDB Jan2014-SP3 has just been released, which fixes the BOOLEAN import issue when using csvdump=T – Hannes Mühleisen Jul 31 '14 at 07:52
  • hi, `MonetDBLite` (on CRAN) now replaces `MonetDB.R` and runs embedded (like `RSQLite`). for more detail, see https://github.com/hannesmuehleisen/MonetDBLite/blob/master/README.md – Anthony Damico Jun 18 '16 at 15:51

4 Answers


Hannes might have a smarter solution, but for the time being, this might help :)

# start with an example data set
nrow( mtcars )

# and a MonetDB.R connection
db

# here's how many records you'd have if you stack your example data three times
nrow( mtcars ) * 3

# write to three separate tables
dbWriteTable( db , 'mtcars1' , mtcars )
dbWriteTable( db , 'mtcars2' , mtcars )
dbWriteTable( db , 'mtcars3' , mtcars )

# stack them all
dbSendUpdate( db , "CREATE TABLE mtcars AS SELECT * FROM mtcars1 UNION ALL SELECT * FROM mtcars2 UNION ALL SELECT * FROM mtcars3 WITH DATA" )

# correct number of records
nrow( dbReadTable( db , 'mtcars' ) )
Anthony Damico

I'll consider it. monetdb.read.csv does use COPY INTO, so you might get away with creating a temporary CSV file.
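A minimal sketch of that workaround, assuming `db` is an already-open MonetDB.R connection; note that the argument list of `monetdb.read.csv` (in particular the row-count argument) has varied across package versions, so check your installed version:

```r
# write the data.frame to a temporary CSV file,
# then bulk-load it with COPY INTO via monetdb.read.csv
csvfile <- tempfile( fileext = ".csv" )
write.csv( mtcars , csvfile , row.names = FALSE )

# import the whole file in a single COPY INTO statement
monetdb.read.csv( db , csvfile , "mtcars_bulk" , nrow( mtcars ) )

# clean up the temporary file
file.remove( csvfile )
```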

Hannes Mühleisen

I see what you mean, although this doesn't change the fact that dbWriteTable uses a for loop with "INSERT INTO", which can be rather slow. I may not have been clear in my initial post.

As a workaround I guess "START TRANSACTION" and "COMMIT" with dbSendUpdate might work.
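Something along these lines, as a simplified sketch: it assumes `db` is an open MonetDB.R connection, the target table already exists, and all columns are numeric (character columns would need quoting and escaping):

```r
# wrap the row-by-row inserts in one transaction so MonetDB
# commits once at the end rather than after every statement
dbSendUpdate( db , "START TRANSACTION" )

for( i in seq_len( nrow( mtcars ) ) ){
	dbSendUpdate( db ,
		paste0(
			"INSERT INTO mtcars VALUES (" ,
			paste( mtcars[ i , ] , collapse = "," ) ,
			")"
		)
	)
}

dbSendUpdate( db , "COMMIT" )
```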

Ideally something like this would be great:

"COPY INTO table FROM data.frame"

  • We will be releasing a new version of the connector soon (tomorrow I hope) that should fix the issue. – Hannes Mühleisen Jul 14 '14 at 12:56
  • I saw code from version 0.4 only: https://r-forge.r-project.org/scm/viewvc.php/pkg/MonetDB.R/?root=monetr&pathrev=12 In that version transactions (added in v0.9) were not yet available. As you would be working on the connector would it be possible to add "append=TRUE" to dbWriteTable? Thank you! – Bart Lenoir Jul 14 '14 at 14:05

We just published version 0.9.4 of MonetDB.R on CRAN. The main change in this release is a major improvement to the dbWriteTable method. By default, INSERTs are now chunked into 1000 rows per statement. Also, if the database is running on the same machine as R, you may use the csvdump=T parameter. This writes the data.frame to a local temporary CSV file and uses an automatically generated COPY INTO statement to import it. Both of these methods are designed to improve the speed with which dbWriteTable imports data. In addition, the append/overwrite parameter handling has been fixed.
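The two options described above can be sketched as follows, assuming `db` is an open MonetDB.R connection to a server on the same machine:

```r
# csvdump=TRUE: write a temporary CSV and bulk-load it with COPY INTO
# (only possible when R and the database share a filesystem)
dbWriteTable( db , "mtcars" , mtcars , csvdump = TRUE )

# append=TRUE: add rows to an existing table instead of overwriting it
dbWriteTable( db , "mtcars" , mtcars , append = TRUE )
```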

Hannes Mühleisen