I have a table (df) of about 50,000 rows and 12 columns to write to SAP HANA. I use the RJDBC library and write row by row as follows:
# Returns the SQL statement to insert one row
build_insert_string <- function(db_output, row) {
  row_string <- paste(row, collapse = "','")
  statement <- paste('INSERT INTO "', db_output$SCHEMA, '"."', db_output$table_name,
                     '" (', db_output$string_of_columns, ') VALUES (\'',
                     row_string, '\');', sep = '')
  return(statement)
}

# Insert row by row
for (i in 1:nrow(df)) {
  tryCatch({
    dbGetQuery(jdbcConnection, build_insert_string(db_output, df[i, ]))
  }, error = function(e) { handle_db_errors(e) })
}
where db_output is a list containing the output constants (schema, table, and columns).
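For reference, the same loop can also be expressed with RJDBC's dbSendUpdate, which is meant for statements that modify data (dbGetQuery expects a result set) and accepts ? placeholders, avoiding hand-built SQL strings and the quoting issues they invite. A minimal sketch, assuming the same jdbcConnection, db_output, and df objects as above:

```r
# Sketch: parameterized insert via RJDBC's dbSendUpdate (assumes the
# objects defined above; column order in df must match string_of_columns)
sql <- paste0('INSERT INTO "', db_output$SCHEMA, '"."', db_output$table_name,
              '" (', db_output$string_of_columns, ') VALUES (',
              paste(rep('?', ncol(df)), collapse = ','), ')')

for (i in seq_len(nrow(df))) {
  # do.call spreads the row's values over dbSendUpdate's ... parameters
  do.call(RJDBC::dbSendUpdate, c(list(jdbcConnection, sql), as.list(df[i, ])))
}
```

This is still one statement per row, so on its own it may not close the performance gap; it mainly removes the string-building and escaping concerns.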
Currently, it takes almost half a day to write the table. HANA does not seem to support multi-row inserts such as:
INSERT INTO example
(example_id, name, value, other_value)
VALUES
(100, 'Name 1', 'Value 1', 'Other 1'),
(101, 'Name 2', 'Value 2', 'Other 2'),
(102, 'Name 3', 'Value 3', 'Other 3'),
(103, 'Name 4', 'Value 4', 'Other 4');
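One workaround sometimes suggested where a multi-row VALUES list is rejected is to emulate the batch with single-row SELECTs from HANA's DUMMY table combined via UNION ALL. I have not verified this against my HANA version, so treat it as an assumption:

```sql
-- Hypothetical equivalent of the multi-row insert above,
-- built from single-row SELECTs against the DUMMY table
INSERT INTO example (example_id, name, value, other_value)
SELECT 100, 'Name 1', 'Value 1', 'Other 1' FROM DUMMY
UNION ALL SELECT 101, 'Name 2', 'Value 2', 'Other 2' FROM DUMMY
UNION ALL SELECT 102, 'Name 3', 'Value 3', 'Other 3' FROM DUMMY
UNION ALL SELECT 103, 'Name 4', 'Value 4', 'Other 4' FROM DUMMY;
```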
Has anyone encountered this challenge, and if so, did you find a way to circumvent it and improve write performance?