
I have created a set of 100k INSERT statements to generate data in multiple Oracle tables for my performance testing. What is the best way to execute these?

In the past, I've tried tools like Oracle SQL Developer and Toad. However, I'm not sure they can handle this large a volume.

The inserts are simple statements like:

INSERT INTO SELLING_CODE (SELC_ID, VALC_ID, PROD_ID, SELC_CODE, SELC_MASK, VALC_ID_STATUS) 
VALUES (5000001, 63, 1, '91111111', 'N/A', 107);
Rajat Solanki
  • "In the past, I've tried tools like Oracle SQL developer and Toad. However not sure if it can handle this large volume." -- Have you tried? What happened? – sticky bit May 13 '19 at 10:23
  • 1
    Please [read this SO question](https://stackoverflow.com/questions/7306405/best-way-to-bulk-insert-data-into-oracle-database). It may not be a good idea to try to directly run so many insert statements. – Tim Biegeleisen May 13 '19 at 10:24
  • Executing 100,000 individual insert statements is not a sensible approach. It's slow, and also unwieldy to maintain or debug. Which approach might be better depends upon your specific circumstances: SQL*Loader or an external table if the target table is empty, INSERT INTO ... SELECT ... or MERGE otherwise. – APC May 13 '19 at 10:29
  • Best ways are generally a multi insert (INSERT ALL) or running a transaction – Raymond Nijland May 13 '19 at 10:37
  • Add a commit after each 100 statements or so, drop all indexes on the tables except for primary key indexes prior to inserting the data and have DDL on hand to recreate the indexes after your inserts are all performed and committed, and you should be fine. – Bob Jarvis - Слава Україні May 13 '19 at 11:33
  • @BobJarvis If we're inserting one row at a time, why drop the indexes? As I understand it, when doing DML on a large set of values in a large table it's better to drop the indexes, because every change to the table segment also changes the index segments. But inserting one row is fast, right? – Moudiz May 13 '19 at 11:48
  • Have you tried SQL*Loader? – Romeo Ninov May 13 '19 at 12:01
  • SQL*Loader with DIRECT=TRUE will load 100K rows in seconds AND update your indexes after the fact. That doesn't directly answer your question but is a more reasonable approach to your data challenge. – Stilgar May 13 '19 at 13:13
  • @Stilgar: I seem to recall that SQL*Loader with DIRECT=TRUE had some limitations. It's been a while, but IIRC you couldn't use RECNUM, errors weren't reported on an individual record basis, or something of that nature. For my part, dropping indexes, running SQL*Loader with DIRECT=FALSE, and re-adding the indexes worked as needed and was sufficiently quick when loading files upwards of 5 million lines (1 GB+ in size). YMMV. – Bob Jarvis - Слава Україні May 13 '19 at 13:57
  • If you are just looking to get the data into the database as quickly as possible, I recommend some form of BULK INSERT. It appears that using SQL*Loader is the way to do so in Oracle, but there is a BULK INSERT statement in MS SQL Server which also loads from a CSV file. If you have to repeat this often, consider having a test table where you can truncate your prod table and select into prod from test table to reload it. – Michael Dougan May 13 '19 at 17:03
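
To illustrate the SQL*Loader route mentioned in the comments above, a control file for a CSV containing the same six columns might look roughly like this (the file names and options here are assumptions, not something from the original post):

-- selling_code.ctl (hypothetical control file)
LOAD DATA
INFILE 'selling_code.csv'
APPEND
INTO TABLE SELLING_CODE
FIELDS TERMINATED BY ','
(SELC_ID, VALC_ID, PROD_ID, SELC_CODE, SELC_MASK, VALC_ID_STATUS)

It could then be run with something like sqlldr myuser/mypassword@mydb control=selling_code.ctl direct=true, where the credentials and connect string are placeholders for your own.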

1 Answer


Inserting 100,000 rows with SQL statements is fine. It's not a huge amount of data and there are a few simple tricks that can help you keep the run time down to a few seconds.

First, make sure that your tool is not displaying output for each statement. Copying and pasting the statements into a worksheet window would be horribly slow, but saving the statements into a SQL*Plus script and running that script can be fast. Use the real SQL*Plus client if possible. That program is available on almost any system and is good at running small scripts.

If you have to use SQL Developer, save the 100K statements in a text file, and then run this as a script (F5). This method took 45 seconds on my PC.

set feedback off
@C:\temp\test1.sql

Second, batch the SQL statements to reduce the per-statement overhead. You don't have to batch all of them; batching 100 statements at a time is enough to eliminate 99% of the overhead. For example, generate one thousand statements, each inserting 100 rows, like this:

INSERT INTO SELLING_CODE (SELC_ID, VALC_ID, PROD_ID, SELC_CODE, SELC_MASK, VALC_ID_STATUS)
select 5000001, 63, 1, '91111111', 'N/A', 107 from dual union all
select 5000001, 63, 1, '91111111', 'N/A', 107 from dual union all
...
select 5000001, 63, 1, '91111111', 'N/A', 107 from dual;

Save that in a text file and run it the same way in SQL Developer (F5). This method took 4 seconds on my PC.

set feedback off
@C:\temp\test1.sql

If you can't significantly change the format of the INSERT statements, you can simply wrap every 100 statements in an anonymous PL/SQL block: add BEGIN before them and END; followed by / after them, as in the sketch below. That will send 100 statements at a time to the server and significantly reduce the network overhead.
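
A minimal sketch of that wrapping, reusing the INSERT from the question (the row values are just placeholders):

BEGIN
    INSERT INTO SELLING_CODE (SELC_ID, VALC_ID, PROD_ID, SELC_CODE, SELC_MASK, VALC_ID_STATUS)
    VALUES (5000001, 63, 1, '91111111', 'N/A', 107);
    -- ... the next 99 INSERT statements of this batch go here ...
    -- optionally COMMIT; here after each batch, as suggested in the comments
END;
/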

For even faster speeds, run the script in regular SQL*Plus. On my PC it only takes 2 seconds to load the 100,000 rows.
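
For reference, running it interactively from a command prompt looks roughly like this (the user, password, and connect identifier are placeholders for your own):

sqlplus myuser/mypassword@mydb
SQL> set feedback off
SQL> @C:\temp\test1.sql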

For medium-sized data like this it's helpful to keep the convenience of SQL statements. And with a few tricks you can get performance that is almost as good as with a binary format.

Jon Heller