0

An empty table T1 where rows have to be inserted by selecting rows from another table T2 in ORACLE.

Like,

INSERT INTO T1
SELECT * FROM T2; 

The issue is table T2 has about 10 million of rows. This simple SELECT statement seems to execute around 25-30 secs individually. But when it inserts into T1, it takes 20-30 mins to complete.

Why the above statement is taking long time to execute and what is the best approach or how to improve upon to insert data to table T1 selecting from table T2?

Nafi Pantha
  • 169
  • 1
  • 3
  • 16
  • Are there any indexes on T1? – Nick.Mc Jan 09 '20 at 00:21
  • @Nick.Mcdermaid first there was none...then I added one...but didn't improve much... – Nafi Pantha Jan 09 '20 at 03:36
  • Indexes on the target table will slow things down for a pure insert. It's typical for large loads of tables to drop all indexes, load the data then recreate indexes.... although this is SQL Server experience I'm sure it mostly applies for Oracle. It doesn't seem like that's your issue though. – Nick.Mc Jan 09 '20 at 03:49

2 Answers2

2

Adding an APPEND hint may enable a direct path insert, which can avoid generating extra REDO data used for recovery:

INSERT /*+ append */ INTO T1
SELECT * FROM T2;

Adding parallelism can further improve performance:

ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ parallel append */ INTO T1
SELECT * FROM T2;

Those two features could shrink the run time from minutes to seconds but there are a lot of caveats you need to understand. Direct-path writes lock the table, and are not recoverable; if the data is important you may not want to wait for the next full backup. Parallel queries work harder, not smarter, and may steal resources from more important jobs. Finding the optimal degree of parallelism is tricky, and direct-path inserts have many limitations, like triggers and some kinds of referential integrity constraints.

With the right hardware, system configuration, and code, you can realistically improve performance by 100x. But if you're new to these features, prepare to spend hours learning about them.

Jon Heller
  • 34,999
  • 6
  • 74
  • 132
  • You can read more about direct-path inserts on this similar question: https://stackoverflow.com/questions/51432405/why-create-table-as-select-is-more-faster-than-insert-with-select – kfinity Jan 08 '20 at 19:14
2

For one thing, the "apparent" execution time of a simple SELECT query is a bit misleading: the database engine figures out how to do the query then returns only the first "chunk" of information to you. (As you then move through the dataset, additional "chunks" are transparently supplied as needed.) But when you specify INSERT, now the database has no choice but to actually go through all those millions of rows.

There are often specialized tools that are specifically intended for "bulk" data operations such as this one. These might be significantly faster.

Another standard practice is to temporarily disable indexes. This avoids the overhead of updating the indexes for every record: the index will be completely rebuilt when you turn it back on. (The "bulk operations" tools aforementioned will usually do things like that automagically.)

Mike Robinson
  • 8,490
  • 5
  • 28
  • 41