I have a table with 25 fields. The table has a primary key field "id" and five NOT NULL CHECK constraints. The insert performance I get is ~11,000 TPS, and I'm wondering if there is any way to improve it.

Some detailed background:

  1. For some reasons, we can't use bulk insert; we can only use single-row `insert into ... values` statements.
  2. Increasing the thread count (from 12 to 20 to 100) doesn't help, and CPU usage is very low. The machine has 96 cores, 755 GB of memory, and a 1 TB SSD.
  3. Rows in the table are small, about 0.25 KB each.
  4. Oracle version is 19.3.0 Enterprise Edition.

I made some attempts, but performance did not improve:

  1. Defining multiple freelists. I tried 4, 20, and 80 freelists.
  2. The /*+ APPEND_VALUES */ hint, which made performance much worse (1,100 TPS).

Thanks in advance.

Hellen
    Are the primary key values derived from a sequence? If so, I would suggest trying [scalable sequences](https://oracle-base.com/articles/18c/scalable-sequences-18c). Since you can't batch the writes, if your threads all get a sequential number they will all be writing to the same block, causing contention. Scalable sequences add some randomness to the front of the number to avoid that contention. – Jon Heller Jun 16 '21 at 05:27
    @JonHeller Scalable sequences are primarily for RAC. What is the source of the data you are inserting? Eg, is it a CSV file? What and where is your client application? – BobC Jun 16 '21 at 05:55
  • How do you remove old data? Inserting 11k records every second would fill up 1TB SSD after a few days. – Wernfried Domscheit Jun 16 '21 at 07:10
  • Having 755 GB RAM you may consider the in-memory database "Oracle TimesTen" – Wernfried Domscheit Jun 16 '21 at 07:11
  • Are you using a `for` loop with an `insert into ... values (?,?,?,?...)` ? What *performance highlight* do you expect with this approach? – Marmite Bomber Jun 16 '21 at 11:29
  • @JonHeller Thanks a lot, your answer helped me solve the problem. Although the id field is pre-generated by another system, the id values are monotonically increasing. After I changed the algorithm to generate the ids randomly, INSERT performance improved a lot, i.e., from 11,000 to 50,000 TPS. Impressive! PS: It's weird that Oracle inserts data into the same block when the id value is monotonically increasing. Any documentation about that? – Hellen Jun 16 '21 at 15:14
    Then your *bottleneck* was probably the index, not the table. Oracle addresses this contention with the [REVERSE index](https://oracle-base.com/articles/misc/reverse-key-indexes). The rows are stored (in a *non-clustered table*) independently of the values of the `PK`. – Marmite Bomber Jun 16 '21 at 15:37
    @MarmiteBomber Hi, you are right. Using a REVERSE index solved the problem. Thank you. :) – Hellen Jun 17 '21 at 02:24
  • Be wary of reverse key indexes. Firstly they do not support range scans. Secondly, if your index becomes large compared to the buffer cache, you may end up with a physical IO problem. – BobC Jun 17 '21 at 07:06

2 Answers


Scalable sequences, or building your own pseudo-random sequence prefix for each session, can improve the performance of massively concurrent inserts. (The best approach to improving DML performance is usually batching writes, but you said that approach is not feasible in your scenario.)

Creating and using a scalable sequence can be as easy as the following commands.

JHELLER@orclpdb> create sequence test_seq scale;

Sequence created.

JHELLER@orclpdb> select to_char(test_seq.nextval) from dual;

TO_CHAR(TEST_SEQ.NEXTVAL)
----------------------------------------
1017160000000000000000000001

JHELLER@orclpdb> select to_char(test_seq.nextval) from dual;

TO_CHAR(TEST_SEQ.NEXTVAL)
----------------------------------------
1017160000000000000000000002

The problem scalable sequences are solving is best described in the Database Concepts manual section about reverse key indexes:

A reverse key index is a type of B-tree index that physically reverses the bytes of each index key while keeping the column order.

For example, if the index key is 20, and if the two bytes stored for this key in hexadecimal are C1,15 in a standard B-tree index, then a reverse key index stores the bytes as 15,C1.

Reversing the key solves the problem of contention for leaf blocks in the right side of a B-tree index. This problem can be especially acute in an Oracle Real Application Clusters (Oracle RAC) database in which multiple instances repeatedly modify the same block. For example, in an orders table the primary keys for orders are sequential. One instance in the cluster adds order 20, while another adds 21, with each instance writing its key to the same leaf block on the right-hand side of the index.

In a reverse key index, the reversal of the byte order distributes inserts across all leaf keys in the index. For example, keys such as 20 and 21 that would have been adjacent in a standard key index are now stored far apart in separate blocks. Thus, I/O for insertions of sequential keys is more evenly distributed.

Because the data in the index is not sorted by column key when it is stored, the reverse key arrangement eliminates the ability to run an index range scanning query in some cases. For example, if a user issues a query for order IDs greater than 20, then the database cannot start with the block containing this ID and proceed horizontally through the leaf blocks.

However, reverse key indexes are not a great solution to the problem they were built to solve. Reversing the key does distribute the work more evenly, but we don't want completely evenly distributed work. We don't want different sessions to operate on the same block at the same time, but we do want the same session to operate multiple times on the same block while it's still locked and in memory.
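For reference, the REVERSE keyword is all it takes to build such an index; the table and column names below are hypothetical:

```sql
-- Hypothetical table: REVERSE stores each key with its bytes reversed,
-- so sequential values land in different leaf blocks.
CREATE INDEX orders_pk_ix ON orders (order_id) REVERSE;

-- An existing index can be converted in place with a rebuild:
ALTER INDEX orders_pk_ix REBUILD REVERSE;
```

As the documentation quoted above notes, such an index can no longer serve range-scan predicates like `WHERE order_id > 20`.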

Here's the Database Administrator’s Guide section about scalable sequences:

A sequence can be made scalable by specifying the SCALE clause in the CREATE SEQUENCE or ALTER SEQUENCE statement. A scalable sequence is particularly efficient when used to generate unordered primary or unique keys for data ingestion workloads having high level of concurrency. Scalable sequences significantly reduce the sequence and index block contention and provide better data load scalability compared to the solution of configuring a very large sequence cache using the CACHE clause of CREATE SEQUENCE or ALTER SEQUENCE statement.

Scalable sequences improve the performance of concurrent data load operations, especially when the sequence values are used for populating primary key columns of tables in single Oracle database instances as well as Oracle RAC databases.

But I would like to emphasize again that the best way to improve insert performance is to operate on sets of rows if possible. I've seen scalable sequences double performance, but I've seen batching writes improve performance by orders of magnitude.

Jon Heller

I assume you insert with bind parameters; if not, use them.
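To illustrate (hypothetical table and column names): literal values produce a new SQL text, and thus a hard parse, for every row, while bind variables let all sessions share one cursor.

```sql
-- Without binds: every statement is unique and must be hard parsed
INSERT INTO t (id, val) VALUES (101, 'abc');
INSERT INTO t (id, val) VALUES (102, 'def');

-- With binds: one shared cursor, only the bound values change per execution
INSERT INTO t (id, val) VALUES (:id, :val);
```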

Indexes, triggers, and constraints slow down insert performance. Remove any that you can.

Do you use a SEQUENCE for the primary key? If yes, either use a scalable sequence as suggested by @Jon Heller (a new feature I was not aware of until now), or try CACHE 100000, or don't use a SEQUENCE at all.
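A larger sequence cache is a one-line change (the sequence name here is made up):

```sql
-- A large cache pre-allocates values per instance in memory,
-- cutting round trips to the data dictionary
CREATE SEQUENCE t_seq CACHE 100000;

-- or change an existing sequence:
ALTER SEQUENCE t_seq CACHE 100000;
```

Cached values not yet handed out are lost on instance shutdown, leaving gaps in the numbering, which is usually acceptable for surrogate keys.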

Are you able to run multiple insert jobs in parallel?

Insert the values into a staging table which does not have any indexes, triggers, constraints, etc. Then run `INSERT INTO final_table SELECT * FROM temp_table`, for example once every million rows.
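A minimal sketch of that staging approach, with made-up names. Note that a single large INSERT ... SELECT can benefit from the APPEND hint, unlike the per-row APPEND_VALUES case from the question:

```sql
-- Bare staging table: same shape as the target, but no indexes,
-- triggers, or constraints
CREATE TABLE stage_t AS SELECT * FROM final_table WHERE 1 = 0;

-- Periodically (e.g. every million rows) move the data in one statement
INSERT /*+ APPEND */ INTO final_table SELECT * FROM stage_t;
COMMIT;
TRUNCATE TABLE stage_t;
```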

Do you have a partitioned table? If yes, try a PARTIAL INDEX, i.e. generate the indexes only once per hour/day. Instead of `INSERT INTO final_table SELECT * FROM temp_table`, consider Exchange Partition.
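A partition exchange is a metadata-only swap, so it is nearly instant regardless of row count. A hedged sketch (hypothetical names; the staging table must match the partition's column structure):

```sql
-- Swap the loaded staging table in as a partition of the target
ALTER TABLE final_table
  EXCHANGE PARTITION p_today WITH TABLE stage_t
  INCLUDING INDEXES WITHOUT VALIDATION;
```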

I have a similar application, i.e. rather small rows but many inserts. My insert limit is also around 10k rows per second (per insert process). I run 8 insert processes in parallel, which works fine. And I had the same bad experience with the APPEND hint.

Wernfried Domscheit