
I have to update a table which has around 93 mln records. At the beginning the DB updated 10k records per 5 seconds; now, after around 60 mln updated records, updating the next 10k records takes 30-60 s, and I don't know why. I have to update the columns which are null.
I use a loop with a commit every 10k records:

  LOOP
    UPDATE TABLE
    SET DATE_COLUMN = v_hist_date
    WHERE DATE_COLUMN IS NULL
          AND ROWNUM <= c_commit_limit
          AND NOT_REMOVED IS NULL;

    EXIT WHEN SQL%ROWCOUNT = 0;
    COMMIT;
  END LOOP;

Do you have any ideas why it slows down so much, and how it is possible to speed up this update?

  • Yes, it's part of PL/SQL code – Krystian Dec 20 '17 at 10:46
  • Why? I have a column with null values; when it has updated them all, it will exit because the UPDATE returns 0 rows. It worked for me on smaller tables with e.g. 1-2 mln records and took much less time, but that was also a newer DB version. – Krystian Dec 20 '17 at 10:50

1 Answer


Updates are queries too. You haven't posted an explain plan but given you are filtering on columns which are null it seems probable that your statement is executing a Full Table Scan. That certainly fits the behaviour you describe.

What happens is this. On the first loop the FTS finds 10,000 rows which fit the WHERE criteria almost immediately. Then you commit and start again. This time the FTS reads the same blocks again, including the ones it updated in the previous iteration, before it finds the next 10,000 rows it can update. And so on. Each iteration takes longer because the full table scan has to read more of the table before it finds enough rows left to update.
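
A quick way to check is to look at the plan for the update itself. This is just a sketch - YOUR_TABLE stands in for your real table name (which you haven't posted) and SYSDATE stands in for v_hist_date:

  EXPLAIN PLAN FOR
    UPDATE your_table
    SET DATE_COLUMN = SYSDATE
    WHERE DATE_COLUMN IS NULL
          AND ROWNUM <= 10000
          AND NOT_REMOVED IS NULL;

  SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

If the plan shows TABLE ACCESS FULL, that confirms the behaviour described above.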

This is one of the penalties of randomly committing inside a loop. It may be too late for you now, but a better approach would be to track an indexed column such as a primary key. Using such a tracking key will allow an index scan to skip past the rows you have already visited.
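
A sketch of that idea, assuming the table has a numeric primary key called ID (the other names are the placeholders from your question, and each batch walks a key range rather than an exact row count):

  DECLARE
    c_batch_size CONSTANT PLS_INTEGER := 10000;
    v_hist_date  DATE := DATE '2017-01-01';  -- whatever value you are backfilling
    v_last_id    NUMBER := 0;
    v_max_id     NUMBER;
  BEGIN
    SELECT MAX(ID) INTO v_max_id FROM your_table;

    WHILE v_last_id < v_max_id LOOP
      -- the ID range predicate lets Oracle use the primary key index,
      -- so each iteration skips the blocks already processed
      UPDATE your_table
      SET DATE_COLUMN = v_hist_date
      WHERE ID > v_last_id
            AND ID <= v_last_id + c_batch_size
            AND DATE_COLUMN IS NULL
            AND NOT_REMOVED IS NULL;

      v_last_id := v_last_id + c_batch_size;
      COMMIT;
    END LOOP;
  END;
  /

With sparse IDs some batches will touch fewer than 10,000 rows, but each iteration only visits its own key range, so the cost per loop stays roughly constant instead of growing.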

APC
  • The best approach would be to do a single update rather than the batched updates that are currently being done, making sure the UNDO tablespace is sized appropriately. – Boneist Dec 20 '17 at 12:31
  • @boneist - I sort of agree with you, but I think it's hard to define "best" in situations like this without more details. The problem with updating a sizeable chunk of 93m rows in one statement is the risk of it falling over; if that happens we lose all the work and have to do it all over again. On the plus side we gain transactional integrity. There are other options - such as setting a DEFAULT value - which might be appropriate, again depending on the (unspecified) objective. – APC Dec 20 '17 at 13:24
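
For reference, the single-statement variant discussed in these comments is just the update without the loop (placeholder names again, and the backfill date is an assumption) - make sure UNDO is sized for tens of millions of row changes:

  UPDATE your_table
  SET DATE_COLUMN = DATE '2017-01-01'
  WHERE DATE_COLUMN IS NULL
        AND NOT_REMOVED IS NULL;

  COMMIT;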