
The question is about the behavior of a simultaneous SELECT and UPDATE on a MySQL InnoDB table:

We have a relatively large table which we periodically scan, reading several fields, including a field named, say, LastUpdate. During the scan we update previously scanned rows. Updates are batched and performed in a background thread, using a different connection. It is important to note that we only update rows that have already been read.

Three questions:

  1. Will InnoDB save previous versions of the updated rows, since the SELECT is still in progress?
  2. Would using READ-UNCOMMITTED for the SELECT help?
  3. How can I confirm whether InnoDB saves previous versions of the modified rows in its undo log? (See the snippet below.)
nimrodm
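
One way to answer question 3 empirically (a sketch, assuming MySQL 5.6+, where the trx_rseg_history_len counter is enabled by default): previous row versions are kept as undo log records, and the number of unpurged records is reported as the "History list length", which grows while a long-running SELECT forces InnoDB to retain old versions:

-- "History list length" appears in the TRANSACTIONS section:
SHOW ENGINE INNODB STATUS;
-- or, as a single number:
SELECT count FROM information_schema.INNODB_METRICS
WHERE name = 'trx_rseg_history_len';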

1 Answer


Sidestep the issue, and write code that won't get into trouble:

BEGIN;
SELECT id ... LIMIT 20;  -- get a list of _possible_ candidates for action
COMMIT;
**                       -- another connection could slip in here
foreach $candidate
    BEGIN;
    SELECT ..., is_candidate FROM ... WHERE id = $candidate FOR UPDATE;
    if still a candidate
        process it
        UPDATE ...;
    COMMIT;

If someone slips in, say, at **, the re-check inside the second transaction prevents double-processing. Furthermore, the second BEGIN..COMMIT prevents anyone else from slipping in while a row is being processed.

FOR UPDATE "locks" the row; this is important.

The advantage of processing this way is that the transactions are quick, thereby minimizing impact on everything else.
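
As a concrete sketch of one iteration of the per-candidate transaction, assuming a hypothetical table items with columns id and is_candidate:

BEGIN;
-- FOR UPDATE locks the row until COMMIT, so no other connection
-- can modify it between the re-check and the UPDATE.
SELECT is_candidate FROM items WHERE id = 42 FOR UPDATE;
-- application re-checks is_candidate here; if it is still set:
UPDATE items SET is_candidate = 0, LastUpdate = NOW() WHERE id = 42;
COMMIT;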

Rick James
  • Not sure I understand the answer. The SELECT statement in my case potentially returns a *lot* of rows (100,000 or more). The question was whether I can somehow tell MySQL I have finished processing the first 1000 rows and there is no need to keep the old versions of these rows while they are updated. – nimrodm Jun 26 '16 at 10:36
  • Probably the only way is to break it into clumps yourself. – Rick James Jun 27 '16 at 04:44
  • Well, using LIMIT/OFFSET is very inefficient, and using a server-side fetch cursor forces MySQL to create a temporary table -- which again means copying all the data. Perhaps partitioning by the primary key would work best. Thanks! – nimrodm Jun 27 '16 at 05:48
  • `OFFSET` is terribly inefficient; `PARTITIONing` won't help. Walking through the table by the PK is a good way to do it. [More on chunking](http://mysql.rjweb.org/doc.php/deletebig#deleting_in_chunks). – Rick James Jun 27 '16 at 05:56
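
A minimal sketch of that PK-walking approach, assuming a hypothetical table mytable with an integer primary key id and the LastUpdate column from the question:

-- Walk the table by the PK in 1000-row chunks; each chunk is a
-- cheap range scan, unlike OFFSET, which rescans the skipped rows.
SELECT id, LastUpdate
FROM mytable
WHERE id > @last_seen_id   -- start with @last_seen_id = 0
ORDER BY id
LIMIT 1000;
-- remember the largest id returned as @last_seen_id,
-- then repeat until the SELECT returns no rows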