0

If we are given a table:

MariaDB [test]> create table foo (
    -> id integer primary key,
    -> version_id integer);
Query OK, 0 rows affected (0.05 sec)

and two rows with primary key 1 and 2:

MariaDB [test]> insert into foo (id, version_id) values(1, 1);
Query OK, 1 row affected (0.01 sec)

MariaDB [test]> insert into foo (id, version_id) values(2, 1);
Query OK, 1 row affected (0.00 sec)

When emitting an UPDATE statement that uses the primary key in the WHERE clause, InnoDB uses an index record lock, as described in https://dev.mysql.com/doc/refman/8.0/en/innodb-locking.html#innodb-gap-locks . That is, it locks each row individually.

Based on this, we can illustrate a simple deadlock between two transactions by emitting an UPDATE for primary key 1 and 2 in reverse order:

transaction 1 # MariaDB [test]> begin;
transaction 1 # Query OK, 0 rows affected (0.00 sec)

transaction 2 # MariaDB [test]> begin;
transaction 2 # Query OK, 0 rows affected (0.00 sec)

transaction 1 # MariaDB [test]> update foo set 
                    -> version_id=version_id+1 where id=1;
transaction 1 # Query OK, 1 row affected (0.01 sec)
transaction 1 # Rows matched: 1  Changed: 1  Warnings: 0

transaction 2 # MariaDB [test]> update foo set 
                    -> version_id=version_id+1 where id=2;
transaction 2 # Query OK, 1 row affected (0.01 sec)
transaction 2 # Rows matched: 1  Changed: 1  Warnings: 0

transaction 1 # MariaDB [test]> update foo set 
                    -> version_id=version_id+1 where id=2;
<blocks on index lock created by transaction 2 on id=2>

transaction 2 # MariaDB [test]> update foo set 
                    -> version_id=version_id+1 where id=1;
transaction 2 # ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

<wakes up>
transaction 1 # Query OK, 1 row affected (22.24 sec)
transaction 1 # Rows matched: 1  Changed: 1  Warnings: 0

And finally, the question. If we instead write these UPDATE statements as a single statement using IN for the list of primary key values, can these two UPDATE statements in different transactions produce the same condition? Note I've also reversed the order of the parameters inside the IN, which shouldn't matter since that's not how I'd expect UPDATE to scan the index. Or is the ordering of the locking of rows deterministic? (or is there some other reason the two statements could not conflict)?

transaction 1 # MariaDB [test]> update foo set 
                    -> version_id=version_id+1 
                    -> where id in (1, 2);
transaction 1 # Query OK, 2 rows affected (0.00 sec)
transaction 1 # Rows matched: 2  Changed: 2  Warnings: 0


transaction 2 # MariaDB [test]> update foo set 
                    -> version_id=version_id+1 
                    -> where id in (2, 1);
# note it blocked until the other transaction was done
transaction 2 # Query OK, 2 rows affected (6.28 sec) 
transaction 2 # Rows matched: 2  Changed: 2  Warnings: 0
zzzeek
  • 72,307
  • 23
  • 193
  • 185
  • What I will do in order to test more closely is the usual trick of putting a sleep() in the statement. – zzzeek Feb 18 '19 at 17:13
  • so trying the sleep trick: e.g. update ...set version_id=version_id+1 +sleep(3) where id in (1, 2), it looks like innodb is locking the row/gap/whatever for all rows *before* it gets to the sleep. that is, the other transaction doesn't seem to start at all if i've only had the first statement running for half a second. hard to tell though – zzzeek Feb 18 '19 at 17:26

1 Answers1

0

Your first example is the classic Deadlock example.

Your second example (with IN) is the beginning of a demonstration of innodb_lock_wait_timeout, in which case one connection can wait and not have to deadlock.

WHERE id IN (...) must atomically handle all the ids in question. This is unlike you first example where it is clear that the rows are being locked one at a time.

It used to be that it was best to sort the IN lists. But I think MySQL sorts them now.

There may be a threshold above which it does a scan instead of reaching for each id individually. This has allegedly led to locking rows that are not mentioned in the IN list.

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • so, the UPDATE statement first locks all the rows stated in the IN, and only then does it begin to run the UPDATE on each one? Secondly, when it locks all the rows in the IN, is the other UPDATE statement blocked entirely, and if so what mechanism is that, table lock ? – zzzeek Feb 20 '19 at 16:22
  • @zzzeek - I _think_ it will lock the rows as it goes. When someone else has one of the rows locked, it will check to see whether it is a deadlock or waiting is feasible. Not table lock. – Rick James Mar 04 '19 at 00:48