Why MariaDB Galera cluster thrown a deadlock instead of waiting?

Question

I'm trying to move my application from a single server database to a Galera cluster, this application uses some FOR UPDATE locks.

Consider this case:

create table rooms ( id int not null, owner int null, PRIMARY KEY (`id`));
insert into rooms values (1, null);

I want to be sure to not overwrite room owner in case of concurrent queries. I will set the owner ID in the room only if the current owner is NULL.

1st TEST - SINGLE DATABASE SERVER

TRX1 - SERVER A

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
SELECT * FROM rooms WHERE id=1 FOR UPDATE;

TRX2 - SERVER A

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
SELECT * FROM rooms WHERE id=1 FOR UPDATE;

-> here the TRX2 is on hold waiting TRX1.

TRX1 - SERVER A

UPDATE rooms SET owner=1 WHERE id=1;
COMMIT;

-> here the TRX2 is released and the room is retrieved with the updated owner from TRX1. This is the current correct behaviour for a correct concurrency handling.

2st TEST - GALERA CLUSTER WITH 2 DATABASE SERVERS

TRX1 - SERVER A

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
SELECT * FROM rooms WHERE id=1 FOR UPDATE;

TRX2 - SERVER B

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
SELECT * FROM rooms WHERE id=1 FOR UPDATE;

-> here the TRX2 is NOT put on hold!

TRX1 - SERVER A

UPDATE rooms SET owner=1 WHERE id=1;
COMMIT;

TRX2 - SERVER B

UPDATE rooms SET owner=2 WHERE id=1;
COMMIT;

-> here the TRX2 fails with DEADLOCK error.

Is there any solution to let the second transaction waits the commit of the first trx even if it is running on a second clusetered server?

Yes -- But you have to code it. You need to check for deadlocks anyway. — Rick James, Feb 11 '23 at 00:07
Usually I handle a deadlock with an system Exception, so the user is notified the operation was not completed. Anyway, do you confirm this is how lock works with Galera? There is no way to wait for some time to acquire the lock (like in a single dB scenario) instead of throwing a deadlock at commit? — Tobia, Feb 12 '23 at 07:03
"Notification" just tells you that there was a problem. Catching the error and rerunning the transaction is almost always the "right" way to _solve_ the deadlock. — Rick James, Feb 12 '23 at 18:57
As for "waiting" -- That does happen on the originating node (cf `innodb_lock_wait_timeout`), but once it gets to `COMMIT`, it does not try to "wait". Anyway, if you get a deadlock across the network, it is better to figure out what is causing it and find a cure. (Sometimes it is as simple as adding a `INDEX`.) — Rick James, Feb 12 '23 at 19:00
As galera is optimistic locking, the deadlock will always be in the commit and there is no pre-emptive lock (like `for update`) wait across the cluster. In addition to what @RickJames said about indexes, if possible direct likely conflicts to the same galera node. But you're going to have to write a deadlock handler anway. — danblack, Feb 13 '23 at 04:37
@RickJames the point is that I was not experiencing deadlocks before moving to replicated environment, and what I discovered with my test is that deadlock is thrown instead of lock wait. To be honest I don't know how usually applications handle shared resources locks in replicated servers. Is it always a matter of deadlock handling and re-tries? — Tobia, Feb 13 '23 at 11:47
@danblack so there is not other solution than handle deadlock instead of lock time wait? Is this the common way for shared-resource handling pattern? — Tobia, Feb 13 '23 at 11:49
@Tobia - Please discover the transaction that is being aborted, plus the transaction that seems to be triggering such. Then present the statements in the two transactions, plus `SHOW CREATE TABLE` for the relevant table(s). We may be able to point out ways to minimize (or maybe eliminate) the deadlock. — Rick James, Feb 13 '23 at 18:25
Ok, but the test case is exactly in the example in the question. A very simple table with a single index and two transactions with the same "FOR UPDATE" lock running in 2 different replicated servers. As you can see it is not possibile to eliminate the deadlock. Indeed my question is not how to eliminate or handle deadlocks, but if it is confirmed that is a standard behaviour to throw deadlock instead of lock wait in a replicated environment. — Tobia, Feb 14 '23 at 07:39

score 1 · Accepted Answer · answered Feb 13 '23 at 20:47

Galera is optimistic locking. This pattern assumes there are going to be no conflicts during the commit and if so, the deadlock is throws on the COMMIT SQL. The assumption of optimistic means that there is no pre-lock mechanism communicated over the cluster and the only communication happens on commit.

This also means there is no lock wait, or an early deadlock, if another Galera node as a transaction in progress for the same rows that are being updated locally.

This means there is, as observed, a difference in behaviour between local transaction and transactions across multiple nodes in a Galera cluster.

As the entire pattern is optimistic, while you may be able to avoid deadlocks in a local scenario by using a consistent order of acquiring locks, the same is not possible in a Galera cluster. Minimization techniques like having correct indexes that match the update query are effective in Galera as they are locally, so in the question asked, ensuring that id is a primary key or the first element of a secondary index.

Deadlock handling code must be done in a Galera cluster to handle conflicts. There are deadlocks as a exception because at the SQL cannot determine the logical course of action. Its up to the business logic to decide what happened in these cases. As this is a reservation system on rooms, the logical (as determined by me, a non-business case aware observer), is to give the error to the user they where too slow to book that room and present other options.

To avoid the deadlock you would need to make your application perform room booking on a predictable Galera node to get the same effects of a lock wait.

Note that select for update .. skip locked, a useful pattern in reservations that don't strictly depend on which row, also only applies locally.

Thanks danblack, this is exactly the confirmation I needed. Can you please explain what do you mean for "perform room booking on a predictable Galera node". Did you mean my application has to converge to the same database server for all this booking operation? — Tobia, Feb 14 '23 at 07:43
That was the intent. Its not a very pleasant thing to consider it code especially considering the fallover that is mean to handled by galera. — danblack, Feb 14 '23 at 08:10
Ok thanks, I hoped there was a different solution for this patter because locking shared resources in a relational db seems to be a common problem. — Tobia, Feb 14 '23 at 08:14

Why MariaDB Galera cluster thrown a deadlock instead of waiting?

1 Answers1