Percona XtraDB Cluster multi-node writing and unexpected deadlocks outside of transaction?

Question

I am having trouble finding an answer to this using google or Stack Overflow, so perhaps people familiar with Percona XtraDB can help answer this. I fully understand how unexpected deadlocks can occur as outlined in this article, and the solution is to make sure you wrap your transactions with retry logic so you can restart them if they fail. We already do that.

https://www.percona.com/blog/2012/08/17/percona-xtradb-cluster-multi-node-writing-and-unexpected-deadlocks/

My questions is about normal updates that occur outside of a transaction in auto commit mode. Normally if you are writing only to a single SQL DB and perform an update, you get a last in wins scenario so whoever executes the statement last, is golden. Any other data is lost so if two updates occur at the same time, one of them will take hold and the others data is essentially lost.

Now what happens in a multi master environment with the same thing? The difference in cluster mode with multi master is that the deadlock can occur at the point where the commit happens as opposed to when the lock is first taken on the table. So in auto commit mode, the data will get written to the DB but then it could fail when it tries to commit that to the other nodes in the cluster if something else modified the exact same record at the same time. Clearly the simply solution is to re-execute the update again and it would seem to me that the database itself should be able to handle this, since it is a single statement in auto commit mode?

So is that what happens in this scenario, or do I need to start wrapping all my update code in retry handling as well and retry it myself when this fails?

score 0 · Answer 1 · answered Jun 28 '16 at 20:32

0

Autocommit is still a transaction; a single statement transaction. Your single statement is just wrapped up in BEGIN/COMMIT for you. I believe your logic is inverted. In PXC, the rule is "commit first wins". If you start a manual transaction on node1 (ie: autocommit=0; BEGIN;) and UPDATE id=1 and don't commit then on node2 you autocommit an update to the same row, that will succeed on node2 and succeed on node1. When you commit the manual UPDATE, you will get a deadlock error. This is correct behavior.

It doesn't matter if autocommit or not; whichever commits first wins and the other transaction must re-try. This is the reason why we don't recommend writing to multiple nodes in PXC.

Yes, if you want to write to multiple nodes, you need to adjust your code to "try-catch-retry" handle this error case.

answered Jun 28 '16 at 20:32

utdrmac

731
5
17

We have full wrapping retry handling for manual transactions already (pretty necessary for it to work properly), so my question was really about whether auto commit transactions would ever fail. The situation I am thinking of is when two nodes attempt to perform transactions on the same record at the same time. Node 1 would complete and Node 2 would complete separately but when it goes to replicate it, one of them has to fail? So does PXC auto try the auto commit one that fails? – Kendall Bennett Jun 29 '16 at 15:29
Yes, autocommit transactions can fail too. They are transactions afterall. It makes no difference if autocommit=0 or =1, they are treated the same. Your statement "node 2 would complete separately" is incorrect. In your example node2 would NOT complete on commit. Certification happens BEFORE commit happens so node 2 would receive an error on commit. Node 1 beings its commit process. This involves replicating the write set to all nodes (including itself) and certifying. Then actual on-disk commit happens. Node 2 on commit, would also replicate and attempt to certify locally, which would fail. – utdrmac Jul 15 '16 at 04:09
I never said it would not fail. I know it will fail; the question is will the database automagically retry the transaction if it fails while in auto commit mode because it already knows it is atomic and has everything it needs to restart it? – Kendall Bennett Jul 15 '16 at 14:37
The cluster is capable of retrying autocommitted statements automatically (wsrep_retry_autocommit), but not regular (manual) multi-statement transactions. From Percona manual: This variable sets the number of times autocommitted transactions will be tried in the cluster if it encounters certification errors. In case there is a conflict, it should be safe for the cluster node to simply retry the statement without the client’s knowledge with the hopes that it will pass the next time. – utdrmac Jul 16 '16 at 17:29
Thanks that is what I was looking for – Kendall Bennett Jul 17 '16 at 05:30

Percona XtraDB Cluster multi-node writing and unexpected deadlocks outside of transaction?

1 Answers1