I'm trying to implement a 2-phase commit over a document database which supports only single-phase commit using transaction info on the inserted data.
I understand the theory behind 2-phase commit, but in practice there are a few obstacles.
Let's say there is a collection with a unique ID defined, and the transaction includes updating several fields of a document. So in order to save old data for the case of a rollback, I can't override the existing document, but need to add a new document and tag it with the transaction ID:
Solution 1:
Prepare phase:
- Insert a new document to the main collection with the updated fields and mark it with the current transaction ID.
- Mark the old document as deleted until committing and mark it with the current transaction ID.
Commit phase:
- Remove the transaction ID from the added (updated) document.
- Delete the old document (which was marked as deleted)
Problem: part 1 of the prepare phase will fail since I'm adding another document with the same unique ID (although it's just for the sake of the transaction but it will still fail).
Solution 2:
Prepare phase:
- Insert a new document to a temporary collection with the updated fields and mark it with the current transaction ID.
- Mark the old document from the main collection as deleted until committing and mark it with the current transaction ID.
Commit phase:
- Move the new document from the temporary collection to the main collection.
- Delete the old document (which was marked as deleted) from the main collection.
Problem: Commit phase part 1 may fail if there is a unique key conflict when moving the document from the temporary collection to the main collection.
However I can't allow this error to occur during the commit phase, since it's too late, I would like a unique key constraint to fail during the prepare phase.
So what is the right way of implementing the 2 phase commit mechanism and avoid these issues?