3

I feel kind of stupid asking this question, but to clear things out, sometimes stupid questions must be asked :)

So, we can define a write skew as Martin Kleppmann did in his talk:

Write skew pattern:
1. read something
2. make a decision
3. write decision
By the time the write(3) is committed, the premise(1) of the decision(2) is no longer true

There is a pessimistic approach, when we basically say "only one subject can use shared resource in a given moment, others should wait before the subject finishes".

And then there is an optimistic approach, with phases as defined in Wikipedia:

I. Begin: Record a timestamp marking the transaction's beginning.
II. Modify: Read database values, and tentatively write changes.
III. Validate: Check whether other transactions have modified data that this transaction has used (read or written). This includes transactions that completed after this transaction's start time, and optionally, transactions that are still active at validation time.
IV. Commit/Rollback: If there is no conflict, make all changes take effect. If there is a conflict, resolve it, typically by aborting the transaction, although other resolution schemes are possible.

My question is, what guarantees do we have that new "knowledge" is not being written while validation (III) is happening, thus fulfilling the definition of write skew given above?

Basically, that validation module on phase III must keep some inner ledger and process them in serial way, so that checking process from transaction2 doesn't happen before transaction1 write event.

Have we just moved this whole problem of write skew one level down? So we have a serializable pessimistic approach on low level to be able to have an optimistic approach on a higher level? Am I getting something wrong?

I'd appreciate any clarifications.

Twice_Twice
  • 527
  • 4
  • 16

1 Answers1

2

For optimistic locking to work 'III. Validate' and 'IV. Commit/Rollback' need to be a single atomic operation. So in this sense, yes "we just moved this whole problem of write skew one level down".

However, 'II. Modify' is a user operation outside of the control of the database, which can take a long time to complete and cannot be optimized by the database implementation. 'III. Validate' and 'IV. Commit/Rollback' OTOH are operations implemented by the database, which can be optimized to be fast by the database implementation.

michid
  • 10,536
  • 3
  • 32
  • 59
  • I've asked this question also because on even higher "microservices + event sourcing" level we get OCC by using local database + single threading, so we basically moving write skew problem on different levels? And the whole idea boils down to "in the end someone at the bottom needs to process sequentially and pessimistically" , right? – Twice_Twice Jun 13 '19 at 08:48
  • 1
    @Twice_Twice right, unless conflicts can be avoided entirely (e.g. by using CRDTs, but that's another story). But in the face of potentially conflicting updates there is always a certain window of mutual exclusive access. Keeping this window as small as possible and not extend it over user actions is a goal of OCC. – michid Jun 13 '19 at 09:01
  • 1
    big thanks, now I understand this topic on a more deep level! – Twice_Twice Jun 13 '19 at 09:22