
I've been reading about Oracle's data consistency guarantees and supported transaction isolation levels (e.g. here: https://docs.oracle.com/database/121/CNCPT/consist.htm#CNCPT121), and I feel like I'm getting a lot of high-level information, but I'm not sure how it applies to my specific question.

I'm going to describe a simplified version of my use case, and I'm looking for convincing answers, preferably with references, as to how I need to structure my transaction to get the desired result. (Please don't get too hung up on syntax or data normalization or even data types in my question; it's a straw man -- so if you know what I mean, move on and focus on the concurrency problem.) :)

The Scenario (simplified):

Many users (tens of thousands) are playing an online game simultaneously. Players are all members of two teams, red or blue. Every time a player finishes a game, we want to log the user, their team affiliation, a timestamp, and the score. We also want to aggregate the highest score ever achieved by each team. Our data model looks like this:

// each game is logged in a table that looks kind of like this:
GAMES {
 time NUMBER,
 userid NUMBER,
 team NUMBER,
 score NUMBER
}
// high scores are tracked here, assume initial 0-score was inserted at time 0
HIGH_SCORES {
 team NUMBER,
 highscore NUMBER
}

So, for each score report I receive, I execute a transaction that looks like this:

BEGIN
  UPDATE HIGH_SCORES set highscore=:score WHERE team=:team and :score>highscore;
  INSERT into GAMES (time, userid, team, score) VALUES (:time,:userid,:team,:score);
COMMIT

The invariant I wish to preserve is that, at any point in time, the high score for each team as shown in the HIGH_SCORES table will equal the highest score I would find if I scanned the GAMES table and computed it the hard way.
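To put that invariant in concrete terms, here's a sketch of a query against the straw-man schema above that should return zero rows whenever the invariant holds (any row it returns is a team whose HIGH_SCORES entry has drifted from GAMES):

```sql
-- Invariant check: should return no rows at any point in time.
-- Relies on the initial 0-score row inserted at time 0, so every
-- team in HIGH_SCORES has at least one GAMES row to join against.
SELECT h.team, h.highscore, MAX(g.score) AS actual_high
FROM HIGH_SCORES h
JOIN GAMES g ON g.team = h.team
GROUP BY h.team, h.highscore
HAVING h.highscore <> MAX(g.score);
```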

My understanding of the READ COMMITTED isolation level suggests this won't get me what I want:

Conflicting Writes in Read Committed Transactions

In a read committed transaction, a conflicting write occurs when the transaction attempts to change a row updated by an uncommitted concurrent transaction, sometimes called a blocking transaction. The read committed transaction waits for the blocking transaction to end and release its row lock.

The options are as follows:

  • If the blocking transaction rolls back, then the waiting transaction proceeds to change the previously locked row as if the other transaction never existed.

  • If the blocking transaction commits and releases its locks, then the waiting transaction proceeds with its intended update to the newly changed row.

It seems to me that if red team (team 1) has a high score of 100, and two players simultaneously submit better scores, a multithreaded server could have two database transactions going on that begin at the same time:

# Transaction A
UPDATE HIGH_SCORES set highscore=150 where team=1 and 150>highscore;
INSERT into GAMES (time, userid, team, score) VALUES (9999,100,1,150);

and

# Transaction B
UPDATE HIGH_SCORES set highscore=125 where team=1 and 125>highscore;
INSERT into GAMES (time, userid, team, score) VALUES (9999,101,1,125);

So, in READ COMMITTED mode, you could get the following sequence (cf. Table 9-2 in the Oracle link referenced above):

1. A updates the highscore for the red team row; Oracle locks this row.
2. B still sees the 100 score and so tries to update the red team highscore;
   Oracle blocks transaction B because that row is now locked by a conflicting write.
3. A inserts into the GAMES table.
4. A commits.
5. B is unblocked and completes the update, clobbering the 150 with a 125, and my invariant condition will be broken.

First question -- is this a correct understanding of READ COMMITTED?

My reading of SERIALIZABLE, however:

Oracle Database permits a serializable transaction to modify a row only if changes to the row made by other transactions were already committed when the serializable transaction began. The database generates an error when a serializable transaction tries to update or delete data changed by a different transaction that committed after the serializable transaction began.

suggests that SERIALIZABLE won't do the right thing in the above scenario either; the only difference is that transaction B would get an error, and I would have the option to roll back or try again. This is doable, but seems unnecessarily difficult.
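If I understand it right, that error would surface as ORA-08177 ("can't serialize access for this transaction"), and I'd be on the hook for a retry loop, something like this sketch (the bind names and retry count are mine, just for illustration):

```sql
-- Sketch of the retry loop SERIALIZABLE would force on me.
DECLARE
  cannot_serialize EXCEPTION;
  PRAGMA EXCEPTION_INIT(cannot_serialize, -8177);
BEGIN
  FOR attempt IN 1 .. 3 LOOP
    BEGIN
      SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
      UPDATE HIGH_SCORES SET highscore = :score
       WHERE team = :team AND :score > highscore;
      INSERT INTO GAMES (time, userid, team, score)
      VALUES (:time, :userid, :team, :score);
      COMMIT;
      EXIT;                -- success; stop retrying
    EXCEPTION
      WHEN cannot_serialize THEN
        ROLLBACK;          -- lost the race with a concurrent writer; try again
    END;
  END LOOP;
END;
```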

Second question -- is this a correct understanding of SERIALIZABLE?

... And if so, I am flummoxed. This seems like a simple, common thing to want to do. In code, I could accomplish this trivially by putting a mutex around the test-and-update of each team's high score.

Third and most important question: How do I get Oracle (or any SQL database, for that matter) to do what I want here?

UPDATE: further reading suggests I probably need to do some explicit table locking, as in (https://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_9015.htm) -- but I'm not clear on exactly what I need. HALP!?
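For the record, the heavyweight approach I was contemplating would pin the team's row before the test-and-update, roughly like this sketch (I'm not at all sure this is actually necessary):

```sql
-- Pessimistic sketch: serialize all writers on a team's row up front.
BEGIN
  SELECT highscore FROM HIGH_SCORES
   WHERE team = :team
     FOR UPDATE;            -- row lock held until COMMIT; acts like my mutex
  UPDATE HIGH_SCORES SET highscore = :score
   WHERE team = :team AND :score > highscore;
  INSERT INTO GAMES (time, userid, team, score)
  VALUES (:time, :userid, :team, :score);
COMMIT
```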

JVMATL

2 Answers


Wow, long question. The short answer is that READ COMMITTED is all you need.

You will not get a lost update because the UPDATE performed by transaction B will restart after transaction A commits. The UPDATE will be read-consistent as of the point in time it was restarted, not the point in time it was submitted.

That is, in your example, transaction B will update 0 rows in HIGH_SCORES.
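Concretely, the two sessions play out like this (session annotations are mine, for illustration):

```sql
-- Session A
UPDATE HIGH_SCORES SET highscore = 150
 WHERE team = 1 AND 150 > highscore;   -- locks the team-1 row; 1 row updated
COMMIT;

-- Session B (was blocked on the row lock; Oracle restarts its UPDATE
-- after A's commit, re-evaluating the predicate against the new data)
UPDATE HIGH_SCORES SET highscore = 125
 WHERE team = 1 AND 125 > highscore;   -- highscore is now 150, so the
COMMIT;                                -- predicate is false: 0 rows updated
```

The 150 survives, and your invariant holds without any explicit locking.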

There is a good example of this in Chapter 9 of the Oracle Concepts guide, demonstrating how Oracle protects applications from lost-updates.

There is a good explanation of how and why Oracle will internally restart UPDATE statements for read-consistency by Tom Kyte, here: https://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:11504247549852

Matthew McPeak
  • Thanks for taking the time to read and work through my epic-length question and educating me. :) – JVMATL Apr 01 '19 at 18:23

is this a correct understanding of READ COMMITTED?

Not quite. Your scenario is not what is shown in table 9-2 in the documentation you linked to. Your scenario is essentially what is in table 9-4.

The difference is the 9-2 version (showing a lost update) doesn't check the value being updated - it doesn't filter on the existing salary, which is the column it's updating. The 9-4 version is updating a phone number but looking at the existing value for that column as part of the update, and the blocked update ends up not updating any rows because it re-reads the newly-changed value - which now doesn't match the filter.

Essentially the blocked update is re-run when the previous lock is released, so it re-reads the newly committed data and uses that to decide whether the row still needs to be updated.

As that document also says:

Locks achieve the following important database requirements:

  • Consistency

The data a session is viewing or changing must not be changed by other sessions until the user is finished.

  • Integrity

The data and structures must reflect all changes made to them in the correct sequence.

Oracle Database provides data concurrency, consistency, and integrity among transactions through its locking mechanisms. Locking occurs automatically and requires no user action.

The last two sentences mean that you don't need to worry about it, and it's fairly easy to verify the behaviour by manually updating the same row in a table from two sessions.

And under automatic locks:

A DML lock, also called a data lock, guarantees the integrity of data accessed concurrently by multiple users. For example, a DML lock prevents two customers from buying the last copy of a book available from an online bookseller. DML locks prevent destructive interference of simultaneous conflicting DML or DDL operations.

In your case, when Oracle restarts the update that was blocked in your transaction B, it doesn't find a row for team 1 where highscore is less than 125. The data the statement executes against includes the committed update from session A, even though that commit happened after B first identified and asked for the lock on the row which - at that point - did match its filter. So there is nothing for it to update, and the update from session A is not lost.

Alex Poole
  • Thank you for taking the time to give such a thorough answer (which I just upvoted.) Both answers were helpful, but ultimately I selected the other because the first sentence answered my most important question explicitly (i.e. just use READ_COMMITTED and it will work) and because the second paragraph gets right to the heart of the main piece of information I was missing -- the fact that the transaction gets *restarted* and so we are looking at a new 'start time' when it restarts. It hadn't occurred to me that restarting was an option, so I would never have puzzled this out on my own. – JVMATL Apr 01 '19 at 18:21
  • You don't have to explain yourself *8-) Matthew's answer was rather more succinct, and overall you got what you needed, so everyone wins! – Alex Poole Apr 01 '19 at 18:26