2

I'm trying to get an understanding of how SSI is actually supposed to behave in Postgres. My understanding is, if I have two transactions interacting with the same table, but the transactions aren't interacting with the same rows in the table, then no exception will occur.

However, I'm running the following test where transaction one does the following:

cur = engine.cursor()
cur.execute('SELECT SUM(value) FROM mytab WHERE class = 1')
s = cur.fetchall()[0][0]
print('retrieved sum is...')
print(s)
print('sleeping....')
time.sleep(10)
cur.execute('INSERT INTO mytab (class, value) VALUES (%s, %s)', (1, s))
engine.commit()

While this above first transaction is sleeping, I run the second transaction:

cur = engine.cursor()
cur.execute('SELECT SUM(value) FROM mytab WHERE class = 2')
s = cur.fetchall()[0][0]
print('retrieved sum is...')
print(s)
cur.execute('INSERT INTO mytab (class, value) VALUES (%s, %s)', (2, s))
engine.commit()

In this case, the second transaction is only touching rows with class = 2, while the first is only touching rows with class = 1. Yet this is causing the first transaction to fail with the following exception:

could not serialize access due to read/write dependencies among transactions
DETAIL:  Reason code: Canceled on identification as a pivot, during write.
HINT:  The transaction might succeed if retried.

For reference mytab is very simple and looks like this:

class   value
1   10
1   20
2   100
2   200

Aside from the standard engine = psycopg2.connect set up, I'm also setting the transaction isolation level using this line prior to running the above code:

engine.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_SERIALIZABLE)
Jackson Kelley
  • 412
  • 5
  • 13
  • Do you mean SERIALIZABLE isolation? And have your read the section of the manual covering it? https://www.postgresql.org/docs/11/transaction-iso.html#XACT-SERIALIZABLE – Richard Huxton Sep 03 '19 at 18:54
  • Yes, I mean SERIALIZABLE isolation (my understanding was Postgres uses SSI to implement SERIALIZABLE isolation, if that isn't the case that would also be good to know). I have read the manual, and that's where I got the `mytab` example shown in the question. However, the example in the documentation shows two transactions altering rows in intersecting sets. In my case, I'm testing the case where the rows belong to two disjoint sets. – Jackson Kelley Sep 03 '19 at 19:11
  • I thought it looked familiar from somewhere. Be aware that the guarantee runs in the opposite direction. If the system cannot prove that the transactions are safe then it guarantees to throw an exception. The next question is why it can't figure out your second transaction is safe,and that I'm afraid isn't immediately clear to me. – Richard Huxton Sep 03 '19 at 19:18

1 Answers1

2

Your understanding is pretty much correct, but the SSI algorithm is not perfect, so there is always some risk of false positives (for example, as noted in the docs, row locks may be combined into a page lock, optimising for memory at the cost of precision).

The behaviour here is a limitation of the predicate locking implementation, namely that:

For a table scan, the entire relation will be locked.

Basically, after your first query WHERE class = 1 has been run, future inserts from other transactions need to be checked to see if they would have satisfied this condition had they been visible. Actually performing this check is impractical or impossible for all but the simplest conditions, so to err on the side of caution, a predicate lock is taken on the whole table instead.

The fine-grained predicate lock implementation is based on indexing, as it's much easier to reason about the affected subset of the relation in terms of e.g. B-tree ranges than in terms of arbitrary WHERE constraints.

In other words, if you have an index on your class column - and enough records in your table for the planner to actually use it - you should get the behaviour you expect.

Nick Barnes
  • 19,816
  • 3
  • 51
  • 63