How does PostgreSQL implement the REPEATABLE_READ isolation level?

Question

The REPEATABLE_READ transaction isolation level of PostgreSQL 12 prevents dirty reads, non-repeatable reads, and phantom reads. In contrast to the READ_COMMITTED isolation level, the REPEATABLE_READ isolation level prevents non-repatable reads and phantom reads.

I guess that this comes with a cost, otherwise one would just make both equal. How does postgres guarantee that those 3 read phenomena don't occur?

You probably already found [Concurrency Control](https://www.interdb.jp/pg/pgsql05.html), which seems to talk about this from chapter 5.4 — Luuk, Oct 10 '21 at 16:09
@Luuk This link is AWESOME! If you post this (+ maybe add a few words), I would accept it. Otherwise I'll summarize it in a few days :-) — Martin Thoma, Oct 10 '21 at 19:33

score 10 · Accepted Answer · answered Oct 11 '21 at 05:51

10

READ COMMITTED and REPEATABLE READ are using the same technology: a snapshot that determines which of the versions of a row in the table a transaction can see. The difference is that with READ COMMITTED, the snapshot is taken at the start of each statement, so that each new statement can see everything that has been committed before, while a REPEATABLE READ transaction uses the same snapshot for all statements.

There are two consequences:

if anything, REPEATABLE READ is cheaper than READ COMMITTED, because it takes fewer snapshots
REPEATABLE READ provides even higher isolation than required by the SQL standard – the database does not seem to change at all

The price you are paying for REPEATABLE READ is different:

you risk serialization errors, which force you to repeat the transaction
VACUUM cannot clean up rows marked dead after the REPEATABLE READ transaction started

answered Oct 11 '21 at 05:51

Laurenz Albe

209,280
17
206
263

Since in SQL Server you can control that behaviour: Do you know of situations in PgSQL, where the snapshot space exhausted or did other strange things? – damike Apr 11 '22 at 06:38
1

No, that can never happen. Perhaps a snapshot in SQL Server is something different. A query snapshot in PostgreSQL is local to the connection that runs the query. It describes the transaction numbers whose data modifications the query can see. – Laurenz Albe Apr 11 '22 at 07:00
But PgSQL needs to store it somewhere? My understanding: REPEATABLE READ in the past caused a Shared-Lock in SQL Server - so no update was possible. So the introduced Snapshots: https://youtu.be/NObahFaNmz4?t=79 - looks exactly the same beahviour as in PgSQL. But in SQL Server it needs planing since its consumes ressources. – damike Apr 11 '22 at 09:24
1

Yes, the idea is the same (SQL Server figured out at some point that its default read locks were not such a great thing), but the implementation may be quite different. – Laurenz Albe Apr 11 '22 at 09:32

How does PostgreSQL implement the REPEATABLE_READ isolation level?

1 Answers1

Linked