
I'd like to have a SQL statement or function in Postgres 10 that lets me insert a row only if there are fewer than a certain number of matching rows already.

select count(*) from mytable where mykey = 1
--- do the following only if the above is less than 5
insert into mytable (myvalue, mykey) values ('randomvalue', 1)
--- so there are never more than 5 rows in mytable with mykey = 1

Will something like this pseudocode work from the application (multiple roundtrip calls to the postgres server)?

tx = app.tx("start transaction");
count = tx.query("select count(*) from mytable where mykey = 1")
if (count < 5) {
  tx.query("insert into mytable (myvalue, mykey) values ('randomvalue', 1)")
}
tx.do("commit")

If it won't work, how could I do this on the Postgres side, so I only make one call from the application, like select myinsertfunction('randomvalue', 1)?

Whether it's the multiple-roundtrip approach above or a single call to Postgres, the most important thing is that there is no way for this to run in parallel such that more than 5 rows end up inserted. For example, transactions 1 and 2 both check the count, both see 4 (less than the maximum of 5), and both proceed at the same time, so each inserts a row and I end up with 6 rows instead of the maximum of 5.
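
To make the race concrete, here is a rough sketch of the interleaving I want to prevent, assuming both sessions run at the default READ COMMITTED isolation level:

-- Session A
BEGIN;
SELECT count(*) FROM mytable WHERE mykey = 1;  -- returns 4
-- Session B, concurrently
BEGIN;
SELECT count(*) FROM mytable WHERE mykey = 1;  -- also returns 4
-- Both sessions pass the "< 5" check in the application, so both insert
INSERT INTO mytable (myvalue, mykey) VALUES ('randomvalue', 1);  -- session A
INSERT INTO mytable (myvalue, mykey) VALUES ('randomvalue', 1);  -- session B
COMMIT;  -- in both sessions
-- mytable now holds 6 rows with mykey = 1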

user779159
  • https://dba.stackexchange.com/questions/167273/how-to-perform-conditional-insert-based-on-row-count might be an interesting read for you :) – dnswlt Feb 11 '18 at 12:24

1 Answer


This problem is known as a phantom read:

A transaction re-executes a query returning a set of rows that satisfy a search condition and finds that the set of rows satisfying the condition has changed due to another recently-committed transaction.
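
For example, at the default READ COMMITTED level each statement takes a new snapshot, so re-running the count inside one transaction can return a different result (a sketch, with counts chosen only for illustration):

-- Transaction 1
BEGIN;
SELECT count(*) FROM mytable WHERE mykey = 1;  -- returns 4

-- Transaction 2, in another session, commits a new row in between
INSERT INTO mytable (myvalue, mykey) VALUES ('randomvalue', 1);

-- Transaction 1 again, same transaction as above
SELECT count(*) FROM mytable WHERE mykey = 1;  -- now returns 5: a phantom read
COMMIT;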

Try

BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
INSERT INTO mytable (myvalue, mykey) SELECT 'randomvalue', 1 WHERE
    (SELECT COUNT(*) FROM mytable WHERE mykey = 1) < 5;
END;

With the SERIALIZABLE isolation level, two such transactions cannot both see a count of 4 and insert: if they conflict, one of them fails with a serialization error and has to be retried, so you can never end up with more than 5 rows for the key.
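
If you want the single call from the application that you mention (select myinsertfunction(...)), a minimal sketch could look like the following. The function name and the text/integer column types are taken from your example, not from an existing schema, so adjust as needed. The function only wraps the conditional INSERT; the serializable isolation level still has to come from the transaction it runs in, for instance by setting default_transaction_isolation for the session:

CREATE FUNCTION myinsertfunction(p_value text, p_key integer) RETURNS void
LANGUAGE plpgsql AS $$
BEGIN
    -- Same conditional insert as above, parameterized
    INSERT INTO mytable (myvalue, mykey)
    SELECT p_value, p_key
    WHERE (SELECT count(*) FROM mytable WHERE mykey = p_key) < 5;
END;
$$;

-- Make transactions in this session serializable, then one
-- round trip per insert is enough:
SET default_transaction_isolation TO 'serializable';
SELECT myinsertfunction('randomvalue', 1);

When two such calls conflict, one of them fails with a serialization error (SQLSTATE 40001); the application should catch that and simply retry the call.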

clemens
  • Wouldn't setting the transaction level to repeatable read work too? Could you describe a scenario where it wouldn't work but serializable would? – user779159 Feb 11 '18 at 12:33
  • No, that wouldn't work in general, because `REPEATABLE READ` (or lower) doesn't _protect_ the whole table from modifications. And because you use `COUNT` to access the entire table, you must prevent other processes from changing it. But this doesn't matter: "When you select the level Read Uncommitted you really get Read Committed, and when you select Repeatable Read you really get Serializable" (see the linked docs above). – clemens Feb 11 '18 at 12:38
  • What are the costs of using serializable in this case (as opposed to the default read committed). For example would all other queries that read or write from `mytable` block until this transaction commits or rolls back? – user779159 Feb 11 '18 at 12:48
  • No, not necessarily. Transaction serialization is very hard to explain in a comment, but there are different sources you could read, e.g.: [Wikipedia](https://en.wikipedia.org/wiki/Serializability) or [Postgres](https://www.postgresql.org/docs/8.0/static/transaction-iso.html#XACT-SERIALIZABLE). – clemens Feb 11 '18 at 12:59
  • While this answer works, can someone explain to me why the OP's pseudo code doesn't work? In other words, why is checking the count in the application layer not OK, even if we use serializable transactions? I'm asking because I ran into this exact issue, where I tried checking the count in the application layer and it didn't work. – Derek Chiang Mar 31 '20 at 06:52
  • @DerekChiang The problem in the OP's code is not the application layer, it's the wrong isolation level. Setting it to SERIALIZABLE will avoid phantom reads. – clemens Mar 31 '20 at 07:38