0

let's suppose that I have an endpoint /seen/{bigint_number}, which is about 10K concurrently visits that with a random bigint number.

the logic is simple. if the number is already stored in the database, it returns true, if the number has not been stored yet, it got stored and returns false.

the logic is first "select * from myTable where number = bigint_number" if found return true, else insert into the table.

the race condition is here when the same concurrent user has the same number.

how we can avoid this?

Yuseferi
  • 7,931
  • 11
  • 67
  • 103
  • Just INSERT the number and when it fails because of a unique constraint, you know that it already exists. This also avoids issues with concurrency. – Frank Heikens May 18 '22 at 09:36
  • @FrankHeikens let's suppose user1 comes with x, it's not in the database. at the same time, user2 comes with x. what is the scenario? return false for both of them? – Yuseferi May 18 '22 at 09:54
  • Why don't you try it? The first one will succeed, the second one rejected. That's how UNIQUE constraints work. – Frank Heikens May 18 '22 at 09:55
  • @FrankHeikens sorry but which this scnario both return false. we need a transaction here. – Yuseferi May 18 '22 at 09:59
  • -1 for using UNIQUE constraints violation to handle this type of situation. It is equivalent to using Exceptions to control you program logic. See my proposal below. – agulowaty May 18 '22 at 09:59
  • 2
    @agulowaty: The UNIQUE constraint can be handled using ON CONFLICT DO NOTHING if you wish. That's up to you. But first I would get my logic right because selecting a random number from a huge table, doesn't right to me. – Frank Heikens May 18 '22 at 10:05
  • @FrankHeikens just for your clarification: user 1 { select return nothing then it goes to insert insert will be done and returns false to user } user2 ( it's the same time as user1 so the record still is not there) { select return null, go to insert insert failed and finaly return false } then the result is false for both of them which is not correct. – Yuseferi May 18 '22 at 10:19
  • @agulowaty I got your point, for this case that's true, we can skip the select and just direct insert, if insert failed it means it's there and return true, if it runs successfully it means the record wasn't there and return false to the user. but let's suppose the select is not in the same table, the select is on another table. mostly I want to know how we can create a transaction with select + insert in postgres in a situation that race condition could happen? – Yuseferi May 18 '22 at 10:25

2 Answers2

1

You need a unique constraint on the number column.

Then you can proceed like this:

WITH x AS (
   INSERT INTO mytable (number) VALUES (12346)
   ON CONFLICT (number) DO NOTHING
   RETURNING number
)
SELECT count(number) = 0
FROM x;

The INSERT statement will return a row if a row was inserted, so the query will return FALSE in that case.

This is free from race conditions, because INSERT ... ON CONFLICT is.

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
0

I believe what you're looking for is UPSERT operation: https://wiki.postgresql.org/wiki/UPSERT

Why you're afraid of race condition? Databases have set of tools like transactions and locks to help you with these problems.

agulowaty
  • 191
  • 7
  • thank you for your attention, I don't want to update anything. it's just a select + insert. I think the solution is possible with transactions, but I want more details about it . – Yuseferi May 18 '22 at 09:52
  • I'm not sure why you insist on selecting the rows first? You can do `INSERT INTO .. ON CONFLICT DO NOTHING`. `INSERT` expression should return you a number of inserted/updated records you can inspect and return your boolean to the user? – agulowaty May 18 '22 at 09:57
  • The wiki link is a bit misleading as it talks about the process how this feature was implemented. It does not document how this feature actually works in Postgres. –  May 18 '22 at 10:19