30

I want to do the following in one go:

SELECT * FROM jobs WHERE status='PENDING';
UPDATE jobs SET status='RUNNING' WHERE status='PENDING';

So get all pending jobs, then set them as 'RUNNING' immediately after.

The reason I don't want to do it one after the other in two statements is that jobs could be added to the jobs table as 'PENDING' after the SELECT but before the UPDATE so I'd end up setting jobs as RUNNING even though I haven't grabbed it while it was in it's PENDING state.

Is there anyway to do this in one? So I want the result from SELECT and the UPDATE to happen on the fly.

Thanks.

tbh1
  • 651
  • 3
  • 9
  • 14

3 Answers3

71

Why not use the RETURNING clause and process both things in one single statement:

UPDATE jobs 
    SET status='RUNNING' 
WHERE status='PENDING'
RETURNING *

That way you will get all rows that were changed by the UPDATE with a single atomic operation.

  • See similar question and answer: http://stackoverflow.com/questions/5636006/postgresql-and-locking/5638753#5638753 – Ketema Jul 20 '11 at 22:30
  • Aha! Of course, it didn't even occur to me to use RETURNING for some reason. Thanks. – tbh1 Jul 21 '11 at 13:30
  • 2
    What if what you are returning is not just one table but multiple joins with other tables as well and columns from different tables? – Righto Jul 25 '17 at 08:08
22

In general, you should do it with one UPDATE statement. The UPDATE will normally not be affected by rows that could have changed while the UPDATE statement is running, however, it's good to read up on transaction isolation levels here.

Assuming you're using default setting of Read Committed, here is what it says:

Read Committed is the default isolation level in PostgreSQL. When a transaction runs on this isolation level, a SELECT query sees only data committed before the query began;

And in regards to UPDATE:

UPDATE, DELETE, SELECT FOR UPDATE, and SELECT FOR SHARE commands behave the same as SELECT in terms of searching for target rows: they will only find target rows that were committed as of the command start time. However, such a target row may have already been updated (or deleted or locked) by another concurrent transaction by the time it is found. In this case, the would-be updater will wait for the first updating transaction to commit or roll back (if it is still in progress). If the first updater rolls back, then its effects are negated and the second updater can proceed with updating the originally found row. If the first updater commits, the second updater will ignore the row if the first updater deleted it, otherwise it will attempt to apply its operation to the updated version of the row. The search condition of the command (the WHERE clause) is re-evaluated to see if the updated version of the row still matches the search condition. If so, the second updater proceeds with its operation, starting from the updated version of the row. (In the case of SELECT FOR UPDATE and SELECT FOR SHARE, that means it is the updated version of the row that is locked and returned to the client.)

So in your scenario, one UPDATE should be fine.

Keep in mind too, that there is what's known as a SELECT FOR UPDATE statement, which will lock the rows you select. You can read about that here. A scenario where you would need to use this feature would be in a reservation system. Consider this example:

  1. Execute SELECT to find out if room XYZ is available for a reservation on date X.
  2. The room is available. Execute UPDATE query to book the room.

Do you see the potential problem here? If between steps 1 and 2 the room gets booked by another transaction, then when we reach step 2 we are operating on an assumption which is no longer valid, namely, that the room is available.

However, if in step 1 we use the SELECT FOR UPDATE statement instead, we ensure that no other transaction can lock that row, so when we go to UPDATE the row, we know it's safe to do so.

But again, in your scenario, this SELECT FOR UPDATE isn't needed, because you are doing everything in one statement and aren't checking anything ahead of time.

dcp
  • 54,410
  • 22
  • 144
  • 164
7
begin;
select * 
from jobs 
where status='pending'
for update
;
update jobs 
set status='running' 
where status='pending';
commit;
Clodoaldo Neto
  • 118,695
  • 26
  • 233
  • 260
  • I was thinking those statements are executed as a single unit with the out put of the first statement being the input for update but I was wrong. If you added an 'AND' clause to the first statement and left it out in the second statement....You would be in for disasterous update in the database. Try this only without the 'Commit' Keyword else you will mess up your data – herbert ichama Jun 12 '20 at 10:52
  • For my case the select Query is quite complex with joins and inner joins on tables and calculations and i want the update to do just that using the values from those calculations to update the database values – herbert ichama Jun 12 '20 at 10:59