I have a Python script that retrieves a batch of rows from a table in a remote PostgreSQL database, does some processing, and then stores the results back in the database. The script will run concurrently on several different machines, so I need to make sure that two instances never retrieve the same rows from the table.
I could use SELECT ... FOR UPDATE, but I would still need to record somewhere, perhaps in a column of that table or in a separate table, that those rows are being "worked on". Otherwise, if one instance of the script retrieves a batch of rows and starts processing them, another instance could retrieve some of the same rows.
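For concreteness, this is roughly the two-step shape I have in mind. It is only a minimal sketch: psycopg2, the `tasks` table, and the `in_progress`, `id`, and `payload` columns are placeholder names for illustration, not my real code or schema.

```python
import psycopg2

def claim_batch(conn, batch_size=100):
    """Lock a batch of unclaimed rows and mark them as being worked on."""
    with conn:  # psycopg2: commit on success, roll back on exception
        with conn.cursor() as cur:
            # Step 1: select and lock a batch of rows nobody has claimed yet.
            cur.execute(
                """
                SELECT id, payload
                FROM tasks
                WHERE NOT in_progress
                LIMIT %s
                FOR UPDATE
                """,
                (batch_size,),
            )
            rows = cur.fetchall()
            # Step 2: record that these rows are in progress, so other
            # instances can filter them out with the WHERE clause above.
            if rows:
                cur.execute(
                    "UPDATE tasks SET in_progress = true WHERE id = ANY(%s)",
                    ([row_id for row_id, _ in rows],),
                )
    return rows

# Hypothetical connection string; the real one points at the remote database.
conn = psycopg2.connect("dbname=mydb host=db.example.com user=worker")
batch = claim_batch(conn)
```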
What I need is a way to retrieve a batch of rows AND mark them as in progress, all in one atomic step, so no other instance can pick them up.
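Something like the following single statement is what I am picturing, using UPDATE ... RETURNING to do the marking and the retrieval together, but I do not know whether it is actually safe when several instances run it at the same time, which is essentially my question (same placeholder names as above):

```python
CLAIM_SQL = """
    UPDATE tasks
    SET in_progress = true
    WHERE id IN (
        SELECT id
        FROM tasks
        WHERE NOT in_progress
        LIMIT %s
    )
    RETURNING id, payload
"""

def claim_batch_atomically(conn, batch_size=100):
    # One statement: mark a batch as in progress and hand those rows back.
    with conn, conn.cursor() as cur:
        cur.execute(CLAIM_SQL, (batch_size,))
        return cur.fetchall()
```

Is there a standard way to do this kind of atomic claim in PostgreSQL, or a better pattern for distributing batches of rows among several worker machines?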