0

I'm working on a project using a MySQL database as the back-end (accessed from PHP). Sometimes, I select a row, do some operations on it, and then update the record in the database.

I am worried that another user could have initiated a similar process on the same row right after the first select, and his changes could overwrite some of the changes the first user did (because the second user's select did not yet include those changes).

Is this an actual problem? Should I lock the table, and won't this severely impact my application's performance? Any other solutions?

Just to be thorough with my information, I also have some CRON jobs running that could also be modifying the same data.

Thanks!

SwammiM
  • 463
  • 6
  • 15
  • Use transactions, making sure that you're using an engine that is ACID compliant – Mark Baker May 31 '14 at 15:10
  • Thanks for the suggestion. So basically, I just set the engine to InnoDB, and then do "START TRANSACTION" before I do my select, and "COMMIT" after I do my update? Anything else I need to take into consideration, or is that it? – SwammiM May 31 '14 at 15:26
  • If you are updating more than one table in the transaction. Try and access the tables you are updating, in the same order, within the transactions. This reduces the chance of deadlocks. – Ryan Vincent May 31 '14 at 15:29
  • To address the issue of rows that may be updated by other users while you are checking them. Consider doing a 'select for update' at the start of the transaction on the rows that you may change. even if you do not update them, you can be certain that they cannot be changed while you are using them. – Ryan Vincent May 31 '14 at 15:46

1 Answers1

1

I can think of two solutions, other than explicitly using transactions:

  • Use SELECT .. FOR UPDATE : http://dev.mysql.com/doc/refman/5.0/en/innodb-locking-reads.html
  • Manually change a value so the row is not select by other queries:

    SET @update_id := 0;

    UPDATE table_name SET status = 'IN_PROCESS', id = (SELECT @update_id := id) WHERE status = 'WAITING' AND [your condition] LIMIT 1;

    SELECT @update_id;

Here, the rows to be selected must have the value of status="WAITING". And when this query runs, it selects the ID, and changes the value of 'status', so the row can't be selected by other queries.

Jehad Keriaki
  • 545
  • 5
  • 10
  • i thought that 'select... for update' only made sense inside a transaction due to resources being freed? – Ryan Vincent May 31 '14 at 18:35
  • As regards your second approach, imagine 10 updates happen at the same time. How do you know which one owns the row? I suggest that 'updates' are not 'atomic' outside of transactions or some other 'locking' strategy. – Ryan Vincent May 31 '14 at 22:53
  • Yes, 'Select .. for update' should be inside a transaction, or "auto_commit=0", which implicitly starts causes it to start a transaction. – Jehad Keriaki Jun 01 '14 at 02:37
  • As for the second point, UPDATE is atomic. When there are 10 connections asking for a row, the connection that gets the 'first' row with 'status="WAITING"', and change it before it releases the lock on the row to 'IN_PROCESS', so the next request can't get this exact row because 'status' field doesn't satisfy the condition. – Jehad Keriaki Jun 01 '14 at 02:41
  • This seems like a good answer to accept. I do have an additional question : When I use SELECT FOR UPDATE, what exactly happens to other processes that try to access/modify that row? Will they just wait until they become available, or will they fail? Thanks! – SwammiM Jun 02 '14 at 12:45
  • When a row is selected for update, it is locked, and other processes will wait till the lock is released to continue. This is valid [I tested it] on InnoDB tables. – Jehad Keriaki Jun 03 '14 at 03:56