2

I have a single table that I store information that I need to batch and send out every few hours. Each row in this table corresponds to one item I need to batch. Every few hours, I do a query for all new rows where the state = 'New'.

When I start batching these rows, I want to set the state of the rows to 'Batching', and once I've finished batching them, I want to set the state to 'Finished'.

Will the following SQL be more efficient:

   UPDATE BatchTable SET state = <random number> where state = 'New';
   SELECT * from BatchTable where state = <random number>;

By doing an UPDATE first, would it avoid the situation where two threads do a SELECT and then possibly have to deal with being blocked while another thread is doing a batching/updating of the rows? If I do an UPDATE, will it avoid blocking from other threads?

Is there an entirely different, better way to do this?

Foggzie
  • 9,691
  • 1
  • 31
  • 48
steve8918
  • 1,820
  • 6
  • 27
  • 38
  • Which RDBMS are you using? Can't you use a trigger to put records into `BatchTable` whenever a new record is inserted? – Yuck Feb 06 '13 at 23:24
  • This design is flawed in that a record could be `'New'` and then change to another `state` before you poll the data to include it in `BatchTable`. – Yuck Feb 06 '13 at 23:25
  • Consider use of "select for update", here the reference for mySql and Oracle : http://dev.mysql.com/doc/refman/5.0/en/innodb-locking-reads.html http://stackoverflow.com/questions/5847228/oracle-select-for-update-behaviour – Osy Feb 06 '13 at 23:27
  • @Yuck - I don't think it's flawed because I'm assuming that each thread that is batching gets all the rows that are currently 'New' when it issues the UPDATE. If subsequent rows are added, then it will be handled on the next batch call. If another thread sets the state to then it's assumed that thread will batch that row out, so this avoids duplication. – steve8918 Feb 06 '13 at 23:28
  • @Osy - If I have multiple threads doing a SELECT for update, won't that cause blocking on the other threads until the first thread commits? – steve8918 Feb 06 '13 at 23:34
  • I suppose you mean : UPDATE BatchTable SET state = where state = 'New'; SELECT * from BatchTable where state = 'New'; – twillouer Feb 06 '13 at 23:36
  • @twillouer - no, I just want to get all the rows where the state = , because as far as that thread is concerned, those are the only rows it will handle. As Syn123 said, if new rows come in, I don't want them to be inadvertently added to the batch. – steve8918 Feb 06 '13 at 23:46
  • yes, my bad. You can see "optimistic locking" for your threading problem, and "select for update" for the performance. – twillouer Feb 07 '13 at 00:07

1 Answers1

1

I see your concern: while the select statement is executing, new rows may be added that get caught by the update but not the select. So... your method is the most straight forward here by introducing a 3rd value to the test. Another thing you can play with, and my apologies because I haven't used it a whole lot is: the OUTPUT clause: http://msdn.microsoft.com/en-us/library/ms177564.aspx . This would allow you do do an update from the select statement and get values back I believe all in one statement.

RandomUs1r
  • 4,010
  • 1
  • 24
  • 44