
Hello and happy New Year to all of you! :) I'm working on a piece of software (Java, backend) that has 3 components, each an independent Java application (you can even think of them as separate threads): 1) DB data importer 1, 2) DB data importer 2, 3) DB data exporter.

Now, all of those apps work with the same MySQL database, using the same two InnoDB tables - "Orders" and "Items". Each order may have zero or many items. All DB operations (queries and inserts) are done through stored procedures. This is what my app does:

Application 1 or 2 (remember, they are independent) starts importing "an order" every few seconds. The "order" goes into the "Orders" table and each "item" goes into the "Items" table. a) The application takes an order, imports it into the "Orders" table and marks it as "not ready for export". b) Then the application imports all the items (if any) for that order. c) In the end, when all items are imported, the order is marked as "ready for export" (I have a dedicated column for that flag).
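
Roughly, in SQL terms, one import boils down to something like this (just a sketch; the column names here are illustrative, since the real work is done in stored procedures):

 /* a) insert the order, initially flagged as not ready for export */
 INSERT INTO Orders (order_number, ready_for_export) VALUES (???order_number???, 0);

 /* b) insert each item belonging to that order */
 INSERT INTO Items (order_id, item_data) VALUES (???order_id???, ???item_data???);

 /* c) once every item is in, flip the flag so the exporter can pick the order up */
 UPDATE Orders SET ready_for_export = 1 WHERE id = ???order_id???;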

Application "3" does this every few seconds: a) checks for "ready for export" orders. b) selects 50 "ready for export" orders and marks them as "in export". c) exports those orders and their items into a file. d) marks all orders that are "in export" as "exported".

Now, as you can tell, even with a dedicated column that tells which orders are ready to be exported and which are still being imported and shouldn't be exported yet, I'm getting some deadlocks and race conditions.

Do you know of a simple and safe mechanism I can use to implement this "producer - consumer" system without locking the entire "Orders" and "Items" tables (as they are actively used by other parts of the software)? I guess using a 3-state column is not a good idea, considering that each table may have millions of rows and an index on such a 3-state column is inefficient. There has to be something better :)

guest86
  • Quite honestly, if you have primary keys on each table, and a foreign key defined between the tables, and an index on the state column I don't think you should be getting deadlocks or race conditions. I'm assuming that orders are never being processed by the importer and the exporter at the same time, but from your description they shouldn't be. – SeanN Jan 02 '16 at 03:22
  • At the risk of asking the obvious, are you using transactions and `SELECT ... FOR UPDATE` to lock the *rows* you're manipulating? – Michael - sqlbot Jan 02 '16 at 03:52
  • No, I don't use "FOR UPDATE" and I'm using transactions only in a few procedures. "FOR UPDATE" contributes to deadlocking :) – guest86 Jan 02 '16 at 12:25

1 Answer


Your processing workflow includes this step:

Application "3" does this every few seconds: a) checks for "ready for export" orders. b) selects 50 "ready for export" orders and marks them as "in export". c) exports those orders and their items into a file. d) marks all orders that are "in export" as "exported".

It matters how you do this step.

If you do it with something like the procedure below, it should run cleanly.

In my opinion, 50 is too big a batch to do at once. I'd start by doing them one at a time, and then try making the batches a little larger.

First, mark a batch of orders as 'exporting' in a single query. Because that single UPDATE is atomic, you won't have to worry about transactions for this step.

UPDATE orders 
   SET status = 'exporting'
 WHERE status = 'ready-for-export'
 ORDER BY id
 LIMIT 50  

Then, in a loop, do this to process all the orders in your batch:

 /* get an order to process */
 SELECT id, whatever, whatever 
   FROM orders
   WHERE status = 'exporting'
   ORDER BY id
   LIMIT 1

 /* if no order came back from this query, you are done with your batch */

 /* process the order */

 /* mark the order done */
 UPDATE orders SET status = 'exported' WHERE id = ???id??? AND status='exporting'

This grabs a batch of orders (using LIMIT 50) and marks them 'exporting'. It then chews through them one by one.

Now, to avoid deadlocks on the order table, queries in the rest of your software need to include WHERE status <> 'exporting' or the equivalent, so they'll ignore the rows being exported.
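
For example, a hypothetical importer-side statement following that pattern could look like this (the status values match the ones above; your real statements live in stored procedures):

 /* guard against touching a row the exporter is currently working on */
 UPDATE orders
    SET status = 'ready-for-export'
  WHERE id = ???order_id???
    AND status <> 'exporting'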

It's possible you're getting deadlocks on the Items table. You can avoid those by always doing single-query operations like this:

 UPDATE items SET number_on_hand = number_on_hand - ???order_quantity??? WHERE id = ???item_id???

You might also look into MySQL's version of UPSERT. It's called INSERT ... ON DUPLICATE KEY UPDATE.
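
A minimal sketch of that upsert pattern, assuming (hypothetically) that items has a UNIQUE key on (order_id, sku) and a quantity column:

 /* insert the item, or add to the quantity if that (order_id, sku) row already exists */
 INSERT INTO items (order_id, sku, quantity)
 VALUES (???order_id???, ???sku???, ???quantity???)
     ON DUPLICATE KEY UPDATE quantity = quantity + VALUES(quantity);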

But, the bottom line is this: If you have a busy system, you're going to have to use transactions. The classic way of avoiding deadlocks is to always lock resources in the same order. For example,

  1. Either always lock the Orders table row first, or always lock the Items table rows first -- never sometimes one way and sometimes the other.

  2. If you have to lock several rows in one table, always lock them in the same order -- that is, use an ORDER BY id clause in the SELECT ... FOR UPDATE query.
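
Put together, a sketch of a transaction that follows both rules might look like this (assuming both tables have an integer id primary key and items has an order_id column, as described in the question):

  START TRANSACTION;

  /* rule 1: lock the parent order row first */
  SELECT id FROM orders WHERE id = ???order_id??? FOR UPDATE;

  /* rule 2: then lock that order's item rows, always in the same (id) order */
  SELECT id FROM items WHERE order_id = ???order_id??? ORDER BY id FOR UPDATE;

  /* ... do the work on the locked rows here ... */

  COMMIT;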

Notice that an index on the orders table, on (status,id), will help optimize the queries to put orders into batches and release them from batches.
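
If it isn't there already, that index is a one-liner (the index name is arbitrary):

 CREATE INDEX orders_status_id ON orders (status, id);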

O. Jones