
I'm building an application that involves significant image processing. We're building it in a distributed manner, with an arbitrary number of rendering machines receiving render requests from a RabbitMQ exchange.

Because a request for an image might come in while a rendering is already in progress, and because I don't want two render servers duplicating work, I created a boolean column called `is_rendering` in the MySQL `images` table.

When a render server receives a render request, it follows a series of steps like this:

  1. `SELECT ... FOR UPDATE` the image row
  2. If `is_rendering` is already true, abort the render request
  3. Set `is_rendering` to true and commit the transaction
  4. Render the image and store the thumbnail in a globally accessible store
  5. Set `is_rendering` back to false and return

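Steps 1–3 can also be collapsed into a single atomic `UPDATE` guarded by the current flag value, which saves the explicit `SELECT ... FOR UPDATE` round trip. Here's a minimal sketch of that claim/release cycle; it uses the stdlib `sqlite3` in place of MySQL purely so it's self-contained, and the `images` schema is a stand-in for whatever the real table looks like:

```python
import sqlite3

def try_claim(conn, image_id):
    """Atomically claim an image for rendering.

    Returns True if this worker won the claim, False if another
    worker already holds it (steps 1-3 above in one statement).
    """
    cur = conn.execute(
        "UPDATE images SET is_rendering = 1 "
        "WHERE id = ? AND is_rendering = 0",
        (image_id,),
    )
    conn.commit()
    return cur.rowcount == 1  # exactly one row changed -> we hold the lock

def release(conn, image_id):
    # Step 5: clear the flag once the thumbnail is stored.
    conn.execute("UPDATE images SET is_rendering = 0 WHERE id = ?", (image_id,))
    conn.commit()

# Demo with an in-memory database standing in for the shared MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, is_rendering INTEGER DEFAULT 0)")
conn.execute("INSERT INTO images (id) VALUES (1)")
print(try_claim(conn, 1))  # True  - first worker claims the image
print(try_claim(conn, 1))  # False - duplicate request is rejected
release(conn, 1)
```

The same pattern works in MySQL/InnoDB because a single `UPDATE` is atomic; the affected-row count tells you whether you won the race.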
It definitely works, but these frequent database updates seem a little silly, and I'm worried about the edge case where a render server fails in the middle of a rendering and leaves `is_rendering` stuck at true, preventing that image from ever being rendered again. The solution I'm considering is to change the `is_rendering` column from a `tinyint(1)` to a `DATETIME` field, storing the time of locking as the "true" value and `NULL` as the "false" value. A periodic health-check service could then select all rows whose `is_rendering` timestamp is older than a certain cutoff and release those locks.
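With the timestamp scheme, the staleness check can be folded into the claim itself, so the periodic sweeper is only a safety net rather than the sole recovery path. A sketch of that claim-with-expiry query, again using stdlib `sqlite3` as a self-contained stand-in for MySQL (the `rendering_started` column name and the timeout value are assumptions; in MySQL the cutoff would be expressed with `NOW() - INTERVAL ... SECOND`):

```python
import sqlite3
import time

LOCK_TIMEOUT = 300  # seconds; assumed generous upper bound on one render

def try_claim_with_expiry(conn, image_id, now=None):
    """Claim the image unless a *fresh* lock is held.

    A NULL rendering_started means unlocked; a timestamp older than
    LOCK_TIMEOUT is treated as an abandoned lock from a dead worker
    and is silently taken over.
    """
    now = time.time() if now is None else now
    cutoff = now - LOCK_TIMEOUT
    cur = conn.execute(
        "UPDATE images SET rendering_started = ? "
        "WHERE id = ? AND (rendering_started IS NULL OR rendering_started < ?)",
        (now, image_id, cutoff),
    )
    conn.commit()
    return cur.rowcount == 1

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, rendering_started REAL)")
conn.execute("INSERT INTO images (id) VALUES (1)")
t0 = 1000.0
print(try_claim_with_expiry(conn, 1, now=t0))        # True  - lock acquired
print(try_claim_with_expiry(conn, 1, now=t0 + 10))   # False - lock still fresh
print(try_claim_with_expiry(conn, 1, now=t0 + 600))  # True  - stale lock reclaimed
```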

Is this a sane approach to this problem, or are there other more elegant approaches I should be considering?

Andrew
  • I have implemented the datetime field lock approach several times successfully – ilanco Apr 28 '12 at 22:40
  • I don't see any other way of locking, other than introducing such a column. Only I would make it a status, like `0 = not_rendered, 1 = rendering, 2 = rendered`. But maybe I misunderstood your workflow a bit. And of course, using `InnoDB` transactions will ensure that only one concurrent query can set an image to rendering state. – Dmytro Shevchenko Apr 28 '12 at 22:40
  • Whether the approach itself is sensible I don't know but as an alternative to a column a separate 1:1 `is_rendering (int)` table seems more manageable to me. Just add/remove records and you won't have to scan all image records in the health check. If you need the image itself at the same time, though, it may be extra work. – mkjeldsen Apr 28 '12 at 22:43
  • @Quail well, having a separate `(id, is_rendering)` index is the same as having a separate table, in regards to performance. – Dmytro Shevchenko Apr 28 '12 at 22:44
  • @Shedal - These approaches still don't solve the edge case of a server dying in the middle of rendering and leaving an image locked. If a server dies, there's no clear way to have the lock expire, short of embedding the date in it and having a secondary service that looks at all currently held locks and clears any that have been held past a reasonable render time. – Andrew Apr 28 '12 at 22:47
  • @Andrew you can also have the last modified (last locked) date as a separate column. – Dmytro Shevchenko Apr 28 '12 at 22:47
  • What's the application written in? This seems like the perfect use for a semaphore, which is stored in memory (or disk if using a file lock). If the system fails, locks will be released. Only thing that would hang it would be a hung thread that has the lock. – Ami Apr 29 '12 at 03:24

2 Answers


As I understand your problem, your first approach is also correct, provided that: 1) your table type is InnoDB, and 2) you wrap the update in a transaction, because the transaction will roll back if anything breaks during the update.

That said, your second approach is the better one if you aren't meeting the points above.

StormMan
  • I am using InnoDB. I don't think it's wise to hold onto the transaction for the entire rendering. Rendering can take 30+ seconds to complete and I'm using connection pooling. Holding onto the transaction means I have to tie up a connection that could be put to use elsewhere. – Andrew Apr 28 '12 at 22:49

I've gone ahead and changed the implementation to use a `DATETIME` column.

I was really curious to hear whether this was a poor use of MySQL in general. From what I've researched, I could go with something like Hadoop's ZooKeeper: http://zookeeper.apache.org/doc/r3.1.2/recipes.html or something like Google's internal Chubby system. Since this is just the first iteration of the service, I'm going to stick with MySQL.

From what I've read here and online, using MySQL to produce a global lock isn't a terrible idea, and changing the column to a `DATETIME`, while a little obtuse, is a reasonable way to implement an expiration policy that handles the odd edge case of a machine shutting down in the middle of processing a job.
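The expiration policy mentioned above boils down to a single sweep query run by the health-check service. A sketch of that sweep, with stdlib `sqlite3` standing in for MySQL and the `rendering_started` column and timeout value as assumptions:

```python
import sqlite3

STALE_AFTER = 300  # seconds; assumed maximum reasonable render time

def release_stale_locks(conn, now):
    """Health-check sweep: clear any lock held longer than STALE_AFTER.

    Returns the number of abandoned locks released, which is also a
    useful metric to alert on (it means a render server died mid-job).
    """
    cur = conn.execute(
        "UPDATE images SET rendering_started = NULL "
        "WHERE rendering_started IS NOT NULL AND rendering_started < ?",
        (now - STALE_AFTER,),
    )
    conn.commit()
    return cur.rowcount

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, rendering_started REAL)")
conn.executemany(
    "INSERT INTO images (id, rendering_started) VALUES (?, ?)",
    [(1, 1000.0), (2, 1290.0), (3, None)],  # image 1's lock is stale at t=1400
)
print(release_stale_locks(conn, now=1400.0))  # 1 - only image 1's lock is released
```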

Holding onto a transaction-level lock would be another approach, but it doesn't make sense when a single server is running many threads against a small connection pool: it would tie up a connection unnecessarily for the duration of the render, although it does have built-in expiration when the client connection is lost.

Andrew