2

Foreign keys may be the best approach for this problem. However, I'm trying to learn about table locking/transactions, and so I'm hoping that we can ignore them for the moment.

Let's pretend that I have two tables in an InnoDB database: categories and jokes; and that I'm using PHP/MySQLi to do the work. The tables look like so:

CATEGORIES
id (int, primary, auto_inc)  |  category_name (varchar[64])
============================================================
1                               knock, knock

JOKES
id (int, primary, auto_inc)  |  category_id (int)  | joke_text (varchar[255])
=============================================================================
empty

Here are two functions, each of which is being called by a different connection, at the same time. The calls are: delete_category(1) and add_joke(1,"Interrupting cow. Interrup-MOOOOOOOO!")

function delete_category($category_id) {

    // only delete the category if there are no jokes in it
    $query = "SELECT id FROM jokes WHERE category_id = '$category_id'";
    $result = $conn->query($query);

    if ( !$result->num_rows ) {
        $query = "DELETE FROM categories WHERE id = '$category_id'";
        $result = $conn->query($query);
        if ( $conn->affected_rows ) {
            return true;
        }
    }

    return false;
}

function add_joke($category_id,$joke_text) {

    $new_id = -1;

    // only add the joke if the category exists
    $query = "SELECT id FROM categories WHERE id = '$category_id'";
    $result = $conn->query($query);

    if ( $result->num_rows ) {

        $query = "INSERT INTO jokes (joke_text) VALUES ('$joke_text')";
        $result = $conn->query($query);

        if ( $conn->affected_rows ) {
            $new_id = $conn->insert_id;
            return $new_id;
        }
    }

    return $new_id;
}

Now, if the SELECT statements from both functions execute at the same time, and proceed from there, delete_category will think it's okay to delete the category, and add_joke will think it's okay to add the joke to the existing category, so I'll get an empty categories table and an entry in the joke table that references a non-existent category_id.

Without using foreign keys, how would you solve this problem?

My best thought so far would be to do the following:

1) "LOCK TABLES categories WRITE, jokes WRITE" at the start of delete_category. However, since I'm using InnoDB, I'm quite keen to avoid locking entire tables (especially main ones that will be used often).

2) Making add_joke a transaction and then doing "SELECT id FROM categories WHERE id = '$category_id'" after inserting the record as well. If it doesn't exist at that point, rollback the transaction. However, since the two SELECT statements in add_joke might return different results, I believe I need to look into transaction isolation levels, which I'm not familiar with.

It seems to me that if I did both of those things, it should work as expected. Nevertheless, I'm keen to hear more informed opinions. Thanks.

EleventyOne
  • 7,300
  • 10
  • 35
  • 40
  • So you are trying to add a joke to a category that is currently empty? And since you are also deleting empty categories its deleting it first? Is this totally hypothetical to try to learn about locking, or is this actually happening? – robz228 Aug 09 '13 at 16:42
  • That's right. If the code is executed in a certain order, the category can be removed (cuz no jokes) but at the same time the joke can be added (cuz category is still there). This relates to some old code that I wrote, so it could happen, and I'm trying to use it to learn about locking/transactions. – EleventyOne Aug 09 '13 at 16:48
  • okay, thats why i was confused. just make sure add_joke() is always executed before delete_category() then and all of this can be avoided. – robz228 Aug 09 '13 at 16:54
  • Each function is being called from a different connection. If you have an idea for HOW to avoid the problem, please post an answer. – EleventyOne Aug 09 '13 at 17:30

1 Answers1

2

You can DELETE a category only if is no matching joke:

DELETE c FROM categories AS c
LEFT OUTER JOIN jokes AS j ON c.id=j.category_id
WHERE c.id = $category_id AND j.category_id IS NULL;

If there are any jokes for the category, the join will find them, and therefore the outer join will return a non-null result. The condition in the WHERE clause eliminates non-null results, so the overall delete will match zero rows.

Likewise, you can INSERT a joke to a category only if the category exists:

INSERT INTO jokes (category_id, joke_text)
SELECT c.id, '$joke_text'
FROM categories AS c WHERE c.id = $category_id;

If there is no such category, the SELECT returns zero rows, and the INSERT is a no-op.

Both of these cases create a shared lock (S-lock) on the categories table.

Demonstration of an S-lock:

In one session I run:

mysql> INSERT INTO bar (i) SELECT SLEEP(600) FROM foo;

In second session I run:

mysql> SHOW ENGINE INNODB STATUS\G
. . .
---TRANSACTION 3849, ACTIVE 1 sec
mysql tables in use 2, locked 2
2 lock struct(s), heap size 376, 1 row lock(s)
MySQL thread id 18, OS thread handle 0x7faefe7d1700, query id 203 192.168.56.1 root User sleep
insert into bar (i) select sleep(600) from foo
TABLE LOCK table `test`.`foo` trx id 3849 lock mode IS
RECORD LOCKS space id 22 page no 3 n bits 72 index `GEN_CLUST_INDEX` of table `test`.`foo` trx id 3849 lock mode S

You can see that this creates an IS-lock on the table foo, and an S-lock on one row of foo, the table I'm reading from.

The same thing happens for any hybrid read/write operations such as SELECT...FOR UPDATE, INSERT...SELECT, CREATE TABLE...SELECT, to block the rows being read from being modified while they are needed as a source for the write operation.

The IS-lock is a table-level lock that prevents DDL operations on the table, so no one issues DROP TABLE or ALTER TABLE while this transaction is depending on some content in the table.

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828