I have a simple table with a primary key. Most of the read operations fetch one row by the exact value of the key.

The data in each row maintains some relationship with the rows before and after it in key order. So when I insert a new row, I need to read the two rows between which it will go, do some computation, and then insert it.

The concern, clearly, is that another connection may add a row with a key in the same interval at the same time. If it uses exactly the same key value I am covered, because the second insert would fail; but if the key is different yet falls in the same interval, the relationship may be broken.

The solution seems to be to lock the whole table for writing when I decide to add a new row, or (if possible, which I doubt) to lock an interval of key values. Still, I'd prefer that read-only transactions not be blocked during that time.

I am using ODBC with libodbc++ wrapper for C++ in the client program and IBM DB2 free edition (although the DB choice may still change). This is what I thought of doing:

  • start the connection in the auto-commit and default isolation mode
  • when I need to add a new row, set auto-commit to false and the isolation mode to serializable
  • read the rows before and after the new key value
  • compute and insert the new row
  • commit
  • return to auto-commit and the default isolation mode
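
The steps above can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's standard `sqlite3` module in place of ODBC and DB2 so that it runs anywhere, and the table `items`, its columns, and the `midpoint` computation are hypothetical. SQLite has no per-transaction isolation setting, so `BEGIN IMMEDIATE` stands in for switching to serializable: it takes the write lock before the neighboring rows are read.

```python
import sqlite3

def insert_between(conn, new_key, compute):
    """Read both neighbors of new_key, derive the payload, insert, commit."""
    conn.execute("BEGIN IMMEDIATE")  # stand-in for "manual commit + serializable"
    try:
        prev_row = conn.execute(
            "SELECT pk, payload FROM items WHERE pk < ? ORDER BY pk DESC LIMIT 1",
            (new_key,)).fetchone()
        next_row = conn.execute(
            "SELECT pk, payload FROM items WHERE pk > ? ORDER BY pk LIMIT 1",
            (new_key,)).fetchone()
        payload = compute(prev_row, next_row)
        conn.execute("INSERT INTO items (pk, payload) VALUES (?, ?)",
                     (new_key, payload))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")  # leave no half-finished transaction behind
        raise
    return payload

# isolation_level=None puts the connection in auto-commit mode, so the
# explicit BEGIN/COMMIT above are the only transaction boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE items (pk INTEGER PRIMARY KEY, payload REAL)")
conn.executemany("INSERT INTO items VALUES (?, ?)", [(10, 1.0), (20, 3.0)])

def midpoint(prev_row, next_row):
    # Hypothetical relationship: the new payload is the neighbors' average.
    return (prev_row[1] + next_row[1]) / 2

result = insert_between(conn, 15, midpoint)
```

After the call the connection is back in auto-commit mode, which corresponds to the last step of the list.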

Will this do the job? Will other transactions be allowed to read at the same time? Are there other/better ways to do it?

BTW, I don't see a way to specify a read-only transaction in the libodbc++ interface. Is it possible in ODBC?

EDIT: thanks for the very useful answers, I had trouble selecting one.

davka

3 Answers

If your database is in SERIALIZABLE mode, you won't have any issues at all. Given a key K, to get the previous and next keys you have to run the following queries:

select key from keys where key > K order by key limit 1;      # next key, call it M
select key from keys where key < K order by key desc limit 1; # previous key, call it I

The above works in MySQL. These equivalent queries work in DB2 (from the comments):

select key from keys where key = (select min(key) from keys where key > K);
select key from keys where key = (select max(key) from keys where key < K);
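
To check that the two forms really return the same neighbors, here is a small sketch using Python's standard `sqlite3` module. SQLite is an assumption chosen only because it is easy to run; it says nothing about how MySQL or DB2 lock the rows. The column is named `k` here and the sample keys are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE keys (k INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO keys VALUES (?)", [(3,), (8,), (12,), (20,)])

def neighbors_limit(K):
    """ORDER BY ... LIMIT 1 form (the MySQL-style queries)."""
    nxt = conn.execute(
        "SELECT k FROM keys WHERE k > ? ORDER BY k LIMIT 1", (K,)).fetchone()
    prv = conn.execute(
        "SELECT k FROM keys WHERE k < ? ORDER BY k DESC LIMIT 1", (K,)).fetchone()
    return prv, nxt

def neighbors_minmax(K):
    """min()/max() subquery form (the DB2-style queries)."""
    nxt = conn.execute(
        "SELECT k FROM keys WHERE k = (SELECT min(k) FROM keys WHERE k > ?)",
        (K,)).fetchone()
    prv = conn.execute(
        "SELECT k FROM keys WHERE k = (SELECT max(k) FROM keys WHERE k < ?)",
        (K,)).fetchone()
    return prv, nxt

inside = neighbors_limit(10), neighbors_minmax(10)
past_end = neighbors_limit(25), neighbors_minmax(25)
```

Both forms return no row (`None`) when K has no neighbor on one side, so the calling code has to handle that case either way.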

The first query sets up a range lock that prevents other transactions from inserting a key greater than K and less than or equal to M.

The second query sets up a range lock that prevents other transactions from inserting a key less than K and greater than or equal to I.

The unique index on the primary key prevents K from being inserted twice. So you're completely covered.
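
As a toy model of the coverage described above (pure Python, not real database behavior; the function name and sample keys are hypothetical), the three protections combine like this:

```python
def blocks_insert(x, I, K, M):
    """Would a concurrent insert of key x be stopped while K is being added?

    I and M are the previous and next existing keys around K.
    """
    in_upper_range = K < x <= M   # range lock taken by the first query
    in_lower_range = I <= x < K   # range lock taken by the second query
    duplicate = (x == K)          # rejected by the unique primary-key index
    return in_upper_range or in_lower_range or duplicate

# Adding K=10 between I=8 and M=12:
inside_results = [blocks_insert(x, 8, 10, 12) for x in (9, 10, 11)]
outside_results = [blocks_insert(x, 8, 10, 12) for x in (7, 13)]
```

Every key from I to M is covered by one of the three conditions, while keys outside that interval are left alone, which is why inserts elsewhere in the table can proceed.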

This is what transactions are about; so you can write your code as if the entire database is locked.

Note: This requires a database that supports true serializability. Fortunately, DB2 does. Other DBMSs that support true serializability: SQL Server and MySQL/InnoDB. DBMSs whose SERIALIZABLE mode is actually snapshot isolation: Oracle, and PostgreSQL before version 9.1.

Seun Osewa
  • What is "true serializability"? – Quassnoi Nov 11 '10 at 22:30
  • Thanks. Unfortunately, DB2 does not support `LIMIT 1`. I use `ORDER BY` and take the first row. Would it lock the entire table? I thought of changing it to `SELECT * from mytable WHERE key=(SELECT max(key) FROM mytable where key – davka Nov 14 '10 at 10:44
  • Just fixed my queries; forgot the ORDER BY clause. the '''SELECT max(key) FROM mytable where key – Seun Osewa Nov 15 '10 at 01:59
  • `select key from keys where key > K order by key fetch first 1 rows only` should also work in DB2. – Seun Osewa Nov 15 '10 at 02:07
  • @Quassnoi `Snapshot isolation is called "serializable" mode in Oracle and PostgreSQL versions prior to 9.1, which may cause confusion with the "real serializability" mode. There are arguments both for and against this decision; what is clear is that users must be aware of the distinction to avoid possible undesired anomalous behavior in their database system logic.` from [Wiki](http://en.wikipedia.org/wiki/Snapshot_isolation) – elif Aug 30 '12 at 11:38

If your database and storage engine allow that, you should issue SELECT FOR UPDATE for both rows you are trying to insert between.

This will conflict with any concurrent SELECT FOR UPDATE.

The downside is that a lock of rows 10 and 12 (to insert 11) will also prevent selecting 8 and 10 (to insert 9).
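
The overlap can be seen in a toy model (pure Python, no database involved; the helper names and sample keys are hypothetical, and the new key is assumed not to exist yet). Each insert locks its two neighboring rows, and two inserts conflict when those sets share a row:

```python
from bisect import bisect_left

def rows_to_lock(sorted_keys, new_key):
    """Existing keys directly before and after new_key (the rows to lock)."""
    pos = bisect_left(sorted_keys, new_key)
    prev_key = sorted_keys[pos - 1] if pos > 0 else None
    next_key = sorted_keys[pos] if pos < len(sorted_keys) else None
    return {k for k in (prev_key, next_key) if k is not None}

def conflict(sorted_keys, a, b):
    """True if inserting a and inserting b need a lock on the same row."""
    return bool(rows_to_lock(sorted_keys, a) & rows_to_lock(sorted_keys, b))

existing = [4, 6, 8, 10, 12]
locks_for_11 = rows_to_lock(existing, 11)
locks_for_9 = rows_to_lock(existing, 9)
adjacent_clash = conflict(existing, 11, 9)   # both need row 10
distant_clash = conflict(existing, 11, 5)    # disjoint neighbors
```

Only inserts into adjacent gaps contend; inserts further apart lock disjoint rows.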

InnoDB in MySQL can also place a next-key lock on the index, that is, a lock on an index record plus the adjacent gap.

In this case, you would only need to issue a SELECT FOR UPDATE on the first row, so another transaction could concurrently insert a row before it.

However, this requires forcing the index and providing a range condition on the index, which may or may not be possible depending on your query.

Quassnoi
  • Thanks a lot! A few clarifications, please: I suppose I need to be in the "manual commit" mode, right? Is the default isolation mode (Read Committed) OK for this? Will other `SELECT` (not for update) statements be able to read the locked rows? – davka Nov 11 '10 at 14:31
  • @davka: manual commit, definitely. For `InnoDB`, the default isolation mode is `REPEATABLE READ`, which is required for gap locks; if you don't want them, any isolation mode will do. Concurrent statements without a `FOR UPDATE` clause will be able to see the locked records (in `SQL Server`, you'll need to enable `SNAPSHOT ISOLATION` for that). – Quassnoi Nov 11 '10 at 14:35
  • The SQL standard recommends SERIALIZABLE, though. You won't need to do a SELECT FOR UPDATE in that case. – Seun Osewa Nov 11 '10 at 20:39
  • @Seun: how would you lock the records in `MVCC` engines (like `Oracle` or `PostgreSQL`) without `SELECT FOR UPDATE`? – Quassnoi Nov 11 '10 at 20:42
  • Oracle and PostgreSQL's SERIALIZABLE modes are deliberately broken, so you need SELECT FOR UPDATE, but InnoDB is also a MVCC engine and it doesn't need SELECT FOR UPDATE, so I don't think MVCC is the problem. – Seun Osewa Nov 12 '10 at 22:54
  • @Seun: `InnoDB` is `MV`, but not `MVCC` :) It does use multiple versions of the records but not for concurrency control. The concurrency is still controlled through a dedicated lock table and readers can block writers and vice versa (namely, in `SERIALIZABLE` mode). – Quassnoi Nov 12 '10 at 23:14

Your general approach is correct. But you should use a SELECT statement that covers the two rows and all the possible rows in between. For example:

SELECT * FROM MYTABLE WHERE PKCOL BETWEEN 6 AND 10

In database systems with pessimistic locking and the serializable transaction isolation level, this SELECT statement should prevent the insertion of new rows that would change the result of the SELECT.

fredt