2

I am having great difficulty in piecing all the information together. So let me explain as much detail as I can.

Working on a spring-jdbc, spring-transaction, MySQL db application, I want to be able to lock rows of data from a table while selecting for update. Reason being, I have business logic that transforms data at different stages. So if the application instance (on server-1) picks up record for processing, then it should not be picked up by another instance. This is somewhat similar to this question. But the answer is not acceptable to me for the same reasons as the OP had for that question.

So after carefully examining the spring-transaction reference doc and learning about how to configure transaction management with jdbcTemplate, my spring application set up looks as follows (only relevant part):

applicationContext.xml

...
...
<context:annotation-config />
...
<context:component-scan base-package="org.foo.bar"/>
...
<!-- Enable Annotation based Declarative Transaction Management -->
<tx:annotation-driven proxy-target-class="true" transaction-manager="transactionManager" />
<bean id="transactionManager"
      class="org.springframework.jdbc.datasource.DataSourceTransactionManager">
    <property name="dataSource" ref="dataSource" />
</bean>
...
<bean id="dataSource"
      class="org.apache.commons.dbcp2.BasicDataSource"  destroy-method="close">
    <property name="driverClassName" value="com.mysql.jdbc.Driver" />
    <property name="url" value="jdbc:mysql://dbserver:3306" />
    <property name="username" value="foo" />
    <property name="password" value="baz" />
    <property name="initialSize" value="10" />
    <property name="maxTotal" value="20"/>
</bean>

Service Class with target method: TransactionSvcImpl.java

public class TransactionSvcImpl implements TransactionSvc {
    @Autowired
    DatabaseAccessService dbAccessService;

    @Transactional(isolation = Isolation.SERIALIZABLE, propagation = Propagation.REQUIRES_NEW)
    @Override
    public List<Foo> getFooForUpdate() {
        // Below call executes a SQL like - "SELECT * FROM some_t WHERE id = 1 FOR UPDATE"
        List<foo> foos = dbAccessService.getSomeRecord();
        dbAccessService.updateTheRecord(foos, Status.PROCESSING);
        return foos;
    }
}

Now, I tested how isolation level works in MySQL by following what is described in this answer. However to make "select for update" work I had to change the following properties on the database server itself:

SET tx_isolation = 'SERIALIZABLE';
SET AUTOCOMMIT=0;

Without changing these settings, I could not get SELECT ... FOR UPDATE to work.

Once I tested it manually, I wanted to see if this has any effect when my application code runs. So while I was keeping one of mysql commandline session active, I started my application and put a breakpoint at the dbAccessService.updateTheRecord(); line. Then in the other session, I tried to select this row (no select for update, just a simple select) statement. Now I could see that the row was locked because of a previous transaction that was started by my application code.

Questions:

  1. To achieve the row locking (SELECT ... FOR UPDATE behavior) through spring-transaction do I have to change settings on the database server? Like the isolation_level and autocommit?
  2. Lets say MySQL server is running under defaults (ISOLATION_LEVEL = REPEATABLE-READS, AUTOCOMMIT=1). Now will this still work? Meaning, if I configure transaction in spring on target method as I have done above, will a row being read by one instance of application stay locked so that any other instance trying to read that row will be blocked?
  3. How does the configuration that we add on the @Transactional annotation make it to (or affect) the database? I know we can set transaction characteristics for different scopes as described in table 13.9 of the MySQL documentation here. Basically below is my imagination or guess about how all this works. Is it correct?

How it all works?

  • The class/method annotated with @Transactional will result in a proxy object being created so that all the transactional stuff is added along with businiess logic the user implemented.
  • The database connection is acquired and a database session is created.
  • For this session, based on the configuration provided in @Transactional, appropriate isolation level is set (don't know how autocommit will be affected)
  • Once the transaction is completed, session closes. Any new sessions created will set the isolation characteristics accordingly.

    1. Is there a better way for me to achieve the desired SELECT ... FOR UPDATE functionality with spring-jdbc, spring-tx? I could probably introduce my own locking mechanism like, introduce a new boolean column like selected_for_processing and instead of selecting, update the records to be selcted by setting the selected_for_processing column. In the next step I would select based on this flag and another status criteria. So this way a record will only be processed once. But I want to know if there is a conventional or configuration way to achieve this? I cannot use JPA or any spring-data-x libraries. spring-data- libraries work with springframework 5.x.x or above, but I am stuck with springframework 4.2.3, hence spring-jdbc, spring-tx.
djpanda
  • 835
  • 10
  • 20

2 Answers2

1

The thing that is making a difference is auto-commit. Normaly auto-commit is ON (each query is a transaction on it's own), and to use transactions you need to use BEGIN or START TRANSACTION in your code, do what you need to do in a transaction, and then explicitly call COMMIT.

1) Settings on the server for transaction_isolation will change the defaults on the server. But you can change them in the local session just before you begin the transaction, without changing the server wide default.

2) Yes, tx_isolation can differ between different sessions.

3) Not sure I can answer that fully as it seems to be Spring specific and I'm not fluent in that. In general, however, Spring has no real way of ensuring or implementing transactionality and locking at database level without explicitly locking entire tables, so ultimately all it can do is use underlying database features, and trying to be cleverer than the underlying database inevitably comes with an enormous performance and concurrency penalty.

Gordan Bobić
  • 1,748
  • 13
  • 16
  • Thanks for answering. It sounds a bit unrealistic to turn off auto-commit no? I mean imagine if for every crud operation you implement in your application you had to call commit explicitly? I just finished working in a Django API project and we just expect the ORM to manage commits and rollbacks for us. So are you saying, if we want to enforce explicit locking the only way to do that is by turning off auto-commit and call commit ourselves? – djpanda May 06 '20 at 11:15
  • And for question 3, forget spring. Like take any application development framework that offers working with transactions. Something in someway should be telling the db engine - "Hey I am a client connected to you and for the remainder of this session, set the isolation level to ```foo``` and some other thing to ```bar```. Is that how it happens? – djpanda May 06 '20 at 11:16
  • 1
    No, I am saying that if you need transactions, you should explicitly wrap each transaction in `START TRANSACTION ... COMMIT`. That **should** be what your framework's transaction handling does. – Gordan Bobić May 06 '20 at 11:38
  • even I thought thats what framework does. But it seems like it isn't enough. I have to explicitly set auto commit to OFF on the db server for this to work. And like you said, this is probably more specific to Spring. – djpanda May 06 '20 at 16:30
1

Auto-commit has to be off for locking to work.

If it's ON, every SQL statement is its own transaction. So your SELECT statement will acquire lock, but it will release it right away after doing the select, before your update statement is run. If your dbAccessService bean is say, a hibernate DAO, the SQL statement it issues is probably an all-columns update, similar to the behavior of PUT in REST (as opposed to PATCH). And this can lead to existing updates being entirely replaced.

Tx A Tx B
SELECT where id = 1 for update tries the same but lock is with tx A so is blocked waiting for lock
release lock (because autocommit) gets lock for select where id = 1
tries to update but got blocked (UPDATE implicitly gets lock) release lock
UPDATE foo SET col_a = val_a, col_b = val_b, status = 'PROCESSING' waiting for lock
update done, release lock UPDATE foo SET col_a = val_a, col_b = val_b, status = 'PROCESSING'

In the above table, time flows from left to right, top to bottom.

The thing to note is that, UPDATE statement does acquire lock, but the columns/values of UPDATE statement is controlled by the application, and the application in most situations would need to do a SELECT first to know what values to update, so you should get lock for the whole duration to prevent other transactions from messing with your row while you are transitioning from your SELECT to subsequent UPDATE.

  1. Most of the time you don't need to mess with DB settings, unless your DB has some custom settings. By default, Spring set autocommit off for its connection and it uses the DB default tx isolation level. The default REPEATABLE READ for MySQL is good enough for most cases. I have never needed to set my isolation level to SERIALIZABLE.

  2. Yes, @Transactional by default set autocommit to off. But you probably don't need SERIALIZABLE. To block another tx, that tx (lets call it tx B) needs to acquire lock as well.

Example:

Tx A Tx B (not acquiring lock on select)
SELECT where id = 1 where status = 'INIT' for update SELECT where id = 1 where status = 'INIT' (this select does not acquire lock hence is not be blocked)
application reads the result and prepare to issue update application reads the result and prepare to issue update
application reads the result and prepare to issue update update is blocked waiting for lock
UPDATE foo SET col_a = 'a', col_b = 'b', status = 'PROCESSING' update is blocked waiting for lock
update done, commit tx, release lock UPDATE foo SET col_a = 'another_a', col_b = 'another_b', status = 'PROCESSING'
commit tx

Now Transaction A's update is lost forever!

Some tricky scenarios

  1. You want to achieve gap lock. E.g. prevent a new insert with status = 'processing'.
  2. Both transaction are not running the same SELECT statement (e.g. tx A: 'where id = ?', tx B: 'where name = ?':

For 2), to get the 2 tx to block each other (which is what you want), you probably need to read up on lock behavior with regards to index. If both transactions are locking based on the same column that has an index, you are most likely fine. An easy way to test this is to fire up 2 CLI session:

CLI 1 CLI 2
begin; begin;
select ... for update; select ... for update;
rollback/commit; rollback/commit;

Interleave the CLI1 and CLI2 statements and see if they are blocked. If CLI2 could not progress with select statement after CLI1 has run the select statement then you have 2 queries that will block each other.

This is probably the best introduction with its nice graphics:

https://vladmihalcea.com/a-beginners-guide-to-database-locking-and-the-lost-update-phenomena/

HongYi
  • 61
  • 1
  • 1
  • 3