
I noticed that Rails can have concurrency issues when running multiple servers and would like to force my model to always lock. Is this possible in Rails, similar to the way unique constraints enforce data integrity? Or does it just require careful programming?

Terminal One

irb(main):033:0* Vote.transaction do
irb(main):034:1* v = Vote.lock.first
irb(main):035:1> v.vote += 1
irb(main):036:1> sleep 60
irb(main):037:1> v.save
irb(main):038:1> end

Terminal Two, while sleeping

irb(main):240:0* Vote.transaction do
irb(main):241:1* v = Vote.first
irb(main):242:1> v.vote += 1
irb(main):243:1> v.save
irb(main):244:1> end

DB Start

 select * from votes where id = 1;
 id | vote |         created_at         |         updated_at         
----+------+----------------------------+----------------------------
  1 |    0 | 2013-09-30 02:29:28.740377 | 2013-12-28 20:42:58.875973 

After execution

Terminal One

irb(main):040:0> v.vote
=> 1

Terminal Two

irb(main):245:0> v.vote
=> 1

DB End

select * from votes where id = 1;
 id | vote |         created_at         |         updated_at         
----+------+----------------------------+----------------------------
  1 |    1 | 2013-09-30 02:29:28.740377 | 2013-12-28 20:44:10.276601 

Other Example

http://rhnh.net/2010/06/30/acts-as-list-will-break-in-production

Chloe
  • Bonus: How do I test for this? [Integration tests](http://guides.rubyonrails.org/testing.html#integration-testing) allow multiple sessions but they are still in the same process, no? – Chloe Dec 28 '13 at 21:16
  • I maybe incorrect but rails provides this sort of functionality under: http://api.rubyonrails.org/classes/ActiveRecord/Locking.html have a read. If you want to prevent some changes from being overridden then you will want optimistic locking. – Deej Dec 28 '13 at 21:21
  • @David I read that in the [guide](http://guides.rubyonrails.org/active_record_querying.html#locking-records-for-update). Is optimistic locking the best practice? It says `This locking mechanism will function inside a single Ruby process. To make it work across all web requests, the recommended approach is to add lock_version as a hidden field to your form.` However, I'm using background processes as well as web. It doesn't say exactly how it's implemented or if there can still be race conditions. – Chloe Dec 29 '13 at 01:56
  • From my understanding of what I read it seems as though this is the approach. My understanding of locking is an example. You have two workers running a task in parallel and you don't want a particular task to override the other. This is where you would use pessimistic locking. I will provide an answer solution for you. – Deej Dec 29 '13 at 02:13
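
Regarding the bonus question about testing: integration tests do run in a single process, and the default transactional test setup hides races because nothing is committed. One rough way to exercise the race anyway (a sketch, assuming a Vote model with an integer vote column as in the question, and Rails 5's use_transactional_tests setting; on Rails 4 it is use_transactional_fixtures) is to run the two competing updates in separate threads, each on its own database connection:

    require "test_helper"

    class VoteConcurrencyTest < ActiveSupport::TestCase
      # Transactional tests would keep the record below inside an uncommitted
      # transaction that other connections cannot see, so turn them off here.
      self.use_transactional_tests = false

      test "concurrent increments are not lost" do
        vote = Vote.create!(vote: 0)

        threads = 2.times.map do
          Thread.new do
            # Each thread checks out its own connection, simulating two
            # separate servers hitting the same row.
            ActiveRecord::Base.connection_pool.with_connection do
              Vote.transaction do
                v = Vote.lock.find(vote.id)  # SELECT ... FOR UPDATE
                v.update!(vote: v.vote + 1)
              end
            end
          end
        end
        threads.each(&:join)

        assert_equal 2, vote.reload.vote
      end
    end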

4 Answers


You are correct that transactions by themselves don't protect against many common concurrency scenarios, incrementing a counter being one of them. There isn't a general way to force a lock; you have to ensure you use it everywhere necessary in your code.

For the simple counter incrementing scenario there are two mechanisms that will work well:

Row Locking

Row locking will work as long as you do it everywhere in your code where it matters. Knowing where it matters takes some experience to develop an instinct for :/. If, as in your code above, you have two places where a resource needs concurrency protection and you only lock in one of them, you will have concurrency issues.

You want to use the with_lock form; this does a transaction and takes a row-level lock (table locks obviously scale much more poorly than row locks, although for tables with few rows there is no difference, since PostgreSQL will use a table lock anyway; not sure about MySQL). It looks like this:

    v = Vote.first
    v.with_lock do
      # with_lock reloads v and holds a row lock until the block finishes
      v.vote += 1
      sleep 10  # simulate slow, concurrent work while the lock is held
      v.save
    end

The with_lock form creates a transaction, locks the row the object represents, and reloads the object's attributes, all in one step, minimizing the opportunity for bugs in your code. However, this does not necessarily help you with concurrency issues involving the interaction of multiple objects. It can work if a) all possible interactions depend on one object, and you always lock that object, and b) the other objects each only interact with one instance of that object, e.g. locking a user row and doing stuff with objects which all belong_to (possibly indirectly) that user object.
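
For example, here is a rough sketch of the (a)/(b) pattern, assuming a User that has_many :votes and has a balance column (names borrowed loosely from the asker's comment below; they are illustrative, not part of the question's schema):

    # Lock the parent user row, then work with its associated objects.
    user = User.find(user_id)        # user_id obtained elsewhere
    user.with_lock do
      total = user.votes.sum(:vote)  # all of these rows belong_to the user
      user.update!(balance: total)
    end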

Serializable Transactions

The other possibility is to use serializable transactions. Since 9.1, PostgreSQL has had "real" serializable transactions. These can perform much better than locking rows (though it is unlikely to matter in the simple counter-incrementing use case).

The best way to understand what serializable transactions give you is this: if you take all the possible orderings of all the (isolation: :serializable) transactions in your app, what happens when your app is running is guaranteed to always correspond with one of those orderings. With ordinary transactions this is not guaranteed to be true.

However, what you have to do in exchange is to take care of what happens when a transaction fails because the database is unable to guarantee that it was serializable. In the case of the counter increment, all we need to do is retry:

    begin
      Vote.transaction(isolation: :serializable) do
        v = Vote.first
        v.vote += 1
        sleep 10 # this is to simulate concurrency 
        v.save
      end
    rescue ActiveRecord::StatementInvalid => e
      sleep rand/100 # this is NECESSARY in scalable real-world code, 
                     # although the amount of sleep is something you can tune.
      retry
    end

Note the random sleep before the retry. This is necessary because failed serializable transactions have a non-trivial cost, so if we don't sleep, multiple processes contending for the same resource can swamp the db. In a heavily concurrent app you may need to gradually increase the sleep with each retry. The randomness is VERY important to avoid harmonic deadlocks: if all the processes sleep for the same amount of time, they can get into a rhythm with each other, where they are all sleeping and the system is idle, then they all try for the lock at the same time, collide, and all but one go back to sleep again.
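
A sketch of that retry loop with a capped, gradually growing, randomized sleep (the specific numbers are arbitrary tuning knobs, not recommendations):

    retries = 0
    begin
      Vote.transaction(isolation: :serializable) do
        v = Vote.first
        v.vote += 1
        v.save!
      end
    rescue ActiveRecord::StatementInvalid
      retries += 1
      raise if retries > 10                        # give up eventually
      # Grow the sleep with each retry, cap it, and randomize it so
      # contending processes don't fall into lockstep with each other.
      sleep(rand * [0.01 * (2 ** retries), 1.0].min)
      retry
    end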

When the transaction that needs to be serializable involves interaction with a source of concurrency other than the database, you may still have to use row-level locks to accomplish what you need. An example of this would be when a state machine transition determines what state to transition to based on a query to something other than the db, like a third-party API. In this case you need to lock the row representing the object with the state machine while the third party API is queried. You cannot nest transactions inside serializable transactions, so you would have to use object.lock! instead of with_lock.

Another thing to be aware of is that any objects fetched outside the transaction(isolation: :serializable) should have reload called on them before use inside the transaction.
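
Putting those last two points together, a rough sketch that locks the row while a third-party API is consulted; Order, PaymentGateway, and the column names here are made up purely for illustration:

    order = Order.find(order_id)    # loaded outside the transaction

    Order.transaction(isolation: :serializable) do
      # lock! re-reads the row FOR UPDATE, which also covers the advice
      # above about reloading objects fetched before the transaction.
      order.lock!
      remote_state = PaymentGateway.status_for(order.reference)  # hypothetical API
      order.update!(state: remote_state)
    end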

Michael Johnston

ActiveRecord always wraps save operations in a transaction.

For your simple case it might be best to just use a SQL update instead of performing logic in Ruby and then saving. Here is an example which adds a model method to do this:

    class Vote < ActiveRecord::Base
      def vote!
        # A single SQL UPDATE; the increment happens in the database.
        self.class.update_all("vote = vote + 1", :id => id)
      end
    end

This method avoids the need for locking in your example. If you need more general database locking, see David's suggestion.
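
As a side note, newer Rails versions no longer accept a conditions argument to update_all, so there the same method would be written with a where scope. Either way it is a single atomic SQL UPDATE with the arithmetic done in the database; roughly:

    class Vote < ActiveRecord::Base
      def vote!
        # One atomic UPDATE; no read-modify-write in Ruby.
        self.class.where(id: id).update_all("vote = vote + 1")
      end
    end

    Vote.first.vote!
    # UPDATE "votes" SET vote = vote + 1 WHERE "votes"."id" = 1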

Wizard of Ogz
  • Thanks I think that would work, however I only used that model as an example because the new fields don't exist yet. I'm actually going to modify User model and the balance, and don't want them to update their password or change name and such while the balance is being updated, plus there will be some complex math involved and lots of relationships. – Chloe Dec 29 '13 at 02:17
  • You are correct in what you were saying, but if you use transactions they work as a protective wrapper, ensuring that changes to the DB only occur when all the actions succeed together. – Deej Dec 29 '13 at 03:17
  • WARNING: transactions DO NOT magically protect against all concurrency issues. Depending on what you are doing specifically and what database/version you are using you will need to look at transaction(isolation: :serializable) and/or row locking. – Michael Johnston May 16 '14 at 05:18

You can do the following in your model:

    class Vote < ActiveRecord::Base
      validate :handle_conflict, on: :update

      attr_accessible :original_updated_at
      attr_writer :original_updated_at

      def original_updated_at
        @original_updated_at || updated_at
      end

      def handle_conflict
        # If we want to use this across multiple models,
        # extract it into a module.
        if @conflict || updated_at.to_f > original_updated_at.to_f
          @conflict = true
          @original_updated_at = nil
          # If two updates are made at the same time, a validation error
          # is displayed and the conflicting fields are listed.
          errors.add :base, 'This record changed while you were editing'
          changes.each do |attribute, values|
            errors.add attribute, "was #{values.first}"
          end
        end
      end
    end

original_updated_at is a virtual attribute. handle_conflict is fired when the record is updated and checks whether the updated_at value in the database is later than the hidden one submitted from your page. For that you should define the following in your app/views/votes/_form.html.erb:

    <%= f.hidden_field :original_updated_at %>

If there is a conflict, the validation error is raised.

And if you are using Rails 4 you won't have attr_accessible; instead you will need to add :original_updated_at to the permitted parameters (e.g. your vote_params method) in your controller.
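
A related built-in option is ActiveRecord's optimistic locking (ActiveRecord::Locking::Optimistic). With an integer lock_version column the version check becomes part of the UPDATE's WHERE clause, so if two processes (web or background) have both loaded a record and both try to save it, the second save raises ActiveRecord::StaleObjectError. For web forms, where the load and the save happen in different requests, you still need to round-trip lock_version as a hidden field, as the guide quoted in the comments notes. A minimal sketch:

    # Inside a migration: add the column ActiveRecord uses for optimistic locking.
    add_column :votes, :lock_version, :integer, default: 0, null: false

    # No model changes are needed; ActiveRecord picks up lock_version automatically.
    v1 = Vote.find(1)
    v2 = Vote.find(1)              # e.g. the same row loaded by another process

    v1.update!(vote: v1.vote + 1)  # succeeds and bumps lock_version
    v2.update!(vote: v2.vote + 1)  # raises ActiveRecord::StaleObjectError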

Hopefully this sheds some light.

Deej
  • This is very clever, but what if this is run in a background process, it is loaded, a 2nd process runs and updates the DB, this first process gets to `if updated_at > original_updated_at`? How would the first process know the DB updated_at field has been changed? – Chloe Dec 29 '13 at 02:44
  • Because should the 2nd process run and update the DB it will then update the value of the `original_updated_at` which **remember** is a hidden value. So what happens is every time a row is updated the value of `original_updated_at` changes. So that when an update is submitted whichever process doesn't match that value of `original_updated_at` attribute will be rejected. – Deej Dec 29 '13 at 03:04
  • But the 2nd process is in another Ruby executable VM, and even another machine. The 2nd process won't see the same `original_updated_at` hidden value which is only in its memory space. – Chloe Dec 29 '13 at 03:12
  • Hmm... I haven't dabbled with queuing to be honest. But hopefully this post and this answer in particular may help: https://groups.google.com/d/msg/pdxruby/e4z0VwzbbsQ/SMF66kvZN4wJ – Deej Dec 29 '13 at 03:26
  • WARNING: The above code does not actually protect against any concurrency issues. – Michael Johnston May 16 '14 at 03:38

For a simple +1:

    Vote.increment_counter :vote, Vote.first.id

Because the example uses vote for both the model/table and the column, the general form is:

    ModelName.increment_counter :column_name, id_of_the_row
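
increment_counter pushes the arithmetic into a single SQL UPDATE on the database side, so there is no read-modify-write window in Ruby; the generated SQL looks roughly like this:

    Vote.increment_counter :vote, 1
    # UPDATE "votes" SET "vote" = COALESCE("vote", 0) + 1 WHERE "votes"."id" = 1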

Jeremy