
Let's say I've got a Ruby class in my Rails project that is setting an instance variable.

class Something
  def self.objects
    @objects ||= begin
      # some logic that builds an array, which is ultimately stored in @objects
    end
  end
end

Is it possible that @objects could be set multiple times? Could this method be called by a second request while the first request is still executing the code between the begin/end above? I suppose this really comes down to how Rails server instances are forked.

Should I instead be using a Mutex or thread synchronization? e.g.:

class Something
  def self.objects
    return @objects if @objects

    Thread.exclusive do
      @objects ||= begin
        # some logic that builds an array, which is ultimately stored in @objects
      end
    end
  end
end
Matt Huggins

2 Answers


It's possible (and desirable) to run Rails in a multi-threaded mode even in MRI. This can be accomplished by changing a line in production.rb.

config.threadsafe!

In MRI, two threads cannot run code simultaneously, but a context switch can happen at any time. In Rubinius and JRuby, threads can run code simultaneously.

Let's look at the code you showed:

class Something
  def self.objects
    @objects ||= begin
      # some logic that builds an array, which is ultimately stored in @objects
    end
  end
end

The ||= code gets expanded to something like:

class Something
  def self.objects
    @objects || (@objects = begin
      # some logic that builds an array, which is ultimately stored in @objects
    end)
  end
end

This means that there are actually two steps to the process:

  1. look up @objects
  2. If @objects is falsy, set @objects to the results of the begin/end expression

It may be possible for the context to switch between these steps. It is certainly possible for the context to switch in the middle of step 2. This means that you may end up running the block multiple times instead of once. In MRI, this may be acceptable, but it's perfectly straightforward to lock a mutex around the expression, so do it.

class Something
  MUTEX = Mutex.new

  def self.objects
    MUTEX.synchronize do
      @objects ||= begin
        # some logic that builds an array, which is ultimately stored in @objects
      end
    end
  end
end
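As a rough illustration of the difference, here is a minimal sketch that calls both variants from several threads and counts how often the block runs. The class names UnsafeCache and SafeCache, the builds counter, and the sleep standing in for slow work are all assumptions made for the demo, not part of the original code.

```ruby
# Hypothetical demo classes; names, counter, and sleep are illustrative only.
class UnsafeCache
  @builds = 0

  def self.builds
    @builds
  end

  def self.objects
    @objects ||= begin
      @builds += 1   # count how many times the block actually runs
      sleep 0.05     # stand-in for slow work; gives other threads a chance to run
      [1, 2, 3]
    end
  end
end

class SafeCache
  MUTEX = Mutex.new
  @builds = 0

  def self.builds
    @builds
  end

  def self.objects
    # Only one thread at a time can check and build the cached value.
    MUTEX.synchronize do
      @objects ||= begin
        @builds += 1
        sleep 0.05
        [1, 2, 3]
      end
    end
  end
end

[UnsafeCache, SafeCache].each do |klass|
  threads = 5.times.map { Thread.new { klass.objects } }
  threads.each(&:join)
  puts "#{klass}: block ran #{klass.builds} time(s)"
end
```

The unsynchronized version will typically report more than one run, because each thread can see a nil @objects before the first one finishes. The mutex version runs the block exactly once.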
Yehuda Katz
  • would it be more efficient to do `def self.objects; @objects || MUTEX.synchronize { @objects ||= [...] }; end` - that way it doesn't try to get a lock if the var has already been set? also, could we use a generic `Thread.synchronize` block? – Seamus Abshere May 08 '12 at 13:11

I'll take a stab.

Rails is single-threaded. Successive requests to a Rails application are either queued or handled by separate application instances (read: processes). The value of the class instance variable @objects defined in your Something class exists within the scope of the process, not within the scope of any single request to your application.

Therefore a mutex would be unnecessary as you would never encounter the case where two processes are accessing the same resource because the memory spaces of the two processes are entirely separate.
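A small sketch of that process isolation, assuming a platform where fork is available (so not Windows): the child process populates the cache, and the parent's copy of the class is unaffected. The pipe and the stored string are only there to make the result observable.

```ruby
# Hypothetical illustration; assumes Kernel#fork is supported on this platform.
class Something
  def self.objects
    @objects ||= ["built in pid #{Process.pid}"]
  end
end

reader, writer = IO.pipe

child_pid = fork do
  reader.close
  # The child builds and caches @objects in its own memory space.
  writer.puts Something.objects.first
  writer.close
end

writer.close
child_report = reader.read.chomp
Process.wait(child_pid)

# The parent never called Something.objects, so its @objects is still unset.
parent_cached = Something.instance_variable_defined?(:@objects)

puts "child:  #{child_report}"
puts "parent cache populated? #{parent_cached}"
```

Each process ends up building its own copy of the cache, which is why no lock is needed between processes.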

I think this raises another question: is @objects intended to be a shared resource? If so, I think it needs to be implemented differently.

Disclaimer: I may be completely off the mark here, in fact I sort of hope I am so I can learn something today :)

Patrick Klingemann
  • I'm not certain either, but I'm rather sure you are correct. :) – Phrogz May 01 '12 at 04:42
  • I actually think you're pretty close to spot on. The caveat I find with your answer lies in which interpreter is in use. JRuby, for instance, can potentially have these issues. Because it does have real thread support, each process is not just a fork of the parent. So for MRI, I think you're right on. – Christopher WJ Rueber May 01 '12 at 04:43
  • I should have mentioned that I was mainly focused on MRI with my question. I was basically under the same impression as what you outlined in your answer, but I thought it was worth verifying. `@objects` is not intended to be a shared resource, but just a cache to prevent repeated lookup for a time-consuming process. Thanks! – Matt Huggins May 01 '12 at 11:11