2

I'm running some code (simplified, but still breaking version below), which fails in ~1/3000 executions waiting for the first switch. What should happen is:

  • threads[0] starts up and grabs the mutex
  • threads[0] notifies cond_main so the main thread can create thread[1]
  • thread[1] / thread[0] do some work waiting for each others' signals

Unfortunately it fails in thread[0] - the cond.wait ends with a timeout and raises an exception. How would I synchronise this, making sure that cond_main doesn't get signalled too early?

Ideally I'd like to pass a locked mutex from main thread and unlock it in the first spawned thread, but Ruby requires mutexes to be unlocked in the same thread - so this won't work.

Self-contained reproducer (doesn't make much sense on its own, but real work is stripped out):

def run_test
  mutex     = Mutex.new
  cond      = ConditionVariable.new
  cond_main = ConditionVariable.new
  threads   = []

  t1_done = false
  t2_done = false

  threads << Thread.new do
    mutex.synchronize do
      # this needs to happen first
      cond_main.signal
      cond.wait(mutex, 2)
      raise 'timeout waiting for switch' if !t2_done

      # some work
      t1_done = true
      cond.signal
    end
  end
  cond_main.wait(Mutex.new.lock, 2)

  threads << Thread.new do
    mutex.synchronize do
      cond.signal
      # some work
      t2_done = true
      cond.wait(mutex, 2)
      raise 'timeout waiting for switch' if !t1_done
    end
  end

  threads.map(&:join)
end

5000.times { |x|
  puts "Run #{x}"
  run_test
}

Tested on Ruby 2.5.3

viraptor
  • 33,322
  • 10
  • 107
  • 191

1 Answers1

1

Set a while block to stop waiting if second thread finished (see more here):

def run_test
  mutex     = Mutex.new
  cond      = ConditionVariable.new
  cond_main = ConditionVariable.new
  threads   = []

  spawned = false

  t1_done = false
  t2_done = false

  threads << Thread.new do
    mutex.synchronize do
      while(!spawned) do
        cond.wait(mutex, 2)
      end
      raise 'timeout waiting for switch' if !t2_done

      # some work
      t1_done = true
      cond.signal
    end
  end

  threads << Thread.new do
    mutex.synchronize do
      spawned = true
      cond.signal
      # some work
      t2_done = true
      cond.wait(mutex, 2)
      raise 'timeout waiting for switch' if !t1_done
    end
  end

  threads.map(&:join)
end

50000.times { |x| 
  puts x 
  run_test 
}

Alternatively, using a counting semaphore, we can assign some priorities to the threads:

require 'concurrent-ruby'

def run_test
  mutex     = Mutex.new
  sync      = Concurrent::Semaphore.new(0)
  cond      = ConditionVariable.new
  cond_main = ConditionVariable.new
  threads   = []

  t1_done = false
  t2_done = false

  threads << Thread.new do
    mutex.synchronize do
      sync.release(1)
      # this needs to happen first
      cond.wait(mutex, 2)
      raise 'timeout waiting for switch' if !t2_done

      # some work
      t1_done = true
      cond.signal
    end
  end

  threads << Thread.new do
    sync.acquire(1)
    mutex.synchronize do
      cond.signal
      # some work
      t2_done = true
      cond.wait(mutex, 2)
      raise 'timeout waiting for switch' if !t1_done
    end
  end

  threads.map(&:join)
end

50000.times { |x| 
  puts x 
  run_test 
}

I prefer the second solution as it allows you to control the order of your threads, though it feels a little dirtier.

As a curiosity, on Ruby 2.6, your code does not seem to raise exceptions (tested >10M runs).

Marcin Kołodziej
  • 5,253
  • 1
  • 10
  • 17