14

I'm running into issues with Sidekiq workers.

ActiveRecord::ConnectionTimeoutError: could not obtain a database connection within 5.000 seconds (waited 5.000 seconds)

I am following recommendations about using ActiveRecord::ConnectionTimeoutError and a suitably large connection pool.

I want to find out if I'm exhausting the connection pool. I'm logging size and connections.length from ActiveRecord::Base.connection_pool, but they stay at a constant size = 100 connections.length = 5. Which suggests that this isn't a resource leak issue.

My MySQL server is configured to allow up to 400 concurrent connections.

My Job ended up looking like this:

class MyJob < ActiveJob::Base
  queue_as :default    
  rescue_from StandardError do |exception|
    # clear connections on exception. Not sure if this is a good idea or not.
    ActiveRecord::Base.clear_active_connections!    
  end

  def perform()
    logger.info "size"
    logger.info ActiveRecord::Base.connection_pool.instance_eval { @size }
    logger.info  "connections"
    logger.info ActiveRecord::Base.connection_pool.instance_eval { @connections }.length

    # Ensure connections come from connection pool.
    ActiveRecord::Base.connection_pool.with_connection do |conn|
      # do stuff
    end
  end
end

Is this the right way to diagnose what's causing this, whether it's resource starvation or leakage? Are there other techniques I can use to work out why this is happening?

Joe
  • 46,419
  • 33
  • 155
  • 245

2 Answers2

13

Try ActiveRecord::ConnectionAdapters::ConnectionPool#stat

ActiveRecord::Base.connection_pool.stat 
# => { size: 15, connections: 1, busy: 1, dead: 0, idle: 0, waiting: 0, checkout_timeout: 5 }

From connection_adapters/abstract/connection_pool.rb, in activerecord 5.2.2.1.

Jared Beck
  • 16,796
  • 9
  • 72
  • 97
7

This ActiveRecord::ConnectionTimeoutError can in my opinion occur in only one scenario - when there are so many threads wanting to use DB connections that the pool is exhausted and even waiting for a free connection does not help (as learned from the source code).

In your case, it is weird. You use only 25 worker threads but the pool is set to 100 connections, so there is plenty of reserve. I still suspect that you must have threads spawning somewhere. Perhaps you do some threading in your jobs? Perhaps you use a gem that creates threads in your jobs?

Anyway, if you are able to reproduce the exception, I would suggest to catch it and obtain the listing of all threads at the moment it occurs, something like this:

begin
  # job stuff...      
rescue ActiveRecord::ConnectionTimeoutError
  puts "listing #{Thread.list.count} threads:"
  Thread.list.each_with_index do |t,i| 
    puts "---- thread #{i}: #{t.inspect}"
    puts t.backtrace.take(5)  
  end
end

I expect there will be 100 or more threads and you should see where exactly they are stuck from the backtrace.

Matouš Borák
  • 15,606
  • 1
  • 42
  • 53