9

I have a rails app (4.1.5) backed by a postgres database (9.2.13) using the pg gem (0.17.1), unicorn server (1.1.0) with 2 worker processes.

The rails app runs jobs using sidekiq (2.17.7)

At some point the postgres db went into recovery mode. The following error was thrown by multiple jobs:

PG::ConnectionBad:FATAL:  the database system is in recovery mode
FATAL:  the database system is in recovery mode

The database came back up but jobs continued to throw the following two errors:

PG::Error:invalid encoding name: utf8
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/postgresql_adapter.rb:908:in `set_client_encoding'
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/postgresql_adapter.rb:908:in `configure_connection'
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/postgresql_adapter.rb:603:in `reconnect!'
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract_adapter.rb:313:in `verify!'
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:453:in `block in checkout_and_verify'
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activesupport-4.1.5/lib/active_support/callbacks.rb:82:in `run_callbacks'

and

ActiveRecord::ConnectionTimeoutError:could not obtain a database connection within 5.000 seconds (waited 5.000 seconds)
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:190:in `block in wait_poll'
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:181:in `loop'
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:181:in `wait_poll'
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:136:in `block in poll'
/home/ruby/.rvm/rubies/ruby-2.2.2/lib/ruby/2.2.0/monitor.rb:211:in `mon_synchronize'
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:146:in `synchronize'
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:134:in `poll'

It looks to me like rails notices the connection isn't active and tries to reset the connection. In active record's postgresql_adapter.rb the following method is getting called:

  # Close then reopen the connection.
  def reconnect!
    super
    @connection.reset
    configure_connection
  end

my guess is that the connection.reset is not actually working, so when the pg gem goes to set the encoding (the first part of the configure_connection method) it masks the fact that there is no connection by throwing the encoding specific error.

The following is a method from the pg gem (.17.1) ext/pg_connection.c/2804

/*
* call-seq:


*    conn.set_client_encoding( encoding )
 *
 * Sets the client encoding to the _encoding_ String.
 */
static VALUE
pgconn_set_client_encoding(VALUE self, VALUE str)
{
    PGconn *conn = pg_get_pgconn( self );

    Check_Type(str, T_STRING);

    if ( (PQsetClientEncoding(conn, StringValuePtr(str))) == -1 ) {
        rb_raise(rb_ePGerror, "invalid encoding name: %s",StringValuePtr(str));
    }

    return Qnil;
}

So if those guesses are correct, why isn't the connection reset working?
Restarting the app re-establishes the connection to the database correctly, but I'm hoping for a non-manual solution to this problem.

Knut Knutson
  • 163
  • 2
  • 8
  • Did you tried upgrading `pg` gem? – Vasfed Dec 29 '15 at 15:51
  • you can increase the timeout time by `ActiveRecord::Base.connection.execute('set statement_timeout to 10000')` or in database.yml variables: statement_timeout: 1000000 – Rajarshi Das Dec 29 '15 at 15:53
  • @Vasfed I haven't tried upgrading yet. I was going to reserve that as more of a last ditch effort. As far as I can tell the implementation of reset hasn't changed in more recent versions. – Knut Knutson Dec 29 '15 at 16:07
  • 1
    i would recommend opening an issue on the rails issue tracker. especially if you can provide a minimal reproducible scenario. – phoet Dec 29 '15 at 17:38
  • 2
    It doesn't look like a problem with Rails. You should try to figure out why your database switched to recovery mode. – Hoa Jun 24 '16 at 08:36
  • I have the same error. Any news on this one? – Polopollo Jul 12 '16 at 20:47
  • 1
    If you are hosting the postgres server, look at: https://dba.stackexchange.com/questions/61650/postgres-is-stuck-in-recovery-mode. It's a response about postgres databses stucked in recovery mode. Hope it helps. – Alvaro Inckot May 17 '18 at 12:22
  • Seems like we could use some more info. Were you able to connect to the database outside of rails? There are plenty of folks who have a setup like yours - this seems like it probably is not an issue with rails or postgres - probably something else to do with your app or usage. Leaking connections? – kwerle Jun 18 '18 at 02:24
  • recovery mode means it's a PG issue. Try this : https://dba.stackexchange.com/questions/61650/postgres-is-stuck-in-recovery-mode – gdurelle Oct 05 '18 at 14:41
  • This looks like pg gem bug. We experience PostgreSQL disconnection problems a couple times a month and it always recover automatically. I suggest upgrading pg gem, I know this was suggested before but that was 4 years ago. – Fatih Apr 21 '19 at 15:49
  • According to this fix https://bitbucket.org/ged/ruby-pg/commits/a7f98864a7d6 it could be encoding or connection problem. Not the encoding only. – zawhtut Jun 06 '19 at 07:45

0 Answers0