
I'm running a Django project with Peewee in Python 3.6 and trying to track down what's wrong with the connection pooling. I keep getting the following error on the development server (for some reason I never experience this issue on my local machine):

Lost connection to MySQL server during query

The repro steps are reliable:

  1. Restart Apache on the instance.
  2. Go to my Django page and press a button which triggers a DB operation.
  3. Works fine.
  4. Wait exactly 10 minutes (I've tested enough to get the exact number).
  5. Press another button to trigger another DB operation.
  6. Get the lost connection error above.

The code is structured such that I have all the DB operations inside an independent Python module which is imported into the Django module.

In the main class constructor I'm setting up the DB as such:

from peewee import Proxy
from playhouse.pool import PooledMySQLDatabase

db_proxy = Proxy()  # module-level proxy that the models are bound to

def __init__(self, host, database, user, password, stale_timeout=300):
    self.mysql_db = PooledMySQLDatabase(host=host, database=database, user=user,
                                        password=password, stale_timeout=stale_timeout)
    db_proxy.initialize(self.mysql_db)

Every method that needs to talk to the DB does so like this:

def get_user_by_id(self, user_id):
    db_proxy.connect(reuse_if_open=True)
    user = User.get(User.user_id == user_id)
    db_proxy.close()
    return {'id': user.user_id, 'first_name': user.first_name, 'last_name': user.last_name, 'email': user.email}

I checked the wait_timeout variable on the MySQL instance and it's set to 3600, so that doesn't seem to be the issue (I tried changing it anyway just to see).

Any ideas on what I could be doing wrong here?

Update:

I found that the /etc/my.cnf configuration file for MySQL has wait_timeout set to 600, which matches what I'm experiencing. I don't know why this value doesn't show up when I run `SHOW VARIABLES LIKE 'wait_timeout';` on the MySQL DB (that returns 3600), but it does seem likely the issue is coming from the wait timeout.
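For reference, this is roughly how I'm checking the variable from Python through the pooled database (just a sketch using Peewee's execute_sql; I'm comparing global vs. session scope on a hunch that that's where the discrepancy comes from):

# self.mysql_db is the PooledMySQLDatabase from the constructor above
cursor = self.mysql_db.execute_sql("SHOW GLOBAL VARIABLES LIKE 'wait_timeout'")
print(cursor.fetchone())  # value coming from the server config
cursor = self.mysql_db.execute_sql("SHOW SESSION VARIABLES LIKE 'wait_timeout'")
print(cursor.fetchone())  # value for this connection's session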

Given this, I tried setting the stale timeout to 60, assuming that if it's less than the wait timeout it might fix the issue, but it didn't make a difference.

Nick Gotch

2 Answers


You need to be sure you're recycling the connections properly -- that means that when a request begins you open a connection, and when the response is delivered you close the connection. The pool is most likely not recycling the connection because you're never putting it back in the pool, so it looks like it's still "in use". This can easily be done with middleware and is described here:

http://docs.peewee-orm.com/en/latest/peewee/database.html#django
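The per-request hook boils down to something like this (a rough sketch, not a copy from the docs; `database` here stands in for your pooled database or proxy):

def peewee_connection_middleware(get_response):
    def middleware(request):
        database.connect(reuse_if_open=True)  # check a connection out of the pool
        try:
            return get_response(request)
        finally:
            if not database.is_closed():
                database.close()  # hand the connection back to the pool
    return middleware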

coleifer
  • Thanks. So the good news is this cleaned up the code, since it moved the connect/close out of the methods and into a single place (the middleware). That's helpful, but it didn't seem to fix the issue. Could it be that I'm using `peewee.Proxy` for the `PooledMySQLDatabase`? – Nick Gotch Nov 21 '18 at 17:06
  • Removed the proxy and still getting the issue so that wasn't it either. – Nick Gotch Nov 21 '18 at 19:41
  • Try setting a stale_timeout when instantiating the database object, e.g. 60 or 300. – coleifer Nov 22 '18 at 04:52
  • I tried this but unfortunately continued to see the error. I finally found a solution which worked and posted it here. Thank you @coleifer though, your advice was helpful. – Nick Gotch Nov 29 '18 at 16:01

I finally came up with a fix which works for my case, after trying numerous ideas. It's not ideal, but it works. This post on connection pooling pointed me in the right direction.

I created a Django middleware class and configured it to be the first in the list of Django middleware.

from peewee import OperationalError
from playhouse.pool import PooledMySQLDatabase

database = PooledMySQLDatabase(None)

class PeeweeConnectionMiddleware(object):

    # MySQL client error codes: 2006 = "server has gone away", 2013 = "lost connection during query"
    CONN_FAILURE_CODES = [2006, 2013]

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        if database.database: # Is DB initialized?
            response = None
            try:
                database.connect(reuse_if_open=True)
                with database.atomic() as transaction:
                    try:
                        response = self.get_response(request)
                    except:
                        transaction.rollback()
                        raise
            except OperationalError as exception:
                if exception.args[0] in self.CONN_FAILURE_CODES:
                    # The connection was dropped: discard the pooled connections,
                    # grab a fresh one, and retry the request once.
                    database.close_all()
                    database.connect()
                    response = None
                    with database.atomic() as transaction:
                        try:
                            response = self.get_response(request)
                        except:
                            transaction.rollback()
                            raise
                else:
                    raise
            finally:
                if not database.is_closed():
                    database.close()
            return response
        else:
            return self.get_response(request)
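To wire it up, the class goes at the top of MIDDLEWARE in settings.py, roughly like this (the dotted path is just a placeholder for wherever the middleware module lives in your project):

# settings.py
MIDDLEWARE = [
    'yourproject.middleware.PeeweeConnectionMiddleware',  # first, so the DB handling wraps everything else
    # ... Django's default middleware ...
]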
Nick Gotch
  • For any future people who run across this answer: after 1.5 years of running this code we only just realized the innermost try-except rollback isn't actually happening. `self.get_response(request)` returns a `response` with a `status_code` of 500 but **doesn't** raise any inner exception. The `OperationalError` path does work, but not the rollbacks. If you want the rollbacks to work, you need to check for `response.status_code == 500` and handle it that way. – Nick Gotch Jun 09 '20 at 23:21