
I'm getting the following error (which I assume is because of the forking in my application), "This result object does not return rows".

Traceback
---------
File "/opt/miniconda/envs/analytical-engine/lib/python2.7/site-packages/dask/async.py", line 263, in execute_task
result = _execute_task(task, data)
File "/opt/miniconda/envs/analytical-engine/lib/python2.7/site-packages/dask/async.py", line 245, in _execute_task
return func(*args2)
File "/opt/miniconda/envs/analytical-engine/lib/python2.7/site-packages/smg/analytics/services/impact_analysis.py", line 140, in _do_impact_analysis_mp
 Correlation.user_id.in_(user_ids)).all())
File "/opt/miniconda/envs/analytical-engine/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2241, in all
return list(self)
File "/opt/miniconda/envs/analytical-engine/lib/python2.7/site-packages/sqlalchemy/orm/loading.py", line 65, in instances
fetch = cursor.fetchall()
File "/opt/miniconda/envs/analytical-engine/lib/python2.7/site-packages/sqlalchemy/engine/result.py", line 752, in fetchall
self.cursor, self.context)
File "/opt/miniconda/envs/analytical-engine/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1027, in _handle_dbapi_exception
util.reraise(*exc_info)
File "/opt/miniconda/envs/analytical-engine/lib/python2.7/site-packages/sqlalchemy/engine/result.py", line 746, in fetchall
l = self.process_rows(self._fetchall_impl())
File "/opt/miniconda/envs/analytical-engine/lib/python2.7/site-packages/sqlalchemy/engine/result.py", line 715, in _fetchall_impl
self._non_result()
File "/opt/miniconda/envs/analytical-engine/lib/python2.7/site-packages/sqlalchemy/engine/result.py", line 720, in _non_result
"This result object does not return rows. "

I'm using dask and its multiprocessing scheduler (which uses multiprocessing.Pool). As I understand it (based on the documentation), sessions created from a scoped session object (created via scoped_session()) are thread-safe because they are thread-local. That leads me to believe that when I call Session() (or use the Session proxy) I get a session object that exists only in, and is only accessible from, the thread it was called from. This seems pretty straightforward.
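The thread-local behaviour I mean can be shown in isolation. The following is only a sketch using an in-memory SQLite engine as a stand-in, not code from my application:

# Stand-alone sketch of the thread-local behaviour described above,
# using an in-memory SQLite engine as a stand-in (not my application code).
import threading

from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

demo_engine = create_engine('sqlite://')
DemoSession = scoped_session(sessionmaker(bind=demo_engine))

main_session = DemoSession()
print(DemoSession() is main_session)        # True: same thread, same session

def in_other_thread():
    # a different thread gets its own session from the registry
    print(DemoSession() is main_session)    # False
    DemoSession.remove()

t = threading.Thread(target=in_other_thread)
t.start()
t.join()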

What I'm confused about is why I'm having issues when the process is forked. I understand that you can't re-use an engine across processes, so I've followed the event-based solution from the docs verbatim and done this:

import os

from sqlalchemy import create_engine, event, exc
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import scoped_session, sessionmaker

# Config is our application's own configuration helper (not shown here).


class _DB(object):

    _engine = None

    @classmethod
    def _get_engine(cls, force_new=False):
        if cls._engine is None or force_new is True:
            cfg = Config.get_config()
            user = cfg['USER']
            host = cfg['HOST']
            password = cfg['PASSWORD']
            database = cfg['DATABASE']
            engine = create_engine(
                'mysql://{}:{}@{}/{}?local_infile=1&'
                'unix_socket=/var/run/mysqld/mysqld.sock'.
                    format(user, password, host, database),
                pool_size=5, pool_recycle=3600)
            cls._engine = engine
        return cls._engine



# From the docs, handles multiprocessing
@event.listens_for(_DB._get_engine(), "connect")
def connect(dbapi_connection, connection_record):
    connection_record.info['pid'] = os.getpid()

# From the docs, handles multiprocessing
@event.listens_for(_DB._get_engine(), "checkout")
def checkout(dbapi_connection, connection_record, connection_proxy):
    pid = os.getpid()
    if connection_record.info['pid'] != pid:
        connection_record.connection = connection_proxy.connection = None
        raise exc.DisconnectionError(
            "Connection record belongs to pid %s, "
            "attempting to check out in pid %s" %
            (connection_record.info['pid'], pid)
        )


# The following is how I create the scoped session object.

Session = scoped_session(sessionmaker(
    bind=_DB._get_engine(), autocommit=False, autoflush=False))

Base = declarative_base()
Base.query = Session.query_property()

So my assumptions (based on the docs) are the following:

  1. A session obtained from a scoped session object must always be thread-local (which in my case means local to the main thread of the child process). Although it's not in the docs, I imagine this should hold even if the scoped session object was created in another process.

  2. The thread-local session will get a connection from the pool via the engine; if the connection was not created within the current process, the pool will create a new one (based on the connect() and checkout() implementations above).

If both of these things were true, then everything should "just work" (AFAICT). That's not the case though.

I managed to get it to work by creating a new scoped session object in each new process and using that for all subsequent session use.

BTW, the Base.query attribute needed to be re-bound to this new scoped session object as well.
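Roughly, what I ended up doing in each worker process looks like the sketch below (the helper name is only illustrative, not my actual code):

# Rough sketch of the workaround (illustrative helper name, not the real code):
# each forked process builds its own scoped session and rebinds Base.query
# before running any queries.
def _init_session_for_process():
    global Session
    Session = scoped_session(sessionmaker(
        bind=_DB._get_engine(), autocommit=False, autoflush=False))
    Base.query = Session.query_property()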

I imagine that my #1 assumption above is incorrect. Can anyone help me understand why I need to create a new scoped session object in each process?

Cheers.

  • Can you post a minimal example, including the forking code, as well as the full stack trace? I have a suspicion that the connection pool has connected to the DB prior to the fork, leading to both processes sharing the socket. – univerio Feb 09 '17 at 06:55
  • I will add some example code. The pool would have definitely already connected before any forking is done, but then child processes using the pool are handled by checking the pid of the calling code before using it or creating a new one (as per the above `checkout` method). Or at least that's the intent AIUI. – Jerry Londergaard Feb 09 '17 at 12:23
  • Rather than using `dask.multiprocessing.get` you might create a single node distributed scheduler. This will pre-fork from cleaner processes and is generally a cleaner experience: http://dask.pydata.org/en/latest/scheduler-choice.html – MRocklin Feb 09 '17 at 14:06
  • Thanks for the suggestion @MRocklin. I have used the local scheduler before, but not for this specific block of code. However, after trying it (instead of `dask.multiprocessing.get`) in this particular instance, everything seems to just work. I haven't been through the distributed source code, so I can only guess as to why it works now and not with `dask.multiprocessing.get` – Jerry Londergaard Feb 09 '17 at 23:32
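For reference, switching to the single-machine distributed scheduler suggested above is roughly the following (a sketch assuming the `distributed` package is installed):

# Sketch of the suggestion above (assumes the `distributed` package is installed).
# Client() with no arguments starts a local scheduler plus worker processes,
# which dask then uses by default instead of dask.multiprocessing.get.
from dask.distributed import Client

client = Client()   # workers start as separate processes before any queries run
# subsequent .compute() calls are executed on these worker processes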

1 Answer


It is not clear when your fork happens, but the most common issue is that the engine is created before the fork, which initializes TCP connections to the database for your pool_size=5 pool; those connections then get copied over to the new processes, and you end up with multiple processes interacting with the same physical sockets => trouble.

Options are to:

  • Disable the pool and use on-demand connections: poolclass=NullPool
  • Re-create the pool after the fork: sqla_engine.dispose()
  • Delay the create_engine call until after the fork (each option is sketched below)
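A minimal sketch of each option; the URL is a placeholder and the function names are only illustrative:

from sqlalchemy import create_engine
from sqlalchemy.pool import NullPool

db_url = 'mysql://user:password@host/database'   # placeholder URL

# Option 1: disable pooling; every checkout opens a fresh DBAPI connection
engine = create_engine(db_url, poolclass=NullPool)

# Option 2: keep the pool, but discard any connections inherited from the
# parent right after the fork; the child reconnects lazily on first use
def after_fork(sqla_engine):
    sqla_engine.dispose()

# Option 3: don't call create_engine until code is already running in the child
def child_main():
    child_engine = create_engine(db_url, pool_size=5, pool_recycle=3600)
    # ... build sessions and run queries with child_engine here ...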
– TheArchitect
  • It's definitely being forked after the engine is created, but AIUI that's the whole point of the custom `checkout()` method: to handle multiple processes using the one pool. I suspect it has something to do with the way dask is forking the process. – Jerry Londergaard Feb 16 '17 at 01:35