I recently wrote a client/server application using python-socketio with aiohttp. The server side is built on async namespaces, and my on_message event handlers contain many await calls, so I need async locks to keep the flow I want. To get this behavior I wrote a decorator and wrapped every critical-section function with it:
    @async_synchronized('_async_mutex')
    async def on_connect(self, sid, environ):
        self._logger.info("client with sid: {} connected to namespace: {}!".format(
            sid, __class__.__name__))
        important_member = 1
        await other_class.cool_coroutine()
        important_member = 2
In my constructor I initialize self._async_mutex = asyncio.Lock().
The decorator:
    from functools import wraps

    def async_synchronized(tlockname):
        """A decorator to place an instance-based lock around a method."""
        def _synched(func):
            @wraps(func)
            async def _synchronizer(self, *args, **kwargs):
                # Look up the lock on the instance by attribute name
                tlock = self.__getattribute__(tlockname)
                try:
                    async with tlock:
                        return await func(self, *args, **kwargs)
                finally:
                    pass
            return _synchronizer
        return _synched
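For reference, here is a minimal self-contained sketch of how I expect the decorator to serialize calls. The Counter class, its bump method, and the events list are hypothetical stand-ins, not part of my application, and the decorator is a trimmed copy of the one above:

```python
import asyncio
from functools import wraps


def async_synchronized(tlockname):
    """Place an instance-based asyncio lock around a coroutine method."""
    def _synched(func):
        @wraps(func)
        async def _synchronizer(self, *args, **kwargs):
            tlock = getattr(self, tlockname)
            async with tlock:
                return await func(self, *args, **kwargs)
        return _synchronizer
    return _synched


class Counter:  # hypothetical stand-in for a namespace class
    def __init__(self):
        self._async_mutex = asyncio.Lock()
        self.events = []

    @async_synchronized('_async_mutex')
    async def bump(self, name):
        self.events.append(name + ":start")
        # Yield to the event loop while still inside the critical section
        await asyncio.sleep(0)
        self.events.append(name + ":end")


async def main():
    counter = Counter()
    # Both calls run concurrently, but the lock keeps their bodies from interleaving
    await asyncio.gather(counter.bump("a"), counter.bump("b"))
    return counter.events


events = asyncio.run(main())
print(events)
```

Without the decorator, the two start entries would appear before either end entry; with it, each start is followed directly by its matching end.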
Everything works fine in normal use: closing and reopening the client triggers the handlers correctly and the locks behave as expected. Note that my on_disconnect handler is wrapped with the exact same decorator and lock. The problem appears when a client's network adapter is physically disconnected (a normal client shutdown works fine). In that case on_disconnect is called, but another coroutine is holding the lock at that moment. For some reason the event is triggered multiple times and eventually the server deadlocks.
I added prints around the decorator that report the lock's status and the calling function, and also put a try/except around every async call. It seems that all of my coroutines catch a cancellation exception (asyncio.CancelledError, which I presume is raised by aiohttp), so a method that held the lock was cancelled and the lock is never released. I tried wrapping every async call in asyncio.shield(), but the behavior didn't change.
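To check the lock itself in isolation, a minimal standalone sketch like the following can be used. It involves neither python-socketio nor aiohttp, and the names holder, started, held_before and held_after are hypothetical; it only shows what a plain asyncio.Lock does when the coroutine holding it is cancelled:

```python
import asyncio


async def holder(lock, started):
    # Hypothetical critical section that is cancelled while awaiting
    async with lock:
        started.set()
        await asyncio.sleep(3600)


async def main():
    lock = asyncio.Lock()
    started = asyncio.Event()
    task = asyncio.create_task(holder(lock, started))
    await started.wait()  # the task now holds the lock
    held_before = lock.locked()
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    # Report the lock state before and after the cancellation
    return held_before, lock.locked()


held_before, held_after = asyncio.run(main())
print(held_before, held_after)
```

In plain asyncio, an async with block releases the lock when its body is cancelled, so a lock that stays held in the real application would presumably point at something other than cancellation alone, for example a coroutine that is still blocked inside the critical section.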
Is there a different approach to async locks that I should take here? Removing the locks entirely fixes the problem, but may cause undefined behavior in the computational part of the application.
More code samples: The actual on_connect and on_disconnect events:
    @async_synchronized('_async_mutex')
    async def on_connect(self, sid, environ):
        self._logger.info("very good log message")
        self._connected_clients_count += 1

    @async_synchronized('_async_mutex')
    async def on_disconnect(self, sid):
        self._logger.info("very good disconnect message")
        self._connected_clients_count -= 1
        # this method is wrapped with the same decorator but with a different lock
        await self._another_namespace_class.inform_client_disconnect(sid)
Note: the other namespace does not have the same client connected to it. Also, when a network disconnect occurs, the log messages don't appear either (I've set the log level to debug).