
I'm getting an `aiohttp.client_exceptions.ServerDisconnectedError` whenever I make more than ~200 requests to an API using asyncio & aiohttp. It doesn't seem to be my code, because it works consistently with a smaller number of requests but fails on any larger number. I'm trying to understand whether this error is related to aiohttp, to my code, or to the API endpoint itself. There doesn't seem to be much info online about this.

    Traceback (most recent call last):
      File "C:/usr/PycharmProjects/api_framework/api_framework.py", line 27, in <module>
        stuff = abc.do_stuff_2()
      File "C:\usr\PycharmProjects\api_framework\api\abc\abc.py", line 72, in do_stuff
        self.queue_manager(self.do_stuff(json_data))
      File "C:\usr\PycharmProjects\api_framework\api\abc\abc.py", line 115, in queue_manager
        loop.run_until_complete(future)
      File "C:\Python36x64\lib\asyncio\base_events.py", line 466, in run_until_complete
        return future.result()
      File "C:\usr\PycharmProjects\api_framework\api\abc\abc.py", line 96, in do_stuff
        result = await asyncio.gather(*tasks)
      File "C:\usr\PycharmProjects\api_framework\api\abc\abc.py", line 140, in async_post
        async with session.post(self.api_attr.api_endpoint + resource, headers=self.headers, data=data) as response:
      File "C:\Python36x64\lib\site-packages\aiohttp\client.py", line 843, in __aenter__
        self._resp = await self._coro
      File "C:\Python36x64\lib\site-packages\aiohttp\client.py", line 387, in _request
        await resp.start(conn)
      File "C:\Python36x64\lib\site-packages\aiohttp\client_reqrep.py", line 748, in start
        message, payload = await self._protocol.read()
      File "C:\Python36x64\lib\site-packages\aiohttp\streams.py", line 533, in read
        await self._waiter
    aiohttp.client_exceptions.ServerDisconnectedError: None

Here's some of the code that generates the async requests:

    def some_other_method(self):
        self.queue_manager(self.do_stuff(all_the_tasks))

    def queue_manager(self, method):
        print('starting event queue')
        loop = asyncio.get_event_loop()
        future = asyncio.ensure_future(method)
        loop.run_until_complete(future)
        loop.close()

    async def async_post(self, resource, session, data):
        async with session.post(self.api_attr.api_endpoint + resource, headers=self.headers, data=data) as response:
            resp = await response.read()
        return resp

    async def do_stuff(self, data):
        print('queueing tasks')

        tasks = []
        async with aiohttp.ClientSession() as session:
            for row in data:
                task = asyncio.ensure_future(self.async_post('my_api_endpoint', session, row))
                tasks.append(task)
            result = await asyncio.gather(*tasks)
            self.load_results(result)

Once the tasks have completed, the `self.load_results()` method just parses the JSON and updates the DB.

hyphen
  • Is that more than 200 requests one after the other, or in parallel? – user4815162342 Jul 09 '18 at 17:30
  • @user4815162342 - just added some code that will hopefully answer your question. Unless I'm doing something wrong, it should be happening in parallel. – hyphen Jul 10 '18 at 12:05
  • Maybe the server simply can't handle such a large number of parallel requests? I'm by no means an aiohttp expert, but I wouldn't be surprised if `ServerDisconnectedError` meant exactly what it looks like. – user4815162342 Jul 10 '18 at 12:31
  • @user4815162342 - just got a response from the developer and they said they're having an issue with user authentication which is causing these errors. Hopefully they'll get it resolved. – hyphen Jul 10 '18 at 12:45
  • In that case, asyncio is working exactly as it should - raising a business exception that corresponds to the communication error. – user4815162342 Jul 10 '18 at 13:05

3 Answers


This is most likely caused by the configuration of the HTTP server. There are at least two possible reasons for the `ServerDisconnectedError`:

  1. The server could limit the number of parallel TCP connections that can be made from a single IP address. By default, aiohttp already limits the number of parallel connections to 100. You can try reducing the limit and see if it solves the issue. To do so, you can create a custom TCPConnector with a different limit value and pass it to the ClientSession:
        connector = aiohttp.TCPConnector(limit=50)
        async with aiohttp.ClientSession(connector=connector) as session:
            # Use your session as usual here
  2. The server could limit the duration of a TCP connection. By default, aiohttp uses HTTP keep-alive so that the same TCP connection can be used for multiple requests. This improves performance, since a new TCP connection does not have to be made for each request. However, some servers limit how long a TCP connection may live, and if you use the same connection for many requests, the server can close it before you are done with it. You can disable HTTP keep-alive as a workaround. To do so, you can create a custom TCPConnector with the parameter force_close set to True, and pass it to the ClientSession:
        connector = aiohttp.TCPConnector(force_close=True)
        async with aiohttp.ClientSession(connector=connector) as session:
            # Use your session as usual here
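
A middle ground between these two options is the `keepalive_timeout` parameter of `TCPConnector`: connection reuse stays enabled, but idle connections are dropped before the server gets a chance to close them on its end. A sketch with illustrative values (note that `keepalive_timeout` cannot be combined with `force_close=True`):

    import aiohttp

    # Cap parallel connections, and close any connection that has been idle
    # for 10 seconds, so that a connection the server has already dropped is
    # never reused.
    connector = aiohttp.TCPConnector(limit=50, keepalive_timeout=10)
    async with aiohttp.ClientSession(connector=connector) as session:
        # Use your session as usual here
        ...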

I had the same issue and disabling HTTP keep-alive was the solution for me. Hope this helps.

Antoine Hébert
  • This answer also helped me, but can you explain the relationship between the Semaphore and the TCPConnector limit? If I have a Semaphore(20) and TCPConnector(limit=10) vs if I have a Semaphore(10) and TCPConnector(limit=20)? I'm confused about their interplay. – j7skov Jan 20 '23 at 17:57

This is most likely the server's API not being happy with so many requests being made concurrently. You can limit the number of concurrent calls with asyncio's semaphores.

In your case I would acquire it as a context manager inside each request coroutine, so that only a fixed number of posts is in flight at any one time:

    async def async_post(self, resource, session, semaphore, data):
        # Each request waits here until one of the semaphore's slots is free.
        async with semaphore:
            async with session.post(self.api_attr.api_endpoint + resource, headers=self.headers, data=data) as response:
                return await response.read()

    async def do_stuff(self, data):
        print('queueing tasks')

        tasks = []
        semaphore = asyncio.Semaphore(200)

        async with aiohttp.ClientSession() as session:
            for row in data:
                task = asyncio.ensure_future(self.async_post('my_api_endpoint', session, semaphore, row))
                tasks.append(task)
            result = await asyncio.gather(*tasks)
            self.load_results(result)
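
Note that the semaphore and the TCPConnector limit from the other answer are independent caps: the semaphore bounds how many request coroutines may be in flight at once, while `TCPConnector(limit=...)` bounds open TCP connections, so the effective concurrency is the smaller of the two values. A quick sketch (the numbers are arbitrary):

    import asyncio
    import aiohttp

    # 20 coroutines may pass the semaphore, but only 10 TCP connections can
    # be open at once, so at most 10 requests are actually in flight.
    semaphore = asyncio.Semaphore(20)
    connector = aiohttp.TCPConnector(limit=10)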
grafuls
  • `asyncio.Semaphore` is now deprecated – Quanta Sep 17 '20 at 09:44
  • @Quanta `asyncio.Semaphore` is not deprecated. The `loop` parameter is deprecated because that param is being removed across the board in `aiohttp`. See https://docs.python.org/3/library/asyncio-sync.html#asyncio.Semaphore – Brian Aug 04 '21 at 20:09

I think it's quite possible the other answers are correct, but there's also one more possibility - it seems aiohttp has at least one currently [June 2021] unfixed race condition in its streams code:

https://github.com/aio-libs/aiohttp/issues/4581

I see the same issue in my project, and it's rare enough that it feels more like a race condition (the server disconnect isn't the only symptom; I sometimes get "payload not complete"). I also saw issues like aiohttp putting a packet of data from one response into a different response.

In the end, I switched to https://www.python-httpx.org - this decreased the number of problems, and eventually led me to work out that some of the 'payload not complete' errors were probably caused by a server-side timeout on sending large binary responses that was occasionally triggering. In general I found httpx to be more reliable, and it's really good that you can use the same package/APIs to support both sync and async.
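
For anyone trying the same switch, here is a minimal async sketch with httpx (the URL, headers and payload are placeholders; in a real port you would create the client once and reuse it across requests, just as with a ClientSession):

    import httpx

    async def async_post(url, headers, data):
        # httpx.AsyncClient plays the same role as aiohttp.ClientSession.
        async with httpx.AsyncClient(timeout=30.0) as client:
            response = await client.post(url, headers=headers, content=data)
            response.raise_for_status()
            return response.content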

JosephH
  • I am also getting that sometimes, but I think I fixed it with the right TCPConnector(limit=x) and Semaphore(y) tweaking... potentially varies per server I imagine. – j7skov Jan 20 '23 at 17:59