
I cannot understand why the aiohttp (and asyncio in general) server implementation does not provide a way to limit the maximum number of concurrent connections (number of accepted sockets, or number of running request handlers) (https://github.com/aio-libs/aiohttp/issues/675). Without this limit, it is easy to run out of memory and/or file descriptors.

At the same time, the aiohttp client by default limits the number of concurrent requests to 100 (https://docs.aiohttp.org/en/stable/client_advanced.html#limiting-connection-pool-size), aiojobs limits the number of running tasks and the size of the pending-tasks list, nginx has a worker_connections limit, and any sync framework is limited by its number of worker threads by design.

While aiohttp can handle a lot of concurrent requests, this number is still limited. The aiojobs docs say: "The Scheduler has implied limit for amount of concurrent jobs (100 by default). ... It prevents a program over-flooding by running a billion of jobs at the same time". And still, we can happily spawn a "billion" (well, until we run out of resources) aiohttp handlers.

So the question is, why is it implemented this way? Am I missing some important detail? I think we could pause request handlers using a semaphore, but the socket would still be accepted by aiohttp and a coroutine spawned, in contrast with nginx. Also, when deploying behind nginx, the number of worker_connections and the limit desired for aiohttp will certainly differ (because nginx may also serve static files).
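The semaphore idiom mentioned above can be sketched with plain asyncio (all names here are illustrative, not aiohttp API). The key point is that every coroutine is still spawned immediately; handlers over the limit merely wait instead of being refused:

```python
import asyncio

MAX_CONCURRENT = 3  # hypothetical limit


async def handler(sem, request_id, log):
    # Excess handlers queue on the semaphore; none are dropped, and the
    # "socket" (here, the coroutine) has already been accepted and created.
    async with sem:
        await asyncio.sleep(0.01)  # simulate request work
        log.append(request_id)


async def main():
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    log = []
    # ten "requests" arrive at once; ten coroutines exist immediately,
    # but at most MAX_CONCURRENT run their bodies concurrently
    await asyncio.gather(*(handler(sem, i, log) for i in range(10)))
    return log


print(asyncio.run(main()))
```

All ten requests eventually complete; the semaphore only throttles, it does not protect against a flood of accepted sockets piling up in memory.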

Alexandr Tatarinov

2 Answers


Based on the developers' comments on the linked issue, the reasons for this choice are the following:

  • The application can return a 4xx or 5xx response if it detects that the number of connections is larger than what it can reasonably handle. (This differs from the Semaphore idiom, which would effectively queue the connection.)

  • Throttling the number of server connections is more complicated than just specifying a number, because the limit might well depend on what your coroutines are doing, i.e. it should at least be path-based. Andrew Svetlov links to NGINX documentation about connection limiting to support this.

  • It is anyway recommended to put aiohttp behind a specialized front server such as NGINX.
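For reference, connection limiting in an NGINX front server typically uses the limit_conn directive; a minimal sketch, where the zone name, the per-client limit, and the upstream address are all placeholder assumptions:

```nginx
# per-client-address connection counting zone (name and size are arbitrary)
limit_conn_zone $binary_remote_addr zone=peraddr:10m;

server {
    listen 80;

    location / {
        limit_conn peraddr 10;              # at most 10 connections per client
        proxy_pass http://127.0.0.1:8080;   # the aiohttp application
    }
}
```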

More detail than this can only be provided by the developer(s), who have been known to read this tag.

At this point, it appears that the recommended solution is to either use a reverse proxy for limiting, or an application-based limit like this decorator (untested):

import aiohttp.web

REQUEST_LIMIT = 100

def throttle_handle(real_handle):
    _nrequests = 0  # handlers currently in flight
    async def handle(request):
        nonlocal _nrequests
        if _nrequests >= REQUEST_LIMIT:
            # reject, rather than queue, requests over the limit
            return aiohttp.web.Response(
                status=429, text="Too many connections")
        _nrequests += 1
        try:
            return await real_handle(request)
        finally:
            _nrequests -= 1
    return handle

@throttle_handle
async def handle(request):
    ... your handler here ...
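The counter pattern in the decorator can be exercised in isolation with a stdlib-only sketch, using plain strings as stand-ins for the HTTP responses (the handler and limit below are illustrative, not aiohttp API):

```python
import asyncio

REQUEST_LIMIT = 3  # small limit so the rejection is visible


def throttle(real_handle):
    nrequests = 0
    async def handle(request):
        nonlocal nrequests
        if nrequests >= REQUEST_LIMIT:
            return "429 Too Many Requests"  # stand-in for the aiohttp response
        nrequests += 1
        try:
            return await real_handle(request)
        finally:
            nrequests -= 1
    return handle


@throttle
async def slow_handler(request):
    await asyncio.sleep(0.01)  # simulate request work
    return "200 OK"


async def main():
    # ten simultaneous "requests": those over the limit are rejected
    # immediately instead of queuing, unlike the semaphore idiom
    return await asyncio.gather(*(slow_handler(i) for i in range(10)))


results = asyncio.run(main())
print(results.count("200 OK"), results.count("429 Too Many Requests"))
```

Note the contrast with a semaphore: over-limit requests get an immediate error response rather than waiting their turn.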
user4815162342
  • Thank you for your answer! Sure, I have read Svetlov's points, but while I agree that throttling rules can get complicated for an application, we still need some protection against overflooding. If it is not an issue, why then do other libraries (some of which are developed by the same author(s)) provide limit capabilities (just a number)? – Alexandr Tatarinov May 27 '19 at 07:24
  • @AlexandrTatarinov That's a good question. I suppose one could say that for aiohttp client the session object is under control of the application, so the application that needed different limits for different uses could just use different sessions. Also, AFAIK the client connection pool limit is queuing the requests over the limit rather than dropping them, as is requested for the server, so it's "safer" in a way. But these are just guesses, I'm not a developer on the project - and I can't tell if there's a similar logic for aiojobs. – user4815162342 May 27 '19 at 08:59
  • In either case, the StackOverflow q&a format is based on externally verifiable sources, so that's what the answer provides. To make it more practical, I also added a concrete implementation of the "roll your own" variant, which I would hope makes it as a "recipe" in the official docs. – user4815162342 May 27 '19 at 08:59

To limit concurrent connections you can use aiohttp.TCPConnector, or aiohttp.ProxyConnector if you are using a proxy. Just create it for the session instead of using the default.

# limit concurrent connections for direct requests
aiohttp.ClientSession(
    connector=aiohttp.TCPConnector(limit=1)
)

# or, when going through a proxy
aiohttp.ClientSession(
    connector=aiohttp.ProxyConnector.from_url(proxy_url, limit=1)
)
Quietude
    The question is about limiting concurrent connections on the _server_ implementation. Your answer is about limits on the client session, which the OP is aware of, and mentions it in the question. – user4815162342 Aug 19 '21 at 17:54