1

I'm writing a high-throughput (>1000 requests per second) REST service in Rust, using reqwest to asynchronously connect to a load-balanced upstream service. To reduce the latency of requests to that upstream service, I'm using a long-lived reqwest::Client with a connection pool.

The challenge is that reqwest's long-lived connections "break" the load-balancing of the upstream service, since - as long as the number of connections in the pool is sufficient - no connections will be established to machines added to that upstream service to increase its capacity. This leads to my service over-utilizing some machines of that upstream service and under-utilizing others (my application is the main user of that upstream service, so I can't rely on other users of that service to balance my lopsided usage).

Is there any mechanism in reqwest to periodically close and re-establish connections, in order to ensure that they are balanced across all machines of the upstream service as evenly as possible (or any way to manually implement such behavior on top of reqwest)?

In the reqwest documentation, I've seen ClientBuilder::timeout() to ensure connections are closed and later re-established when upstream machines are no longer accessible. There's also ClientBuilder::pool_idle_timeout() to eventually close connections from that pool if they are idle for long enough. But I've found nothing on closing connections - either automatically or explicitly - that are behaving fine.
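
For reference, a minimal sketch of a client built with those options (the durations are illustrative, not my production values):

```rust
use std::time::Duration;

fn main() {
    // Sketch of the ClientBuilder options mentioned above;
    // the durations are illustrative placeholders.
    let client = reqwest::Client::builder()
        .timeout(Duration::from_secs(10))            // overall per-request timeout
        .pool_idle_timeout(Duration::from_secs(90))  // close pooled connections idle this long
        .build()
        .expect("failed to build reqwest client");
    let _ = client; // shared, long-lived Client with an internal connection pool
}
```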

Dreamer
  • As far as I know, reqwest is a UX layer on top of `hyper::Client`, which does the actual pooling, and hyper does not provide any way to interact with its connection pool; cf. e.g. [hyperium/hyper#1253](https://github.com/hyperium/hyper/issues/1253). Though from your requirements it looks like you could just drop the current client and create a new one every once in a while, no? The alternative is to use [the very low-level primitives of hyper](https://docs.rs/hyper/latest/hyper/client/conn/index.html). – Masklinn Mar 09 '23 at 12:00
  • @Masklinn good idea. It sounds a bit fishy to drop and replace a connection pool periodically, but as best I can tell the behavior would be practically indistinguishable from periodically re-establishing individual connections (as long as the old pool is kept around until all requests previously made through it have completed). – Dreamer Mar 09 '23 at 19:58
  • @Masklinn ... except that most connections will be refreshed at the same time, which means that at that time responses will all have a higher delay and thus the CPU (the service is otherwise CPU-bound) will be partially idle. This in turn may confuse the load metrics used to make cluster scaling decisions. I'll have to check if these issues materialize in practice or if they somehow even out. – Dreamer Mar 09 '23 at 20:42
  • Yeah I don't think it's great but I'm not sure there's a better solution short of contributing to or vendoring more flexible connection recycling in Hyper (and then Reqwest). The only other solution I could see short of doing that is what Kevin Reid suggests (or a variant thereof): go through a proxy and have the proxy kill the connections on a TTL or a number of uses or something. – Masklinn Mar 10 '23 at 06:49

2 Answers

2

To me, this sounds like a problem that should be solved in the upstream service. If it wants to re-distribute load over more machines, it should purposefully close connections so they can be reestablished to the bigger pool. (In particular, if it sees multiple connections originating from apparently the same source, close one of them.)

That way the service will balance its load whatever the client does — it will work well with any conformant HTTP client, not just one tweaked for it in particular.
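
As a minimal sketch of this idea for HTTP/1.1 (assuming an axum-based upstream; the framework, route, and the 1% close probability are illustrative, not something from the question), a handler can set `Connection: close` on a small fraction of responses, and hyper will then close the connection once the response has been sent:

```rust
use axum::{
    http::{header, HeaderValue},
    response::{IntoResponse, Response},
    routing::get,
    Router,
};

// Serve normally, but ask hyper to close roughly 1% of HTTP/1.1
// connections after the response; the affected client reconnects
// through the load balancer and may land on a newly added machine.
async fn handler() -> Response {
    let mut resp = "hello".into_response();
    if rand::random::<f64>() < 0.01 {
        resp.headers_mut()
            .insert(header::CONNECTION, HeaderValue::from_static("close"));
    }
    resp
}

#[tokio::main]
async fn main() {
    let app = Router::new().route("/", get(handler));
    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
    axum::serve(listener, app).await.unwrap();
}
```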

Kevin Reid
  • On the contrary, this would _break_ conformant HTTP clients, since they rightfully expect their requests to be _independent_ of each other - while with your proposal a request would fail if a concurrent one from the same client happens to reach the same machine of the upstream service. – Dreamer Mar 09 '23 at 19:45
  • @Dreamer No, an HTTP server is free to close a connection _after serving a request_ or _while not serving a request_ (such as due to an idle connection timeout) any time it likes. That doesn't cause any individual request to fail, and it's explicitly part of HTTP: [408](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/408) and `Connection: close`. The client is expected to handle this by resending the request on a new connection if it wishes to. – Kevin Reid Mar 09 '23 at 23:01
0

In the end, I've implemented the desired behavior on top of reqwest::Client:

I've implemented a pool of reqwest::Clients, where each Client is stored along with its creation timestamp. When an entry is requested from the pool, the pool checks the age of the available entries, disposes of those that are too old, and returns one of the remaining ones (or a new one if none remain).
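
A minimal sketch of that pool (type names, locking strategy, and TTL handling are illustrative):

```rust
use std::collections::VecDeque;
use std::sync::Mutex;
use std::time::{Duration, Instant};

// Pool of reqwest::Clients, each tagged with its creation time.
struct ClientPool {
    max_age: Duration,
    entries: Mutex<VecDeque<(Instant, reqwest::Client)>>,
}

impl ClientPool {
    fn new(max_age: Duration) -> Self {
        Self { max_age, entries: Mutex::new(VecDeque::new()) }
    }

    /// Returns a sufficiently young Client, disposing of expired entries.
    fn checkout(&self) -> (Instant, reqwest::Client) {
        let mut entries = self.entries.lock().unwrap();
        while let Some(entry) = entries.pop_front() {
            if entry.0.elapsed() < self.max_age {
                return entry;
            }
            // Entries that are too old fall out of scope here and are
            // dropped; their connections close once no longer in use.
        }
        (Instant::now(), reqwest::Client::new())
    }

    /// Returns a Client to the pool after use, unless it has expired.
    fn checkin(&self, entry: (Instant, reqwest::Client)) {
        if entry.0.elapsed() < self.max_age {
            self.entries.lock().unwrap().push_back(entry);
        }
    }
}
```

A request checks out an entry, performs the call through it, and checks it back in afterwards; entries that expire while checked out are dropped at check-in time, and their connections drain as in-flight requests complete.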

This isn't perfect, since it effectively reduces each reqwest::Client's connection pool to a single connection (each checked-out Client serves only one request at a time). But it adds the desired additional behavior, is simple and well encapsulated, and seems to work well in practice.

Dreamer