I am writing a Python client that sends requests to our in-house servers. These requests can go through different connections and need to be roughly balanced. So far so good: I have the logic to send requests to alternating backend connections.

My problem now is this: if one of those connections breaks down for any reason, a retry logic kicks in, and the retry always sends to the connection it was initialised with. This is buried so deep in the requests library that I do not see a way to implement load balancing here without reimplementing a huge chunk of it.

I tried subclassing HTTPSConnectionPool and HTTPAdapter, but any attempt to pass along the information about alternative connections (for example via a special URL scheme like "myhttp://host1!host2!.../path") breaks other parts of the requests library.

Am I missing something, or is there currently just no manageably easy way to have retry do load balancing?

I am using Python 3.7 and requests 2.24.0.

Hans
  • Did you find any solution? – Brijesh Bhatt Sep 24 '21 at 08:37
  • I went with subclassing the HTTPConnection and HTTPSConnection classes. These are then set as ConnectionCls in the HTTPConnectionPool and HTTPSConnectionPool. It is a bit of a dirty hack, but so far it has worked through all my testing. – Hans Sep 27 '21 at 06:25
  • Wow, can you please provide a sample reference for this? It would be very helpful. Thanks! – Brijesh Bhatt Sep 28 '21 at 07:42
  • Sorry for the late answer, I had a lot to do at first, then it slipped my mind. See below; with the character limit it does not fit in here. – Hans Dec 24 '21 at 08:50

1 Answer

Answering the question from the comments here because of the comment character limit. As said, this is a dirty hack with no guarantee that it works anywhere other than in my tool.

First, subclass HTTPConnection:

from itertools import cycle
from urllib.parse import urlparse

from urllib3.connection import HTTPConnection

class RoundRobinHttpConnection(HTTPConnection):
    """Subclass to establish an HTTP connection with round-robin reconnects."""

    # Shared by all instances; must be assigned before the first connection is created.
    hosts: cycle

    def __init__(self, *args, **kwargs):
        """
        Initialise the connection from a predefined list.

        This init function is called every time a new connection needs to be established,
        for example when a connection to a new server is requested. That server will not be used!
        Instead we select a server from the predefined list.
        The same happens when a connection has ended for any reason: the next host from
        the list is selected.

        This class will not(!) be reinitialised for standing connections to a host requested from the
        connection pool.
        """
        host = next(self.hosts)
        parsed_url = urlparse(host)
        # Override the host and port that the pool asked for.
        kwargs['host'] = parsed_url.hostname
        kwargs['port'] = parsed_url.port
        super().__init__(*args, **kwargs)

The same applies to HTTPSConnection.
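The host-selection step can be tried in isolation without urllib3. The following is only a minimal sketch of that mechanism: `FakeConnection` is a hypothetical stand-in for the real connection class, and the host URLs are made up for illustration.

```python
from itertools import cycle
from urllib.parse import urlparse


class FakeConnection:
    """Hypothetical stand-in for urllib3's HTTPConnection; it only stores host and port."""

    def __init__(self, host, port=None):
        self.host = host
        self.port = port


class RoundRobinConnection(FakeConnection):
    # Shared by all instances; must be assigned before the first instantiation.
    hosts: cycle

    def __init__(self, *args, **kwargs):
        parsed_url = urlparse(next(self.hosts))
        # Overwrite whatever host and port the caller asked for.
        kwargs["host"] = parsed_url.hostname
        kwargs["port"] = parsed_url.port  # None when the URL has no explicit port
        super().__init__(*args, **kwargs)


# Made-up server list, for illustration only.
RoundRobinConnection.hosts = cycle(["http://host1:8080", "http://host2:8081"])

# The requested host is ignored each time; the cycle decides.
picked = [RoundRobinConnection(host="requested.example", port=80) for _ in range(3)]
print([(c.host, c.port) for c in picked])
# [('host1', 8080), ('host2', 8081), ('host1', 8080)]
```

Note that the override only works when the caller passes `host` as a keyword argument, as the sketch does; a positional `host` would clash with the injected keyword and raise a `TypeError`.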

Then, in your session object, tell the connection pools to use your classes:

from itertools import cycle
from typing import List

from requests import Response, Session
from urllib3.connectionpool import HTTPConnectionPool

class MySession(Session):
    def __init__(self, server_names: List[str]):
        super().__init__()
        self.server_names = cycle(server_names)

        # ConnectionCls is a class attribute, so this applies to every HTTPConnectionPool.
        RoundRobinHttpConnection.hosts = self.server_names
        HTTPConnectionPool.ConnectionCls = RoundRobinHttpConnection

    def request(self, method: str, url: str, *args, **kwargs) -> Response:
        # The connection handler chooses its own server from the server list. So the used server
        # will not be the one given here unless only one server is provided. We still need to roll
        # over the servers to switch between the established connections.
        # Note: self.api_version is not defined in this excerpt; it is assumed to be set elsewhere.
        full_url = f"{next(self.server_names)}/{self.api_version}/{url}"
        return super().request(method, full_url, *args, **kwargs)

The same applies to HTTPS. That should be it; I hope I didn't miss any relevant parts.
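One detail worth seeing in isolation: the session and the connection class draw from the same `cycle` object, so both advance it. The sketch below only mimics that interplay with two plain functions (`build_request_url` and `open_connection` are hypothetical names, and the server list is made up); it assumes every request needs a fresh connection, which is not the case once pooled connections are reused.

```python
from itertools import cycle

# Made-up server list, for illustration only.
servers = cycle(["http://host1:8080", "http://host2:8080", "http://host3:8080"])
log = []


def build_request_url(path):
    # Mirrors the session's request(): takes the next server for the URL.
    url = f"{next(servers)}/{path}"
    log.append(("request", url))
    return url


def open_connection():
    # Mirrors the connection's __init__: takes the next server to connect to.
    host = next(servers)
    log.append(("connect", host))
    return host


for _ in range(2):
    build_request_url("status")
    open_connection()  # assumption: a new connection is needed for each request

for kind, target in log:
    print(kind, target)
# request http://host1:8080/status
# connect http://host2:8080
# request http://host3:8080/status
# connect http://host1:8080
```

This matches the comment in `request`: with more than one server, the host in the URL and the host actually connected to will usually differ.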

Hans