Problem

I need to execute HTTP requests while simulating high latency. I came across Python's Twisted package, which includes both an HTTP client and a ThrottlingFactory. However, the documentation is not clear to a newcomer, and I cannot work out how to use the ThrottlingFactory for API calls made with the HTTP client.

I am currently using the following example code to experiment, but nothing has worked so far.

from sys import argv
from pprint import pformat

from twisted.internet.task import react
from twisted.web.client import Agent, readBody
from twisted.web.http_headers import Headers


def cbRequest(response):
    print("Response version:", response.version)
    print("Response code:", response.code)
    print("Response phrase:", response.phrase)
    print("Response headers:")
    print(pformat(list(response.headers.getAllRawHeaders())))
    d = readBody(response)
    d.addCallback(cbBody)
    return d


def cbBody(body):
    print("Response body:")
    print(body)


def main(reactor, url=b"http://httpbin.org/get"):
    agent = Agent(reactor)
    d = agent.request(
        b"GET", url, Headers({"User-Agent": ["Twisted Web Client Example"]}), None
    )
    d.addCallback(cbRequest)
    return d


react(main, argv[1:])

How can I use the ThrottlingFactory in this example?

Joe

1 Answer

You're right - this composition is awkward; it should be better documented and arguably have a nicer API!

Still, you can accomplish this by putting a proxy between your application and the reactor, so that every factory passed to connectTCP is wrapped in a ThrottlingFactory.

It would look like this:

from sys import argv
from pprint import pformat
from dataclasses import dataclass


from twisted.internet.task import react
from twisted.internet.interfaces import IReactorTCP
from twisted.web.client import Agent, readBody
from twisted.web.http_headers import Headers
from twisted.protocols.policies import ThrottlingFactory


def cbRequest(response):
    print("Response version:", response.version)
    print("Response code:", response.code)
    print("Response phrase:", response.phrase)
    print("Response headers:")
    print(pformat(list(response.headers.getAllRawHeaders())))
    d = readBody(response)
    d.addCallback(cbBody)
    return d


def cbBody(body):
    print("Response body length:")
    print(len(body))


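@dataclass
class SlowReactorProxy:
    """Reactor wrapper that throttles reads on every outgoing TCP connection."""

    original: IReactorTCP

    def __getattr__(self, name):
        # Delegate everything this proxy does not override to the real reactor.
        return getattr(self.original, name)

    def connectTCP(self, host, port, factory, timeout=30, bindAddress=None):
        # Wrap the caller's factory so reads on its connections are throttled.
        # readLimit is the number of bytes allowed per second.
        return self.original.connectTCP(
            host, port, ThrottlingFactory(factory, readLimit=0.1), timeout, bindAddress
        )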


def main(reactor, url=b"http://httpbin.org/bytes/10485760000"):
    agent = Agent(SlowReactorProxy(reactor))
    d = agent.request(
        b"GET", url, Headers({"User-Agent": ["Twisted Web Client Example"]}), None
    )
    d.addCallback(cbRequest)
    return d


react(main, argv[1:])

Unfortunately, ThrottlingFactory's throttling algorithm is quite primitive: a timer fires once per second and pauses reading on every connection if too much data was consumed during that second. You will therefore read at maximum speed, with no throttling at all, for up to a second at a time and then, having exceeded the quota, pause for a commensurately long period. On my (gigabit) network, I cannot get a large enough entity-body out of httpbin (the maximum size seems to be 102400 bytes) to keep data flowing for longer than a second, so no throttling ever takes place in this scenario.
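To make the once-per-second behaviour concrete, here is a rough pure-Python model of it. It is a simplified sketch and not Twisted's actual code: the function name `transfer_time`, its parameters, and the assumption that each check window restarts after a pause are all mine, chosen only for illustration. The pause length assumed here is `bytes_read / read_limit - 1` seconds, i.e. long enough to bring the average back down to the limit.

```python
def transfer_time(body_size, line_rate, read_limit):
    """Estimate seconds needed to read body_size bytes.

    line_rate  -- unthrottled bytes per second the network can deliver
    read_limit -- allowed bytes per second (hypothetical stand-in for readLimit)

    Simplified model: reads run at full speed for a one-second window,
    then a check pauses reading if the window exceeded read_limit.
    """
    elapsed = 0.0
    remaining = body_size
    while remaining > 0:
        # Read at full speed for at most one second.
        window_bytes = min(remaining, line_rate * 1.0)
        remaining -= window_bytes
        if remaining == 0:
            # The body finished before the next check could fire,
            # so this last window is never throttled.
            elapsed += window_bytes / line_rate
            break
        elapsed += 1.0
        if window_bytes > read_limit:
            # Pause long enough to average out to read_limit.
            elapsed += window_bytes / read_limit - 1.0
    return elapsed


# 3000 bytes at 1000 B/s with a 500 B/s limit: two throttled windows
# (1 s of reading + 1 s of pause each) and one final unthrottled second.
print(transfer_time(3000, 1000, 500))  # 5.0

# A 102400-byte body on a gigabit link (about 125,000,000 B/s) finishes
# well inside the first window, so the limit never takes effect.
print(transfer_time(102400, 125_000_000, 0.1))
```

In this model the second call returns a small fraction of a second regardless of how low the limit is set, which matches the observation that a small httpbin response is never throttled.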

Hopefully this will help you accomplish your task, but I'd encourage you to file a bug against Twisted so that composing HTTP with throttling becomes more graceful and responsive.

Glyph