Note: this is about an adaptively scaling SQS poller/subscriber, but I have tried to present it in slightly more abstract terms: just backpressure management and a mediator.
In the code, I skipped all locks, critical sections, semaphores, and try/finally blocks for clarity, but I'm fully aware of them.
Imagine I have an internal queue, a consumer, and a poller. From the perspective of this queue the poller is a producer, but it is really a mediator between the consumer and some external source (the SQS mentioned above).
# A smaller internal queue is better,
# as all items sitting in it are in a kind of limbo.
queue = Queue(32)

def consume():
    while True:
        # this may block if the queue is empty
        item = queue.dequeue()
        # handle() stands for the actual per-item work
        handle(item)

def poll():
    running += 1
    while True:
        if running > expected:
            break
        items = poll_batch()
        for item in items:
            # this may block if the queue is full
            queue.enqueue(item)
    running -= 1
Now, poll_batch has a natural limitation: it always returns up to 10 items and has network-related overhead. So if a request round trip takes 100 ms, a single poll loop has a natural maximum of 100 items per second, and there is no way around it. To address that problem, I can start additional poll loops (if it goes too slowly) or stop existing ones (if it goes too fast).
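The per-loop ceiling is just batch size divided by round-trip time, so the number of poll loops needed for a given target rate follows directly. A minimal sketch of that arithmetic; the function names and the default values (10 items, 100 ms) are assumptions taken from the example figures, not measured numbers:

```python
import math


def max_rate_per_poller(batch_size: int, round_trip_s: float) -> float:
    """Upper bound on items/second for one sequential poll loop."""
    return batch_size / round_trip_s


def pollers_needed(target_rate: float,
                   batch_size: int = 10,
                   round_trip_s: float = 0.1) -> int:
    """Smallest number of poll loops whose combined ceiling reaches target_rate."""
    per_poller = max_rate_per_poller(batch_size, round_trip_s)
    # Always keep at least one loop running, even for a zero target.
    return max(1, math.ceil(target_rate / per_poller))
```

For example, a consumer that could handle 250 items per second would need at least 3 poll loops under these assumptions. This only gives an upper bound per loop; batches that come back partially full lower the real rate.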
For example, I can imagine a supervisor:
def supervisor():
    while True:
        sleep(1)
        decision = do_i_need_more_or_less()  # -1, 0, 1
        expected = running + decision
        if expected > running:
            new_thread(poll)
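For reference, the shared `running` / `expected` bookkeeping from the pseudocode could look roughly like this once the lock is put back. This is only a sketch under assumptions: `PollerScaler`, `min_pollers`, and `max_pollers` are hypothetical names, and slots are reserved inside `apply()` (instead of `running += 1` inside the poll loop) so that a slow-starting thread is not counted as missing.

```python
import threading


class PollerScaler:
    """Shared counters for the poll loops (hypothetical helper)."""

    def __init__(self, min_pollers: int = 1, max_pollers: int = 16):
        self._lock = threading.Lock()
        self.min_pollers = min_pollers
        self.max_pollers = max_pollers
        self.running = 0
        self.expected = min_pollers

    def apply(self, decision: int) -> int:
        """Apply -1/0/+1 from the supervisor; return how many loops to start."""
        with self._lock:
            target = self.running + decision
            # Clamp so the pool never drops to zero or grows without bound.
            self.expected = max(self.min_pollers,
                                min(self.max_pollers, target))
            to_start = max(0, self.expected - self.running)
            # Reserve the slots now; the caller must start exactly this many.
            self.running += to_start
            return to_start

    def try_exit(self) -> bool:
        """Called by a poll loop before each batch; True means 'stop now'.

        The check and the decrement share one lock, so two loops cannot
        both exit for a single surplus slot.
        """
        with self._lock:
            if self.running > self.expected:
                self.running -= 1
                return True
            return False
```

The supervisor would call `apply(decision)` once per tick and start that many threads; each poll loop would call `try_exit()` where the pseudocode checks `running > expected`.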
So now, the question is: how to implement do_i_need_more_or_less(), what data may it have, and what data does it need?
Theoretically it is simple: we need to balance the consumption rate with the production rate. But when I get into the actual implementation, there are a few problems:
The actual consumption rate is never higher than the production rate, as the consumer cannot consume more than was produced (OK, there is a buffer between them, so briefly it can, but this internal queue is intentionally small; I would even say a queue of length 1 would be ideal). So we need the "potential consumption rate" (not the "actual" one), which we can measure only if we ramp the production rate up above that potential limit, but then the internal queue fills up and all pollers get blocked. But how to detect such blocking? How to balance if 3 pollers were blocked and 2 pollers were not? What does it mean? Does it matter if one was blocked with 1 item waiting and another with 7 items waiting? Probably. What about the other way around (when one had 1 item of headroom and another had 5 items of headroom)?
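To make the "how to detect such blocking" part concrete, one possible signal is the wall-clock time each poller spends waiting inside the enqueue call, summed over all pollers per supervisor tick. A minimal sketch using the standard `queue.Queue`; `BlockedTimeMeter`, `decide`, and the `high` / `low` thresholds are hypothetical and would need tuning, and this is one candidate signal rather than a known-good answer:

```python
import queue
import threading
import time


class BlockedTimeMeter:
    """Accumulates how long producers waited inside put()."""

    def __init__(self):
        self._lock = threading.Lock()
        self._blocked_s = 0.0

    def timed_put(self, q: queue.Queue, item, clock=time.monotonic):
        start = clock()
        q.put(item)  # blocks while the queue is full
        waited = clock() - start
        with self._lock:
            self._blocked_s += waited

    def drain(self) -> float:
        """Return blocked seconds since the last call and reset the total."""
        with self._lock:
            value, self._blocked_s = self._blocked_s, 0.0
        return value


def decide(blocked_s: float, interval_s: float, pollers: int,
           high: float = 0.5, low: float = 0.05) -> int:
    """Map the blocked-time fraction to -1, 0 or +1 (thresholds are guesses)."""
    fraction = blocked_s / (interval_s * max(pollers, 1))
    if fraction > high:
        return -1  # pollers mostly wait on a full queue: too many of them
    if fraction < low:
        return 1   # pollers almost never wait: the consumer may be starved
    return 0
```

Summing the time sidesteps the "3 blocked, 2 not" question to some degree: a poller blocked with 7 items pending simply contributes more waiting time than one blocked with 1 item. A low blocked fraction alone does not prove the consumer is starved, so the `+1` branch is the weaker half of this signal.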
I can say that with a quite large internal queue it might be easy: with the internal queue limit set to 100000 items, I can easily spot whether the depth is going down (I can increase the polling rate) or going up (the polling rate is too high). But with a queue of size 32 this does not really work: the depth changes too quickly and jumps up and down.
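If the depth of the small queue is still to be used as a signal, a common way to tame the jumps is to sample it often and smooth the samples, for example with an exponentially weighted moving average. A minimal sketch, assuming a hypothetical `Ewma` helper and a hand-picked smoothing factor `alpha`; whether the smoothed depth of a 32-slot queue carries enough information is exactly the open question here:

```python
class Ewma:
    """Exponentially weighted moving average of a noisy signal."""

    def __init__(self, alpha: float):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.value = None  # no samples seen yet

    def update(self, sample: float) -> float:
        if self.value is None:
            # Seed with the first sample instead of starting from zero.
            self.value = float(sample)
        else:
            self.value += self.alpha * (sample - self.value)
        return self.value


# Usage: feed queue-depth samples taken by the supervisor.
depth = Ewma(alpha=0.5)
smoothed = [depth.update(s) for s in (32, 0, 32)]
```

A smaller `alpha` smooths more but reacts more slowly, so it trades jitter for lag in the scaling decision.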