4

I'm using the Python module ratelimit to throttle a function that calls a REST API. I need to apply throttling based on the request method, e.g. 1 call per 10 s for PUT/POST/DELETE and 5 calls per 1 s for GET. How can I achieve this without breaking the function into two?

from ratelimit import limits, sleep_and_retry

@sleep_and_retry
@limits(calls=1 if method != 'GET' else 5, period=10 if method != 'GET' else 1)
def callrest(method, url, data):
    ...

Is it possible to do this?

0x263A
fluter

3 Answers

7

You don't have to re-invent the wheel by creating your own rate limiter when ratelimit already works well.

To apply different rate limits based on the method argument, write a decorator that creates two ratelimit.limits-decorated versions of the given function: one decorated with the arguments needed for GET requests, and the other with those needed for non-GET requests. The decorator then returns a wrapper function that dispatches to one of the two decorated functions according to the value of the method argument:

from ratelimit import limits, sleep_and_retry

def limits_by_method(func):
    def wrapper(method, *args, **kwargs):
        return (get if method == 'GET' else non_get)(method, *args, **kwargs)
    get = limits(calls=5, period=1)(func)
    non_get = limits(calls=1, period=10)(func)
    return wrapper

@sleep_and_retry
@limits_by_method
def callrest(method, url, data):
    ...

Demo: https://replit.com/@blhsing/ExoticWretchedAdvance

blhsing
  • This is the answer, but I would swap the `def wrapper()` definition with the limits declaration, so it will be `get = .... , non_get = ..., def wrapper(..)..., return wrapper` – foobarna Apr 02 '23 at 11:11
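For reference, the reordering foobarna suggests behaves identically; the sketch below uses a minimal, deliberately simplistic stdlib stand-in for ratelimit.limits (one that sleeps instead of raising, so sleep_and_retry is not needed) purely so it runs without the package installed. The real decorator drops in the same way:

```python
import time
from collections import deque

def limits(calls, period):
    # Stand-in for ratelimit.limits: sleeps until a slot frees up
    # instead of raising, so no sleep_and_retry is needed here.
    def decorator(func):
        stamps = deque()
        def limited(*args, **kwargs):
            now = time.monotonic()
            # drop timestamps that have aged out of the window
            while stamps and stamps[0] + period <= now:
                stamps.popleft()
            if len(stamps) >= calls:
                # wait until the oldest call leaves the window
                time.sleep(stamps[0] + period - now)
            stamps.append(time.monotonic())
            return func(*args, **kwargs)
        return limited
    return decorator

def limits_by_method(func):
    # Limits are created once, up front, before the wrapper is defined.
    get = limits(calls=5, period=1)(func)
    non_get = limits(calls=1, period=10)(func)

    def wrapper(method, *args, **kwargs):
        return (get if method == 'GET' else non_get)(method, *args, **kwargs)

    return wrapper

@limits_by_method
def callrest(method, url, data):
    return method

start = time.monotonic()
results = [callrest('GET', None, None) for _ in range(6)]
elapsed = time.monotonic() - start
# the sixth GET must wait until the first one is a full second old
```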
3

Here is a "rate limiter" that can be used without a decorator, so you can simply create two separate rate limiters and use one or the other depending on the method being called. An instance of this class can be used across multiple threads, since internally it uses a lock to keep its state consistent. You can also use a "managed" version of this class across multiple processes.

First the class:

from multiprocessing.managers import BaseManager
from collections import deque
from threading import Lock
import time

class RateLimiter:
    def __init__(self, call_count, period=1.0):
        self._call_count = int(call_count)
        self._period = float(period)
        self._called_timestamps = deque()
        self._lock = Lock()

    def throttle(self):
        with self._lock:
            while True:
                now = time.monotonic()
                while self._called_timestamps:
                    time_left = self._called_timestamps[0] + self._period - now
                    if time_left >= 0:
                        break
                    self._called_timestamps.popleft()
                if len(self._called_timestamps) < self._call_count:
                    break
                time.sleep(time_left)
            self._called_timestamps.append(now)

# A "managed" RateLimiter is required for use with multiprocessing:
class RateLimiterManager(BaseManager):
    pass

RateLimiterManager.register('RateLimiter', RateLimiter)

And then your code would be modified as follows:

get_rate_limiter = RateLimiter(5, 1.0)
put_post_delete_rate_limiter = RateLimiter(1, 10.0)

def callrest(method, url, data):
    rate_limiter = get_rate_limiter if method == 'GET' else put_post_delete_rate_limiter
    rate_limiter.throttle()
    ...
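To sanity-check the thread-safety claim, here is a small self-contained demo (the RateLimiter body is repeated verbatim from above so the snippet runs on its own; the 2-calls-per-0.5 s figures are purely illustrative):

```python
from collections import deque
from threading import Lock, Thread
import time

class RateLimiter:
    def __init__(self, call_count, period=1.0):
        self._call_count = int(call_count)
        self._period = float(period)
        self._called_timestamps = deque()
        self._lock = Lock()

    def throttle(self):
        with self._lock:
            while True:
                now = time.monotonic()
                while self._called_timestamps:
                    time_left = self._called_timestamps[0] + self._period - now
                    if time_left >= 0:
                        break
                    self._called_timestamps.popleft()
                if len(self._called_timestamps) < self._call_count:
                    break
                time.sleep(time_left)
            self._called_timestamps.append(now)

limiter = RateLimiter(2, 0.5)   # at most 2 calls per half second
stamps = []

def worker():
    limiter.throttle()
    stamps.append(time.monotonic())

start = time.monotonic()
threads = [Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start
# With 4 calls at 2-per-0.5 s, the later calls must wait roughly 0.5 s.
```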

General Note About Rate Limiters

Suppose you have a function foo that calls some web service but cannot exceed 2 calls per second, just as an example, and suppose you were able to use the PyPI ratelimit package. Then your code would look something like the following:

@sleep_and_retry
@limits(calls=2, period=1)
def foo():
    do_some_calculations()
    call_web_service()
    do_some_more_calculations()

The rate limiter will (or at least should) ensure that foo cannot be called more than twice in any one-second interval. But invoking foo is not the same thing as calling the web service: some time elapses between the invocation of foo and the actual call to the web service, and this elapsed time is, in principle, variable. This means we cannot be 100% sure that the web service itself is not being invoked a third time within some one-second window.

Now, the web service may well build some degree of tolerance for this possibility into its rules. If not, it seems to me that one would want to err on the side of caution and use a slightly larger period. For example, we might want to use:

@sleep_and_retry
@limits(calls=2, period=1.1)
def foo():
    do_some_calculations()
    call_web_service()
    do_some_more_calculations()

This is why I consider it preferable to put the throttling closer to the actual web-service call (as my solution allows) rather than decorating the enclosing function, since this should reduce the variability in timing to some degree.
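As a minimal sketch of that placement (the gap-based throttle below is a deliberately simplistic stand-in for the RateLimiter class above, and do_some_calculations/call_web_service are stub placeholders):

```python
import time

MIN_INTERVAL = 0.5      # illustrative: at most 1 call per 0.5 s
_last_call = [0.0]

def throttle():
    # simplest possible limiter: enforce a minimum gap between calls
    wait = _last_call[0] + MIN_INTERVAL - time.monotonic()
    if wait > 0:
        time.sleep(wait)
    _last_call[0] = time.monotonic()

def do_some_calculations():
    time.sleep(0.01)    # placeholder for work of variable duration

def call_web_service():
    return 'ok'         # placeholder for the real request

def foo():
    do_some_calculations()
    throttle()                 # throttle right before the request...
    result = call_web_service()
    do_some_calculations()     # ...so the variable-length work around it
    return result              # no longer skews the effective call rate

start = time.monotonic()
results = [foo() for _ in range(3)]
elapsed = time.monotonic() - start
```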

I welcome any comments on this issue.

Example Using Multiprocessing

Here is how you would use the RateLimiter class if callrest were a multiprocessing worker function:

from collections import deque
from threading import Lock
import time

class RateLimiter:
    ... # class code omitted for brevity

def init_pool_processes(*args):
    global get_rate_limiter
    global put_post_delete_rate_limiter

    get_rate_limiter, put_post_delete_rate_limiter = args

def callrest(method, url, data):
    rate_limiter = get_rate_limiter if method == 'GET' else put_post_delete_rate_limiter
    rate_limiter.throttle()
    return time.time()

# A "managed" RateLimiter is required for use with multiprocessing:
from multiprocessing.managers import BaseManager

class RateLimiterManager(BaseManager):
    pass

if __name__ == '__main__':
    from multiprocessing import Pool

    RateLimiterManager.register('RateLimiter', RateLimiter)
    with RateLimiterManager() as manager:
        get_rate_limiter = manager.RateLimiter(5, 1.0)
        put_post_delete_rate_limiter = manager.RateLimiter(1, 10.0)
        pool = Pool(10, initializer=init_pool_processes, initargs=(get_rate_limiter, put_post_delete_rate_limiter))
        results = [pool.apply_async(callrest, args=('GET', None, None)) for _ in range(10)]
        for result in results:
            print(result.get())
        pool.close()
        pool.join()

Prints:

1680525645.944291
1680525645.9452908
1680525645.9452908
1680525645.9452908
1680525645.9452908
1680525646.960845
1680525646.960845
1680525646.9618495
1680525646.9648433
1680525646.9658465
Booboo
1

You can wrap ratelimit's decorators in a parameterized decorator factory and define a separately throttled function for each group of methods. Example:

from ratelimit import limits, sleep_and_retry

def method_rate_limits(calls, period):
    def decorator(func):
        @sleep_and_retry
        @limits(calls=calls, period=period)
        def wrapper(*args, **kwargs):
            return func(*args, **kwargs)
        
        return wrapper
    return decorator

@method_rate_limits(calls=1, period=10)
def callrest(method, url, data):
    ...

@method_rate_limits(calls=5, period=1)
def callrest_get(url, data):
    ...
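One caveat of this approach is that callers must pick the correctly throttled function themselves; a small dispatcher can hide that choice. The sketch below uses stub bodies in place of the two decorated functions above, purely so it runs without ratelimit installed:

```python
def callrest(method, url, data):
    # stub standing in for the @method_rate_limits(calls=1, period=10) version
    return ('non-get', method)

def callrest_get(url, data):
    # stub standing in for the @method_rate_limits(calls=5, period=1) version
    return ('get', url)

def callrest_any(method, url, data):
    # route each call to the function carrying the appropriate rate limit
    if method == 'GET':
        return callrest_get(url, data)
    return callrest(method, url, data)
```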
Dmitrii Malygin