The requirement is to rate limit outgoing HTTP calls to an external service, based on a unique client identifier, in a highly scalable distributed microservice architecture.
Explanation
Suppose my service is named penguin and it calls an external service named raven. Multiple instances of penguin are running, but only one instance of raven. The HTTP request body sent from penguin to raven contains a parameter named clientId.
The list of clientId values is dynamic and can grow in the future.
penguin is called by an upstream service, say falcon.
So the flow is falcon -> penguin -> raven.
There is no rate limit between falcon and penguin.
raven can accept only one request per clientId every 5 seconds, which means any two HTTP requests with the same clientId must be at least 5 seconds apart. If a second request for the same clientId arrives within that 5-second window, penguin must hold it and send it only once the window has elapsed.
For example, penguin sends request R1 with clientId xyz to raven at 12:00:05 pm, and R2 arrives with the same clientId xyz at 12:00:08 pm. penguin must queue R2 and can only call raven at or after 12:00:10 pm. R2 cannot be dropped; it must wait.
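To make the rule concrete, here is a minimal single-process sketch of the timing logic only. It is not a proposed solution: the class name PerClientScheduler, the reserve method, and the in-memory dictionary are all hypothetical, and with multiple penguin instances the per-clientId state would have to live in some shared store rather than in local memory. Times are plain seconds for readability.

```python
from typing import Dict

MIN_GAP_SECONDS = 5.0  # from the requirement: one request per clientId every 5 seconds


class PerClientScheduler:
    """Hypothetical helper: tracks, per clientId, the earliest allowed send time."""

    def __init__(self, min_gap: float = MIN_GAP_SECONDS) -> None:
        self.min_gap = min_gap
        # Assumption: local dict; a real multi-instance setup needs shared state.
        self._next_allowed: Dict[str, float] = {}

    def reserve(self, client_id: str, now: float) -> float:
        """Reserve the next slot for client_id and return when the call may go out."""
        # Send immediately if the window has passed, otherwise wait for the slot.
        send_at = max(now, self._next_allowed.get(client_id, now))
        # The following request for this clientId must be at least min_gap later.
        self._next_allowed[client_id] = send_at + self.min_gap
        return send_at


scheduler = PerClientScheduler()
r1 = scheduler.reserve("xyz", now=5.0)  # R1 arrives at t=5s and is sent at once
r2 = scheduler.reserve("xyz", now=8.0)  # R2 arrives at t=8s and must wait for t=10s
r3 = scheduler.reserve("abc", now=8.0)  # a different clientId is not delayed
print(r1, r2, r3)
```

The caller would delay each request until the returned time instead of dropping it, which matches the "R2 must wait" requirement. The open question is how to keep this state consistent across all penguin instances.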
I am looking for an appropriate design for this problem in the context of a highly scalable microservice architecture.