I am currently working on building a load testing platform with Google App Engine. My eventual goal is to simulate 1 million users sending data to another GAE server application every 10 minutes.
My current implementation uses task queues, where each task represents a user or a handful of users. My problem is that GAE is throttling my task queues with an enforced rate well below my maximum/desired rate. I have tried simply throwing more instances at the problem, and while this helps, I still end up with an enforced rate well below the desired one.
However, I know that my application is capable of running tasks faster than the enforced rate. I have witnessed my application successfully running 250+ tasks per second for a period of time, only to have the task queue throttled to 60 or 30 tasks per second a minute later.
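For a sense of scale, here is a back-of-the-envelope sketch of the task rates involved. The 1 million users, the 10-minute window, and the 250/60/30 tasks-per-second figures come from the numbers above; the `users_per_task` values are purely illustrative assumptions, since the batch size is only "a user or handful of users".

```python
import math

USERS = 1_000_000          # target number of simulated users
WINDOW_SECONDS = 10 * 60   # every user sends data once per 10 minutes


def required_task_rate(users_per_task: int) -> float:
    """Tasks per second needed to cover all users within one window."""
    tasks = math.ceil(USERS / users_per_task)
    return tasks / WINDOW_SECONDS


def drain_minutes(users_per_task: int, tasks_per_second: float) -> float:
    """Minutes needed to run one full round of tasks at a given rate."""
    tasks = math.ceil(USERS / users_per_task)
    return tasks / tasks_per_second / 60


if __name__ == "__main__":
    # users_per_task values are hypothetical batch sizes for illustration.
    for users_per_task in (1, 10):
        print(f"users_per_task={users_per_task}: "
              f"need {required_task_rate(users_per_task):.1f} tasks/s")
        # Observed rates: 250 tasks/s before throttling, 60 or 30 after.
        for rate in (250, 60, 30):
            print(f"  at {rate} tasks/s one round takes "
                  f"{drain_minutes(users_per_task, rate):.1f} min")
```

With one user per task, the target works out to roughly 1,667 tasks per second, so even the 250 tasks per second I saw briefly is far short of it. With an assumed 10 users per task, 250 tasks per second would finish a round inside the 10-minute window, but the throttled rates of 60 or 30 would not.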
I am using basic scaling with a cap of 10 instances for now, and I would like to understand this problem better before increasing the instance count, as costs start to run up quite quickly with a high instance count.
Does anyone have more information on why I am being throttled like this, and how to get around it? The only documentation, information, or answers I can find on this question simply quote the documentation, which is insufficient and says something like:
"The enforced rate may be decreased when your application returns a 503 HTTP response code, or if there are no instances able to execute a request for an extended period of time."
I am happy to clarify anything that is unclear. Thank you in advance for your help; I have been wrestling with this problem for about a month.