The 429s and 500s should be the same pending queue timeout issue, which occurs when max clone is set to a low value and rapid scaling cannot expand the appserver pool when a spike of requests comes in. In this case, a small portion of requests can loop over all available appservers, be held pending for 10 seconds on the last attempt, and then be aborted with a 429 or 500 response.
Both the 429s and the 500s here are requests aborted after the pending queue timeout. When a request is aborted, the system checks whether the current unassigned_max_clones is 0. If it is 0, 429 is returned; otherwise, 500 is returned.
The 500s are essentially the same as the original 429s: some requests are aborted by the pending queue timeout, and after that some are dropped by queue discipline. (Whether the code is 429 or 500 currently depends on a number in the app's serving state, which may be confusing.)
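To make the rule above concrete, here is a minimal sketch of how the status code could be chosen for an aborted request. The function name and the way unassigned_max_clones is passed in are assumptions for illustration only; this is not the actual serving code.

```python
# Hypothetical sketch: only the 0 / non-0 rule comes from the description above.
PENDING_QUEUE_TIMEOUT_SECONDS = 10  # wait on the last attempt, per the text


def aborted_request_status(unassigned_max_clones: int) -> int:
    """Pick the HTTP status for a request aborted after the pending queue timeout.

    unassigned_max_clones is the number read from the app's serving state.
    """
    if unassigned_max_clones == 0:
        # No clones left to assign: the app is at its configured limit.
        return 429
    # Clones could still be assigned, but scaling did not keep up.
    return 500
```

For example, aborted_request_status(0) gives 429, while any non-zero value gives 500, which is why the two codes can come from the same underlying timeout.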
You can also check the Cloud Run troubleshooting documentation. The error "The request was aborted because there was no available instance" is returned as either HTTP 429 or HTTP 500:
- HTTP 429 (No available container instances): returned when the resource is unable to scale due to the user-configured max_instances.
- HTTP 500 (Cloud Run couldn't manage the rate of traffic): returned when the resource is unable to scale intrinsically due to the traffic.