Problem
Sometime overnight, my service began returning 504 errors on longer-running (30+ second) requests, despite no recent changes to the architecture.
Setup
- GCP Cloud Run configured with a 3600-second timeout
- GCP Load Balancer with serverless NEG pointing to Cloud Run
Troubleshooting
If I hit the Cloud Run-generated URL directly, 30+ second requests complete successfully. If I instead hit the public URL (and thus go through the load balancer), the longer requests time out.
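For anyone who wants to reproduce the comparison, here is a minimal sketch of how it could be scripted. It is not the exact test I ran: the function names `timed_request` and `compare`, the placeholder URLs in the usage comment, and the 120-second client timeout are all assumptions for illustration. It uses only the Python standard library and reports the HTTP status and elapsed time for each path.

```python
import time
import urllib.error
import urllib.request


def timed_request(url, timeout=120.0, opener=urllib.request.urlopen):
    """Send a GET to url and return (status, elapsed_seconds).

    status is the HTTP status code, or None if the request failed at the
    network level (connection error or client-side timeout).
    The opener parameter exists only so the function can be exercised
    without real network access.
    """
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout) as resp:
            resp.read()
            status = resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses (such as a 504 from the load balancer) land here.
        status = err.code
    except OSError:
        # Covers urllib.error.URLError and socket-level timeouts.
        status = None
    return status, time.monotonic() - start


def compare(direct_url, lb_url, **kwargs):
    """Time the same request against both paths and return both results."""
    return {
        "direct": timed_request(direct_url, **kwargs),
        "load_balancer": timed_request(lb_url, **kwargs),
    }


# Example usage (hypothetical URLs, replace with a slow endpoint of your own):
# results = compare(
#     "https://my-service-abc123-uc.a.run.app/slow",
#     "https://api.example.com/slow",
# )
# for path, (status, elapsed) in results.items():
#     print(f"{path}: status={status} after {elapsed:.1f}s")
```

If the symptom matches mine, the direct path should return 200 after the full request duration, while the load balancer path should return 504 at roughly the 30-second mark.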
Musings
Looking at Google Cloud release notes, there was a change to Load Balancing, but nothing related to timeouts: https://cloud.google.com/release-notes
The current setup has been working flawlessly for over a year.
Update
Modifying the backend timeout is disabled for serverless NEGs; see below:
Update 2
This looks like a bug in GCP load balancing introduced by the last update. The default timeout should be 60 minutes, not 30 seconds, as per the documentation: