We use Google Cloud Run to wrap an analysis developed in R behind a web API. For this, we have a small Fastify app that launches an R script and uploads the results to Google Cloud Storage. The process's stdout and stderr are written to a file, which is also uploaded at the end of the analysis.
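For reference, the handler looks roughly like the sketch below (route, file paths, and bucket name are placeholders, not our actual code):

```js
// Sketch of the current setup; analysis.R, the bucket, and paths are placeholders.
const fastify = require('fastify')({ logger: true });
const { spawn } = require('node:child_process');
const fs = require('node:fs');
const { Storage } = require('@google-cloud/storage');

const storage = new Storage();
const BUCKET = 'my-results-bucket'; // placeholder

fastify.post('/analyze', async (request, reply) => {
  const logPath = '/tmp/analysis.log';
  // Redirect both stdout and stderr of the R process into one log file.
  const out = fs.openSync(logPath, 'a');
  const child = spawn('Rscript', ['analysis.R'], { stdio: ['ignore', out, out] });

  // Wait for the analysis to finish.
  const exitCode = await new Promise((resolve) => child.on('close', resolve));
  fs.closeSync(out);

  // Upload the results and the combined log at the end of the analysis.
  await storage.bucket(BUCKET).upload('/tmp/results.csv', { destination: 'results.csv' });
  await storage.bucket(BUCKET).upload(logPath, { destination: 'analysis.log' });

  return { exitCode };
});

fastify.listen({ port: Number(process.env.PORT) || 8080, host: '0.0.0.0' });
```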
However, we sometimes run into issues when a process takes longer to execute than expected. In these cases, we fail to upload anything, and it's difficult to debug because stdout and stderr are "lost" on the instance. The only thing we see in the Cloud Run logs is this message:
> The request has been terminated because it has reached the maximum request timeout
Is there a recommended way to handle a request timeout?
In App Engine there used to be a descriptive error: `DeadlineExceededError` for Python and `DeadlineExceededException` for Java.
We are currently evaluating the following approach:
- Explicitly set Cloud Run's request timeout
- Provide the same value as an environment variable, so it's available to the container
- When receiving a request, start a timer that calls a "cleanup" function shortly before the timeout is reached
- In the cleanup function, stop the running analysis and upload the current stdout and stderr files to Cloud Storage (see the sketch after this list)
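A sketch of what we have in mind is below. It assumes the timeout is passed in via a `REQUEST_TIMEOUT_SECONDS` environment variable set at deploy time (e.g. `gcloud run deploy ... --timeout=3600 --set-env-vars=REQUEST_TIMEOUT_SECONDS=3600`); the variable name and the safety margin are our own choices, not anything Cloud Run provides:

```js
// Timer-based cleanup sketch; REQUEST_TIMEOUT_SECONDS and the 10-second
// safety margin are illustrative choices, not Cloud Run APIs.
const fastify = require('fastify')({ logger: true });
const { spawn } = require('node:child_process');
const fs = require('node:fs');
const { Storage } = require('@google-cloud/storage');

const storage = new Storage();
const BUCKET = 'my-results-bucket'; // placeholder

const SAFETY_MARGIN_MS = 10_000; // leave ~10s to upload logs before Cloud Run kills the request
const timeoutMs =
  Number(process.env.REQUEST_TIMEOUT_SECONDS || 3600) * 1000 - SAFETY_MARGIN_MS;

fastify.post('/analyze', async (request, reply) => {
  const logPath = '/tmp/analysis.log';
  const out = fs.openSync(logPath, 'a');
  const child = spawn('Rscript', ['analysis.R'], { stdio: ['ignore', out, out] });

  // Shortly before the Cloud Run timeout, stop the analysis so we can still upload.
  let timedOut = false;
  const timer = setTimeout(() => {
    timedOut = true;
    child.kill('SIGTERM'); // ask R to stop; could escalate to SIGKILL after a grace period
  }, timeoutMs);

  const exitCode = await new Promise((resolve) => child.on('close', resolve));
  clearTimeout(timer);
  fs.closeSync(out);

  // Upload whatever stdout/stderr we have, whether the run finished or was cut short.
  await storage.bucket(BUCKET).upload(logPath, { destination: 'analysis.log' });

  if (timedOut) {
    reply.code(504);
    return { error: 'analysis timed out; partial logs were uploaded' };
  }
  return { exitCode };
});

fastify.listen({ port: Number(process.env.PORT) || 8080, host: '0.0.0.0' });
```

The `timedOut` flag means the upload and the response happen exactly once, on the process's `close` event, regardless of whether the run finished normally or was killed by the timer.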
This feels a little complicated, so any feedback would be very appreciated.