The Google App Engine documentation states the following two things:
1- App Engine reserves automatic scaling capacity for applications with low latency, where the application responds to requests in less than one second. (https://cloud.google.com/appengine/docs/standard/python3/how-requests-are-handled)
2- App Engine tasks have specific timeouts that depend on the scaling type of the service that's running them. For worker services running in the standard environment: Automatic scaling: task processing must finish in 10 minutes. (https://cloud.google.com/tasks/docs/creating-appengine-handlers)
To comply with (1), I'm writing APIs that defer any time-consuming processing to a Cloud Tasks task, which allows my API to respond to all client requests in less than a second and benefit from auto-scaling.
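To make the deferral pattern concrete, here is a minimal sketch of how the API side might enqueue the work. It assumes the `google-cloud-tasks` client library; the `/tasks/process` path, the `worker` service name, and the `build_task` and `enqueue` helpers are hypothetical names chosen for illustration.

```python
import json


def build_task(payload, relative_uri="/tasks/process", service="worker"):
    """Build a Cloud Tasks task dict that targets an App Engine handler.

    The URI and service name are hypothetical defaults for this sketch.
    """
    return {
        "app_engine_http_request": {
            # http_method is omitted on purpose: Cloud Tasks defaults to POST.
            "relative_uri": relative_uri,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps(payload).encode("utf-8"),
            # Route the task to a specific App Engine service.
            "app_engine_routing": {"service": service},
        }
    }


def enqueue(payload, project, location, queue):
    """Hand the slow work to Cloud Tasks so the API can return quickly."""
    # Imported lazily so build_task can be used without the client library.
    from google.cloud import tasks_v2

    client = tasks_v2.CloudTasksClient()
    parent = client.queue_path(project, location, queue)
    return client.create_task(parent=parent, task=build_task(payload))
```

The client-facing endpoint would call `enqueue(...)` and return immediately, so only the enqueue call counts toward its response time.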
But I also have to write the handlers that will perform the time-consuming processing when the Cloud Tasks service requests it. I want these handlers to run on GAE as well, ideally with auto-scaling too.
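For reference, the kind of handler I have in mind looks roughly like the sketch below. It is a plain WSGI callable with no framework; `process` is a hypothetical stand-in for the slow work, and the check relies on the `X-AppEngine-QueueName` header that Cloud Tasks adds to requests it sends to App Engine handlers.

```python
import json


def process(job):
    """Hypothetical stand-in for the time-consuming work."""
    return {"job_id": job.get("job_id"), "status": "done"}


def app(environ, start_response):
    """Minimal WSGI task handler (a sketch, not a complete service)."""
    # WSGI exposes the X-AppEngine-QueueName header under this key.
    if "HTTP_X_APPENGINE_QUEUENAME" not in environ:
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"not a Cloud Tasks request"]

    # Read and decode the JSON body that the API side enqueued.
    length = int(environ.get("CONTENT_LENGTH") or 0)
    job = json.loads(environ["wsgi.input"].read(length) or b"{}")
    result = process(job)

    # A 2xx status marks the task as done; other statuses lead to a retry.
    start_response("200 OK", [("Content-Type", "application/json")])
    return [json.dumps(result).encode("utf-8")]
```

This handler can legitimately take far longer than one second to respond, which is exactly what my question is about.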
I do not see any formal declaration that would let the GAE service know whether a given part of my app is a client-facing API or a handler for tasks queued in Cloud Tasks. Therefore I do not understand how GAE reconciles the two requirements I quoted.
I'm worried that my task handlers will degrade my app's latency metrics and, as a result, prevent auto-scaling.
Should I deploy these handlers in a different project? But then, how do I get auto-scaling for them if they take more than one second to complete a task?