
App running on App Engine standard environment, Python 3.7, with Cloud SQL (MySQL).

Checking the logs, some requests show very high latencies (more than 4 seconds), when around 800 ms is expected. All of these log entries are accompanied by this message:

"This request caused a new process to be started for your application, and thus caused your application code to be loaded for the first time. This request may thus take longer and use more CPU than a typical request for your application."

I understand that "a new process" refers to the startup of a new instance (since I use automatic scaling). The strange thing, however, is that when I compare these log entries against the instance startups, in some cases they match but in others they do not.

My question is, how can these latencies be reduced?

The app engine config is:

runtime: python37
env: standard
instance_class: F1
handlers:
  - url: /static/(.*)
    static_files: static/\1
    require_matching_file: false
    upload: static/.*
  - url: /.*
    script: auto
    secure: always
  - url: .*
    script: auto
automatic_scaling:
  min_idle_instances: automatic
  max_idle_instances: automatic
  min_pending_latency: automatic
  max_pending_latency: automatic
network: {}
ozo
  • Without seeing detailed Stackdriver logs and the app.yaml, this message means Cold Start for your app. Do you have autoscaling to zero enabled? Update your question with more information. – John Hanley Oct 22 '19 at 17:13
  • @JohnHanley Edited. In the trace I do not see much useful information, since the timeline graph only shows a bar with the latency >3 seconds, and the info says that the HTTP method was POST and the HTTP status code was 200 – ozo Oct 22 '19 at 17:35
  • @ozo did you ever get to the bottom of this? – Alex Fox Apr 19 '20 at 17:08

2 Answers


As you note, these slower requests happen whenever App Engine needs to start a new instance for your application, because the initial load is slow (these are called "loading requests").

However, App Engine does provide a way to use "warmup" requests -- basically, dummy requests to your application that start instances in advance of when they are actually needed. This can reduce, but not eliminate, the user-facing loading requests.

This can slightly increase your costs, but it should reduce the loading-request latency, since these dummy requests are the ones that eat the cost of starting a new instance.

In the Python 3.7 runtime, you can add a "warmup" element to the inbound_services directive in app.yaml:

inbound_services:
- warmup

This causes App Engine to send a request to /_ah/warmup whenever it starts a new instance; there, if you want, you can do any other initialization the instance needs (e.g. starting a DB connection pool). A sketch of such a handler follows.
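For illustration, here is a minimal warmup handler, assuming a Flask app in main.py (the lazy pool initializer below is hypothetical, not part of the original answer):

from flask import Flask

app = Flask(__name__)

# Hypothetical module-level pool; replace with a real Cloud SQL /
# SQLAlchemy connection pool for your app.
db_pool = None

def init_db_pool():
    global db_pool
    if db_pool is None:
        db_pool = object()  # stand-in for creating the real pool
    return db_pool

@app.route('/_ah/warmup')
def warmup():
    # App Engine hits this endpoint when it spins up a new instance,
    # so one-time setup runs before real user traffic arrives.
    init_db_pool()
    return '', 200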

robsiemb

There are more strategies that may help you decrease the latencies of your application.

You can tune your automatic_scaling options (currently all set to automatic) to values that suit your app better, as in the sketch below.
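Keeping a minimum number of idle instances around avoids most cold starts, at the price of paying for them while they sit idle. A sketch with illustrative values, not a recommendation:

automatic_scaling:
  min_idle_instances: 1       # keep a spare instance ready for traffic spikes
  max_idle_instances: 2
  min_pending_latency: 100ms  # start new instances sooner when requests queue
  max_pending_latency: 500ms  # upper bound on queueing before scaling out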

You can manage your bandwidth better by setting an appropriate Cache-Control header on your responses and setting reasonable expiration times for static files.

Using public Cache-Control headers in this way allows proxy servers and your clients' browsers to cache responses for the designated period of time; see the sketch below.
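For example, assuming a Flask app (the endpoint and the max-age value are illustrative):

from flask import Flask, make_response

app = Flask(__name__)

@app.route('/report')
def report():
    response = make_response('cacheable payload')
    # Allow browsers and intermediate proxies to cache for 10 minutes.
    response.headers['Cache-Control'] = 'public, max-age=600'
    return response

For the files served by your /static/ handler, adding an expiration element to that handler in app.yaml (e.g. expiration: "1d") sets the equivalent lifetime.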

You can use a bigger instance class, like F2, to make horizontal scaling happen less often. As I understood from this issue, your latencies increase mostly while new instances are being started.
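In app.yaml this is a one-line change; F2 doubles the memory and CPU of the default F1:

instance_class: F2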

You can also enable concurrent requests and write your code as asynchronously as you can.
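In the Python 3.7 runtime, concurrency comes from the WSGI server you declare in the entrypoint, plus the scheduler's max_concurrent_requests setting. A sketch with illustrative values (keep worker/thread counts modest on an F1's limited memory):

entrypoint: gunicorn -b :$PORT --workers 1 --threads 8 main:app
automatic_scaling:
  max_concurrent_requests: 20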

Andrei Tigau