
I am trying to migrate my Django/Python app from the Google App Engine standard environment to the flexible environment. The app had become slow and was constantly hitting a soft memory limit, with a message suggesting I upgrade to a larger instance class, but I was already at the highest instance class. My problem is that when I deploy, the build succeeds, but I keep getting this error during the "Updating service" step:

"Your deployment has failed to become healthy in the allotted time and therefore was rolled back. If you believe this was an error, try adjusting the 'app_start_timeout_sec' setting in the 'readiness_check' section."

I've tried adjusting the readiness_check section to allow more time, but it just takes longer to fail with the same error. I've also tried googling the issue and adding more memory, but the same error still appears. I'm stuck at this point and not sure where else to look.

Here is the app.yaml that successfully deployed on the standard environment

entrypoint: gunicorn -b :$PORT xpotools.wsgi

instance_class : F4_1G

automatic_scaling:
  min_instances: 5
  min_pending_latency: 30ms
  max_pending_latency: 4s
  max_concurrent_requests: 20
  min_idle_instances: 3

inbound_services:
- warmup

handlers:
- url: /static
  static_dir: static/
  secure: always

- url: /.*
  script: auto
  secure: always

- url: /_ah/warmup
  script: auto

Here is the app.yaml that I'm trying to deploy to the flexible environment

env: flex

runtime_config:
    python_version: 3.7

resources:
  cpu: 1
  memory_gb: 6
  disk_size_gb: 20

entrypoint: gunicorn -b :$PORT xpotools.wsgi 

Am I missing something?

Log from gcloud app deploy --verbosity=debug:

https://docs.google.com/document/d/1OLyqg5rQ4vJoXH3XyI476F5ImE0q6XlzoPPjCagS5E0/edit?usp=sharing&resourcekey=0-bpn9nVX_-SrLsYAHwtWPKA

  • Could you try to deploy your application with the flag "--verbosity=debug" and share the output please? – drauedo Aug 09 '21 at 11:47
  • Here is the last message from the log. I tried to post the whole log but it's far too long for a comment. [DEADLINE_EXCEEDED]: An internal error occurred while processing task /app-engine-flex/flex_await_healthy/flex_await_healthy>2021-08-09T13:49:52.487Z20758.ue.0: Your deployment has failed to become healthy in the allotted time and therefore was rolled back. If you believe this was an error, try adjusting the 'app_start_timeout_sec' setting in the 'readiness_check' section. – Jason Westover Aug 09 '21 at 14:11
  • Could you update your question with the full logs? – drauedo Aug 09 '21 at 14:19
  • I tried adding it in a code block but it was still too long. I added it in a link at the bottom of the post – Jason Westover Aug 09 '21 at 14:41
  • I suggest you check this [post](https://stackoverflow.com/a/58140716/9330208), which describes a problem similar to yours. In summary, it suggests disabling the health checks or customizing them. If that doesn't work, I think the best you can do is open a case with GCP [support](https://cloud.google.com/support-hub). – drauedo Aug 10 '21 at 10:24

1 Answer


You said that your app was exceeding the GAE standard memory.

At what point does your app start using a lot of memory? If your app starts consuming a lot of memory immediately at deployment (even before an HTTP request is received) then that could be the issue.
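One rough way to check whether memory is consumed at import time is to measure allocations while a module is imported. The sketch below is only an illustration, not part of the original setup: `measure_import_memory` is a hypothetical helper, and `json` is a stand-in for whichever module you suspect. Note that `tracemalloc` only sees allocations made through Python's allocator, so memory allocated inside native libraries may not show up, and a module that is already imported will report almost nothing.

```python
import importlib
import tracemalloc


def measure_import_memory(module_name):
    """Return (current, peak) bytes traced by tracemalloc while importing module_name.

    Hypothetical helper for illustration; it is not part of Django or App Engine.
    """
    tracemalloc.start()
    try:
        importlib.import_module(module_name)
        current, peak = tracemalloc.get_traced_memory()
    finally:
        # Always stop tracing so later code is not slowed down.
        tracemalloc.stop()
    return current, peak


if __name__ == "__main__":
    # "json" is only a stand-in; replace it with the module you suspect.
    current, peak = measure_import_memory("json")
    print(f"current={current / 1e6:.2f} MB, peak={peak / 1e6:.2f} MB")
```

Running this locally against the app's own modules, one at a time, can help narrow down which import is responsible for a large startup footprint.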

I don't fully understand the issue, but GAE flex spins up a lot of workers at deployment, and I suspect each of these workers takes up a significant amount of memory, and combined they exceed your memory limits.

Try updating your app so that memory is consumed at a later time, such as after receiving a first HTTP request. That fixed a similar issue for me.
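A minimal sketch of that idea, assuming the memory is taken by some object built at module import time: wrap the construction in a cached function so it only runs on first use. The names `_load_expensive_resource`, `get_resource`, and `handle_request` are hypothetical placeholders, and the dictionary stands in for whatever is actually expensive in your app.

```python
import functools


def _load_expensive_resource():
    # Hypothetical stand-in for the object that consumes a lot of memory
    # (for example a large model or lookup table).
    return {i: str(i) for i in range(1000)}


@functools.lru_cache(maxsize=None)
def get_resource():
    """Build the resource on the first call and reuse it afterwards."""
    return _load_expensive_resource()


def handle_request(key):
    # Hypothetical request handler: nothing is loaded until this runs
    # for the first time in a given worker process.
    return get_resource()[key]
```

The trade-off is that the first request handled by each worker process pays the loading cost, and each worker still holds its own copy once loaded.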

new name
  • I tried to convert my app to flex because it was already hitting the soft memory limit at the max instance class in standard. How do I figure out when it starts consuming a lot of memory? Where would I look? – Jason Westover Aug 09 '21 at 13:06
  • See https://stackoverflow.com/questions/55228492/spacy-on-gae-standard-second-python-exceeds-memory-of-largest-instance for more details of my situation. It was actually solving a GAE standard issue so it might help you there and also with GAE flex. – new name Aug 09 '21 at 19:24
  • Thank you so much. This solved my problem which meant I didn't even have to convert to flex. – Jason Westover Aug 10 '21 at 19:52