
I have a deployment of FastAPI 0.81.0 + uvicorn 0.18.3 on Python 3.10.1, running on AWS Fargate behind an Application Load Balancer. The server runs indefinitely (as expected) in my local Docker setup; on AWS, however, the application always shuts down after 1-2 minutes.

This is the uvicorn invocation in Docker:

CMD ["uvicorn", "--host", "0.0.0.0", "--port", "8000", "--log-level", "trace", "app.main:app"]

My FastAPI application looks like this:

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

origins = [
    "*"
]

app.add_middleware(
    CORSMiddleware,
    allow_origins=origins,
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.get("/")
async def root():
    return {"Hello": "World"}

It might have something to do with the Load Balancer, since the RAM usage of my Fargate service is not particularly high:

[Screenshot: memory utilization graph of the Fargate service]

The usual suspect seems to be health checks going over TCP instead of HTTP. However, AFAIK the health checks are already HTTP by default, both in the Fargate task definition and in the EC2 target group.
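
To confirm this, the health check protocol and path configured on the target group can be inspected with the AWS CLI (a sketch; the target group name is a placeholder):

# Show only the health-check-related fields of the target group
aws elbv2 describe-target-groups \
    --names my-target-group \
    --query 'TargetGroups[].[HealthCheckProtocol,HealthCheckPath,HealthCheckPort,HealthCheckIntervalSeconds]'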

Here are the logs of my Fargate Task:

2022-09-22 18:43:46 INFO: Finished server process [1]
2022-09-22 18:43:46 INFO: Waiting for application shutdown.
2022-09-22 18:43:46 TRACE: ASGI [1] Receive {'type': 'lifespan.shutdown'}
2022-09-22 18:43:46 TRACE: ASGI [1] Send {'type': 'lifespan.shutdown.complete'}
2022-09-22 18:43:46 TRACE: ASGI [1] Completed
2022-09-22 18:43:46 INFO: Application shutdown complete.
2022-09-22 18:43:45 INFO: Shutting down
2022-09-22 18:43:39 TRACE: 172.31.21.3:16662 - ASGI [7] Send {'type': 'http.response.body', 'body': '<17 bytes>'}
2022-09-22 18:43:39 TRACE: 172.31.21.3:16662 - ASGI [7] Completed
2022-09-22 18:43:39 TRACE: 172.31.21.3:16662 - HTTP connection lost
2022-09-22 18:43:39 INFO: 172.31.21.3:16662 - "GET / HTTP/1.1" 200 OK
2022-09-22 18:43:39 TRACE: 172.31.21.3:16662 - HTTP connection made
2022-09-22 18:43:39 TRACE: 172.31.21.3:16662 - ASGI [7] Started scope={'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.31.30.157', 8000), 'client': ('172.31.21.3', 16662), 'scheme': 'http', 'method': 'GET', 'root_path': '', 'path': '/', 'raw_path': b'/', 'query_string': b'', 'headers': '<...>'}
2022-09-22 18:43:39 TRACE: 172.31.21.3:16662 - ASGI [7] Send {'type': 'http.response.start', 'status': 200, 'headers': '<...>'}
2022-09-22 18:43:39 INFO: 172.31.47.71:3856 - "GET / HTTP/1.1" 200 OK
2022-09-22 18:43:39 TRACE: 172.31.47.71:3856 - ASGI [6] Send {'type': 'http.response.body', 'body': '<17 bytes>'}
2022-09-22 18:43:39 TRACE: 172.31.47.71:3856 - ASGI [6] Completed
2022-09-22 18:43:39 TRACE: 172.31.47.71:3856 - HTTP connection lost
2022-09-22 18:43:39 TRACE: 172.31.47.71:3856 - HTTP connection made
2022-09-22 18:43:39 TRACE: 172.31.47.71:3856 - ASGI [6] Started scope={'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.31.30.157', 8000), 'client': ('172.31.47.71', 3856), 'scheme': 'http', 'method': 'GET', 'root_path': '', 'path': '/', 'raw_path': b'/', 'query_string': b'', 'headers': '<...>'}
2022-09-22 18:43:39 TRACE: 172.31.47.71:3856 - ASGI [6] Send {'type': 'http.response.start', 'status': 200, 'headers': '<...>'}
2022-09-22 18:43:09 TRACE: 172.31.21.3:39448 - ASGI [5] Completed
2022-09-22 18:43:09 TRACE: 172.31.21.3:39448 - HTTP connection lost
2022-09-22 18:43:09 TRACE: 172.31.21.3:39448 - ASGI [5] Send {'type': 'http.response.body', 'body': '<17 bytes>'}
2022-09-22 18:43:09 TRACE: 172.31.21.3:39448 - HTTP connection made
2022-09-22 18:43:09 TRACE: 172.31.21.3:39448 - ASGI [5] Started scope={'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.31.30.157', 8000), 'client': ('172.31.21.3', 39448), 'scheme': 'http', 'method': 'GET', 'root_path': '', 'path': '/', 'raw_path': b'/', 'query_string': b'', 'headers': '<...>'}
2022-09-22 18:43:09 TRACE: 172.31.21.3:39448 - ASGI [5] Send {'type': 'http.response.start', 'status': 200, 'headers': '<...>'}
2022-09-22 18:43:09 INFO: 172.31.21.3:39448 - "GET / HTTP/1.1" 200 OK
2022-09-22 18:43:09 TRACE: 172.31.47.71:50778 - ASGI [4] Completed
2022-09-22 18:43:09 TRACE: 172.31.47.71:50778 - HTTP connection lost
2022-09-22 18:43:09 TRACE: 172.31.47.71:50778 - ASGI [4] Send {'type': 'http.response.start', 'status': 200, 'headers': '<...>'}
2022-09-22 18:43:09 INFO: 172.31.47.71:50778 - "GET / HTTP/1.1" 200 OK
2022-09-22 18:43:09 TRACE: 172.31.47.71:50778 - ASGI [4] Send {'type': 'http.response.body', 'body': '<17 bytes>'}
2022-09-22 18:43:09 TRACE: 172.31.47.71:50778 - HTTP connection made
2022-09-22 18:43:09 TRACE: 172.31.47.71:50778 - ASGI [4] Started scope={'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.31.30.157', 8000), 'client': ('172.31.47.71', 50778), 'scheme': 'http', 'method': 'GET', 'root_path': '', 'path': '/', 'raw_path': b'/', 'query_string': b'', 'headers': '<...>'}
2022-09-22 18:42:39 INFO: 172.31.47.71:55984 - "GET / HTTP/1.1" 200 OK
2022-09-22 18:42:39 TRACE: 172.31.47.71:55984 - ASGI [3] Send {'type': 'http.response.start', 'status': 200, 'headers': '<...>'}
2022-09-22 18:42:39 TRACE: 172.31.47.71:55984 - ASGI [3] Send {'type': 'http.response.body', 'body': '<17 bytes>'}
2022-09-22 18:42:39 TRACE: 172.31.47.71:55984 - ASGI [3] Completed
2022-09-22 18:42:39 TRACE: 172.31.47.71:55984 - HTTP connection lost
2022-09-22 18:42:39 TRACE: 172.31.21.3:59240 - HTTP connection lost
2022-09-22 18:42:39 TRACE: 172.31.47.71:55984 - ASGI [3] Started scope={'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.31.30.157', 8000), 'client': ('172.31.47.71', 55984), 'scheme': 'http', 'method': 'GET', 'root_path': '', 'path': '/', 'raw_path': b'/', 'query_string': b'', 'headers': '<...>'}
2022-09-22 18:42:39 TRACE: 172.31.21.3:59240 - ASGI [2] Completed
2022-09-22 18:42:39 TRACE: 172.31.47.71:55984 - HTTP connection made
2022-09-22 18:42:39 INFO: 172.31.21.3:59240 - "GET / HTTP/1.1" 200 OK
2022-09-22 18:42:39 TRACE: 172.31.21.3:59240 - ASGI [2] Send {'type': 'http.response.body', 'body': '<17 bytes>'}
2022-09-22 18:42:39 TRACE: 172.31.21.3:59240 - ASGI [2] Started scope={'type': 'http', 'asgi': {'version': '3.0', 'spec_version': '2.3'}, 'http_version': '1.1', 'server': ('172.31.30.157', 8000), 'client': ('172.31.21.3', 59240), 'scheme': 'http', 'method': 'GET', 'root_path': '', 'path': '/', 'raw_path': b'/', 'query_string': b'', 'headers': '<...>'}
2022-09-22 18:42:39 TRACE: 172.31.21.3:59240 - ASGI [2] Send {'type': 'http.response.start', 'status': 200, 'headers': '<...>'}
2022-09-22 18:42:39 TRACE: 172.31.21.3:59240 - HTTP connection made
2022-09-22 18:42:30 INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
2022-09-22 18:42:30 INFO: Waiting for application startup.
2022-09-22 18:42:30 TRACE: ASGI [1] Started scope={'type': 'lifespan', 'asgi': {'version': '3.0', 'spec_version': '2.0'}}
2022-09-22 18:42:30 TRACE: ASGI [1] Receive {'type': 'lifespan.startup'}
2022-09-22 18:42:30 TRACE: ASGI [1] Send {'type': 'lifespan.startup.complete'}
2022-09-22 18:42:30 INFO: Application startup complete.
2022-09-22 18:42:30 INFO: Started server process [1]

Any suggestions on how to solve this problem? Thanks!

JoN

3 Answers


We've had this issue before and needed to add more RAM to the task; more CPU doesn't hurt either. Try giving your task significantly more memory and see if the problem persists. If that fixes it, scale the memory back down to the smallest size that still works.
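
For reference, the task size is configured on the task definition itself; a rough sketch of registering a revision with more memory via the AWS CLI (the family name, sizes, and container definitions file are placeholders):

# Register a new task definition revision with 1 vCPU and 8 GB of memory
aws ecs register-task-definition \
    --family my-fastapi-task \
    --requires-compatibilities FARGATE \
    --network-mode awsvpc \
    --cpu 1024 \
    --memory 8192 \
    --container-definitions file://containers.json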

EDIT: I asked a developer on my team and he said there is an idle timeout you can configure: if the application is idle for an extended period of time, it will shut down. You can set this with the --timeout option:

gunicorn \
    --log-config 'logging.conf' \
    --timeout 6000
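
Note that --timeout is a gunicorn option; for the FastAPI app from the question, gunicorn would need to run uvicorn workers, roughly like this (a sketch; the bind address and timeout value are illustrative):

# Run the ASGI app under gunicorn with uvicorn worker processes
gunicorn app.main:app \
    --worker-class uvicorn.workers.UvicornWorker \
    --bind 0.0.0.0:8000 \
    --timeout 6000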
Derrops
  • Thanks for your suggestion. I checked that once more with 8GB RAM and more CPU: Max Utilization was at 6% and 3%. The behaviour didn't change. – JoN Sep 30 '22 at 00:33
  • Shouldn't the `--timeout` flag serve the same purpose as `--timeout-keep-alive`? If so, it unfortunately does not help either... – JoN Oct 01 '22 at 23:47

Have you tried tuning the uvicorn keep-alive value, as suggested in the other answer you linked?

CMD ["uvicorn", "--host", "0.0.0.0", "--port", "8000", "--log-level", "trace",  "--timeout-keep-alive", "65", "app.main:app"]
Ben Whaley
  • Thanks for your answer. Yes, I tried that; even with `--timeout-keep-alive 1000` nothing changed. – JoN Oct 01 '22 at 23:46

Turns out it was a matter of pure stupidity.

The simple issue was failing health checks in the Task Definition. While the health checks in the Target Group worked perfectly, since the ports were mapped correctly, the health check command in the Task Definition was missing port 8000. The corrected command is:

CMD-SHELL curl -f http://0.0.0.0:8000 || exit 1
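
To double-check what a running task actually uses, the container health check configured in the task definition can be inspected with the AWS CLI (a sketch; the task definition name is a placeholder):

# Print the healthCheck block of each container definition
aws ecs describe-task-definition \
    --task-definition my-fastapi-task \
    --query 'taskDefinition.containerDefinitions[].healthCheck'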

I had always thought it was the other way around, i.e. that the uvicorn shutdown caused the health checks to fail. Thanks for all your suggestions and patience!

JoN
  • Where are you running that command? `CMD-SHELL curl -f http://0.0.0.0:8000 || exit 1` doesn't seem to match the commands available in a Dockerfile. – Cesar Flores Nov 16 '22 at 03:42