
I have a production version of my config, but some of my requests to the server can take more than 1, 2, 10, even 15 seconds, seemingly at random. About 1 in 20 requests behaves like this. The server itself is decent (8 GB RAM, 4 CPUs), so the problem is presumably in my code or configuration.

How do I set it up for production?

My architecture: server NGINX -> docker NGINX -> uvicorn -> FastAPI app

server NGINX config:

server {
    listen 80;
    server_name blabla.com;

    location / {
        proxy_pass http://0.0.0.0:8040$request_uri;
        proxy_set_header Host $host;
    }
}

Docker NGINX config:

user www-data;
pid /run/nginx.pid;

events {
    # multi_accept on;
}

http {
    # Basic settings
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 20480;
    client_max_body_size 30m;
    # access_log off;

    #
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    # GZIP
    gzip on;

    server {
        listen 80;
        server_name ${EXTERNAL_HOST};

        access_log /data/logs/nginx.log;
        error_log /data/logs/nginx.err warn;

        root /;

        location /api/ {
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header Host $http_host;
            proxy_pass http://api:5000/;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "Upgrade";
        }
    }
}

Dockerfile:

FROM python:3.10
WORKDIR .

ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1

COPY ./requirements.txt .
RUN pip install -r requirements.txt

COPY . .
ARG PROTOCOL
ARG HOST
ENV SERVER "${PROTOCOL}://${HOST}/api/"
ENV CLIENT "${PROTOCOL}://${HOST}/"

docker-compose config:

api:
  image: blabla/api
  build:
    dockerfile: ../docker/api/Dockerfile
    context: ../api
    args:
      - PROTOCOL=${PROTOCOL}
      - HOST=${EXTERNAL_HOST}
  restart: unless-stopped
  env_file: .env
  volumes:
    - ../data/load:/data/load
    - type: bind
      source: ../data/logs/api.log
      target: /app.log
  deploy:
    mode: replicated
    replicas: 1
    resources:
      limits:
        cpus: "0.75"
        memory: 1500M
      reservations:
        cpus: "0.25"
        memory: 500M
  command: uvicorn app:app --host 0.0.0.0 --port 5000 --proxy-headers

app.py

from fastapi import FastAPI, Request
from pydantic import BaseModel

app = FastAPI(title='Web app')

from fastapi.middleware.cors import CORSMiddleware
app.add_middleware(
    CORSMiddleware,
    allow_origins=['*'],
    allow_credentials=True,
    allow_methods=['*'],
    allow_headers=['*'],
)

class Input(BaseModel):
    # placeholder: the real request model is not shown in the post
    pass

@app.post('/')
async def index(data: Input, request: Request):
    return {'bla': 'bla'}
Alex Poloz
  • I don't see any workers configuration here; how many app instances are you spawning? Also, performance problems can come from many things, even your code, and without it this is going to be complicated. – Bastien B Aug 04 '22 at 08:20

1 Answer


The slow requests can come from one, or a combination, of the following things that cause long-running tasks and blocking I/O:

- a sync route
- a blocking I/O function or call inside your route handler (see the sketch after this list)
- the wrong worker class for your type of workload
- a bad Docker configuration
- not enough workers
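
To illustrate the blocking-I/O point: a minimal sketch, where slow_query is a hypothetical stand-in for whatever blocking call (database driver, requests, heavy CPU work) your handler makes. A blocking call inside an async def route stalls the whole event loop, so unrelated requests queue behind it; a plain def route runs in FastAPI's threadpool, and run_in_executor achieves the same from async code:

import asyncio
import time

from fastapi import FastAPI

app = FastAPI()

def slow_query() -> str:
    # hypothetical stand-in for a blocking call your handler makes
    time.sleep(5)
    return 'done'

@app.get('/bad')
async def bad():
    # blocks the event loop: every concurrent request stalls for 5 seconds
    return {'result': slow_query()}

@app.get('/better-sync')
def better_sync():
    # plain `def` route: FastAPI runs it in a threadpool, the loop stays free
    return {'result': slow_query()}

@app.get('/better-async')
async def better_async():
    # from async code, push the blocking call to the default executor
    loop = asyncio.get_running_loop()
    result = await loop.run_in_executor(None, slow_query)
    return {'result': result}

If roughly 1 in 20 of your requests hits a blocking path like /bad, that alone would produce exactly the random multi-second latencies you describe.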

The first thing I would suggest is to manage multiple app instances with Gunicorn; you will not have much configuration to do, since it comes with a worker class specifically for Uvicorn.

From the docs:

gunicorn app:app --workers 4 --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:5000

(I'm not sure how to pass --proxy-headers through Gunicorn itself, but the docs cover worker customization; one possible approach is sketched below.)
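
A sketch of that approach, assuming uvicorn's UvicornWorker and its CONFIG_KWARGS hook (verify against the uvicorn version you run; note that uvicorn's Config already defaults proxy_headers to on, so this mainly makes it explicit):

# workers.py - custom Gunicorn worker forwarding extra settings to uvicorn.
# Sketch only: CONFIG_KWARGS is merged into uvicorn.Config by UvicornWorker;
# check this against the uvicorn version you actually run.
from uvicorn.workers import UvicornWorker

class ProxyHeadersWorker(UvicornWorker):
    CONFIG_KWARGS = {
        **UvicornWorker.CONFIG_KWARGS,
        'proxy_headers': True,        # trust X-Forwarded-* headers
        'forwarded_allow_ips': '*',   # assumption: only nginx can reach this port
    }

Then start it with: gunicorn app:app --workers 4 --worker-class workers.ProxyHeadersWorker --bind 0.0.0.0:5000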

You can also spawn multiple workers with Uvicorn itself:

uvicorn app:app --host 0.0.0.0 --port 5000 --workers 4 --proxy-headers
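
If you prefer keeping the entrypoint in Python, the same can be done programmatically; a sketch using uvicorn.run (note that workers > 1 requires passing the app as an import string):

# run.py - programmatic equivalent of the CLI command above
import uvicorn

if __name__ == '__main__':
    uvicorn.run(
        'app:app',           # import string is required for workers > 1
        host='0.0.0.0',
        port=5000,
        workers=4,
        proxy_headers=True,  # trust X-Forwarded-* from nginx
    )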

Apart from spawning multiple workers, it would help to watch your container's CPU and memory usage during the slow requests; it can be part of the bottleneck. A common starting point from the Gunicorn docs is (2 × CPU cores) + 1 workers, but note that your compose file caps the container at 0.75 CPUs and 1500M of memory, which is tight for four workers on a 4-CPU machine.

I don't know what your API does, but NLP-type workloads often involve long-running tasks, and those are better managed with an asynchronous task queue or job queue like Celery (a minimal sketch follows).
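
A minimal sketch of that pattern, assuming a hypothetical Redis broker at redis://redis:6379/0 (not from your post): the route enqueues the job and returns immediately, and a separate Celery worker process does the slow work.

# tasks.py - sketch of offloading slow work to Celery
# (broker/backend URLs are assumptions, not from the original post)
from celery import Celery

celery_app = Celery(
    'tasks',
    broker='redis://redis:6379/0',
    backend='redis://redis:6379/1',
)

@celery_app.task
def heavy_job(payload: dict) -> dict:
    # long-running work (NLP, file processing, ...) runs in the worker process
    return {'status': 'done', 'size': len(payload)}

# in app.py: enqueue instead of blocking the request
# from tasks import heavy_job
#
# @app.post('/jobs')
# async def create_job(data: dict):
#     result = heavy_job.delay(data)   # returns immediately with a task id
#     return {'task_id': result.id}

Run the worker as its own process or container with: celery -A tasks worker --loglevel=info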

Bastien B