
My code needs to load machine learning models at startup. Each model is a pickle file of about 100 MB, and in my case I need to load 6 of them (600 MB or more in total). I use FastAPI for my API, served with Uvicorn and Gunicorn.

My question is: why does the first request to my Cloud Run URL always return a 504 status code after about 15 seconds, while after that the URL opens without an error? And if I leave the URL unopened for about 30-60 minutes, it returns the 504 error again. Is my Gunicorn dead or shut down? When I check my Cloud Run log, Gunicorn shows "Shutting down", so I think it died. How can I keep Gunicorn always on?

This is how I load my pickle files:

# Load all model
@app.on_event("startup")
async def load_model():
    # Pathfile
    pathfile_model = os.path.join("modules", "model/")
    pathfile_data = os.path.join("modules", "data/")

    start_time = time.time()

    # Load Model
    usedcar.price_engine_4w = {}
    top5_brand = ["honda", "toyota", "nissan", "suzuki", "daihatsu"]
    for i in top5_brand:
        with open(pathfile_model + f'{i}_all_in_one.pkl', 'rb') as file:
            usedcar.price_engine_4w[i] = pickle.load(file)
    with open(pathfile_model + 'ex_Top5_all_in_one.pkl', 'rb') as file:
        usedcar.price_engine_4w['non'] = pickle.load(file)

    # Load Dataset Match
    with open(pathfile_data + settings.DATA_LIST) as path:
        usedcar.list_match_seva = pd.read_csv(path)

    elapsed_time = time.time() - start_time

    print("======================================")
    print("INFO  : Model loaded Succesfully")
    print("MODEL :", usedcar.price_engine_4w)
    print("ELAPSED MODEL TIME : ", elapsed_time)

Here is how my main.py runs:

if __name__ == "__main__":
    # uvicorn expects the event loop name as a string ("auto", "asyncio" or "uvloop")
    uvicorn.run(app, host="0.0.0.0", port=8080, log_level="info", loop="asyncio")

This is my Dockerfile :

FROM python:3.8-slim-buster
RUN apt-get update --fix-missing
RUN DEBIAN_FRONTEND=noninteractive apt-get install -y libgl1-mesa-dev python3-pip git
RUN mkdir /usr/src/app
WORKDIR /usr/src/app
COPY ./requirements.txt /usr/src/app/requirements.txt
RUN pip3 install -U setuptools
RUN pip3 install --upgrade pip
RUN pip3 install -r ./requirements.txt --use-feature=2020-resolver
COPY . /usr/src/app
CMD exec gunicorn --bind :8080 --workers 2 --threads 4 main:app --worker-class uvicorn.workers.UvicornH11Worker --preload --timeout 60 --worker-tmp-dir /dev/shm

These are my requirements for uvicorn and gunicorn:

fastapi
fastapi-utils
uvicorn[standard]
gunicorn

This is my Cloud Run Log :

2021-02-15 14:31:54.346 WIT [2021-02-15 07:31:54 +0000] [1] [INFO] Handling signal: term
2021-02-15 14:31:54.385 WIT [2021-02-15 07:31:54 +0000] [11] [INFO] Shutting down
2021-02-15 14:31:54.386 WIT [2021-02-15 07:31:54 +0000] [12] [INFO] Shutting down
2021-02-15 14:31:54.486 WIT [2021-02-15 07:31:54 +0000] [11] [INFO] Waiting for application shutdown.
2021-02-15 14:31:54.486 WIT [2021-02-15 07:31:54 +0000] [11] [INFO] Application shutdown complete.
2021-02-15 14:31:54.486 WIT [2021-02-15 07:31:54 +0000] [12] [INFO] Waiting for application shutdown.
2021-02-15 14:31:54.486 WIT [2021-02-15 07:31:54 +0000] [11] [INFO] Finished server process [11]
2021-02-15 14:31:54.487 WIT [2021-02-15 07:31:54 +0000] [11] [INFO] Worker exiting (pid: 11)
2021-02-15 14:31:54.487 WIT ======================================
2021-02-15 14:31:54.487 WIT INFO : Model loaded Succesfully
2021-02-15 14:31:54.487 WIT ELAPSED MODEL TIME : 13.514873743057251
2021-02-15 14:31:54.487 WIT INFO : Master Data Updated Succesfully
2021-02-15 14:31:54.487 WIT ELAPSED DATABASE TIME : 0.5247213840484619
2021-02-15 14:31:54.487 WIT ======================================
2021-02-15 14:31:54.487 WIT [2021-02-15 07:31:54 +0000] [12] [INFO] Application shutdown complete.
2021-02-15 14:31:54.487 WIT [2021-02-15 07:31:54 +0000] [12] [INFO] Finished server process [12]
2021-02-15 14:31:54.487 WIT [2021-02-15 07:31:54 +0000] [12] [INFO] Worker exiting (pid: 12)

As the Cloud Run log shows, my Gunicorn was shut down suddenly.

And this is my error:

[Screenshot from 2020-12-15 16-34-03]

After I looked around, I've tried a few things like:

  1. --worker-tmp-dir /dev/shm (I added this because I thought my Docker container might be blocking, but I still get a 504 status). Source 1 Source 2
  2. --preload (I added this because I thought it would save some RAM and let Gunicorn start faster after a shutdown, so the page would load faster the next time, but it has no effect). Source
  3. I set workers=2, threads=4, graceful_timeout=100, but my Cloud Run service still shuts down.

Thank you

MADFROST

1 Answer

Cloud Run (fully managed) scales the number of instances automatically, and by default it scales all the way down to zero when there is no traffic. The next request then triggers the start of a new instance. At that point your Docker image is loaded and initialized. This takes time and is called the warmup process (a cold start). In your case it takes longer than your configured request timeouts.
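To see how much of the warmup is spent on each model, the load can be timed per file. This is only a minimal sketch using the standard library; the helper names `timed_pickle_load` and `load_all` are made up for illustration and are not part of the question's code.

```python
import pickle
import time


def timed_pickle_load(path):
    """Load one pickle file and return (object, seconds spent loading)."""
    start = time.perf_counter()
    with open(path, "rb") as file:
        obj = pickle.load(file)
    return obj, time.perf_counter() - start


def load_all(paths):
    """Load every file in `paths` (name -> file path).

    Returns two dicts keyed by name: the loaded objects and the
    number of seconds each load took.
    """
    models, timings = {}, {}
    for name, path in paths.items():
        models[name], timings[name] = timed_pickle_load(path)
    return models, timings
```

Printing `timings` in the startup handler would show whether a single model dominates the roughly 13.5 seconds reported in the log, or whether the time is spread evenly over all six files.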

You have different possibilities to improve the process:

  • Make your Docker image smaller (e.g. use an *-alpine base image)
  • Clean the apt and pip caches
  • Remove gunicorn and let Cloud Run do the scaling
  • Use one Cloud Run service per model (orchestration of microservices)
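For the "remove gunicorn" option, a rough sketch of starting uvicorn directly could look like the following. It assumes the app is importable as `main:app`, as in the question's gunicorn command, and it reads the `PORT` environment variable that Cloud Run sets for the container. The helper names `cloud_run_port` and `serve` are hypothetical.

```python
import os


def cloud_run_port(environ=os.environ, default=8080):
    """Return the port the container should listen on.

    Cloud Run passes the port in the PORT environment variable;
    the default of 8080 matches the port used in the question.
    """
    return int(environ.get("PORT", default))


def serve():
    """Start a single uvicorn process without gunicorn in front of it."""
    # Imported lazily so the helper above works without uvicorn installed.
    import uvicorn

    # "main:app" is an assumption taken from the question's gunicorn command.
    uvicorn.run("main:app", host="0.0.0.0", port=cloud_run_port(), log_level="info")
```

With this, the container runs one server process, and the number of instances and the concurrency per instance are left to the Cloud Run service settings. Whether this actually shortens the warmup would have to be measured, since the models still need to be loaded once per instance.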

Not a good idea:

Darius
  • So, the gunicorn configuration has no effect on my startup load? – MADFROST Feb 16 '21 at 08:01
  • `gunicorn` is lightweight. In your position I would test different web servers and scaling plans (web workers, number of instances, ...). But don't forget to track the memory used, to avoid OOM (out of memory) errors. – Darius Feb 16 '21 at 08:27
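One possible way to track memory while testing different configurations is Python's built-in `tracemalloc` module. The sketch below is an assumption about how this could be done, not something from the answer; `measure_peak_memory` is a hypothetical helper, and `tracemalloc` only sees allocations made through Python's allocator, so memory used by native extensions may be undercounted.

```python
import tracemalloc


def measure_peak_memory(loader):
    """Run `loader()` and return (result, peak traced bytes while it ran)."""
    tracemalloc.start()
    try:
        result = loader()
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak
```

Wrapping the model loading in such a function and logging the peak once per worker gives a rough lower bound to compare against the memory limit of the Cloud Run instance. Tracing slows allocation down, so it is better suited to a test run than to production.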