Stopping Python container is slow - SIGTERM not passed to python process?

Question

I made a simple python webserver based on this example, which runs inside Docker

FROM python:3-alpine
WORKDIR /app

COPY entrypoint.sh .
RUN chmod +x entrypoint.sh

COPY src src
CMD ["python", "/app/src/api.py"]
ENTRYPOINT ["/app/entrypoint.sh"]

Entrypoint:

#!/bin/sh
echo starting entrypoint
set -x
exec "$@"

Stopping the container took very long, although the exec statement together with the JSON array syntax should pass signals on to the Python process. I assumed that SIGTERM was not being passed to the container, so I added the following to my api.py script to detect SIGTERM:

def terminate(signal,frame):
  print("TERMINATING")

if __name__ == "__main__":
    signal.signal(signal.SIGTERM, terminate)

    webServer = HTTPServer((hostName, serverPort), MyServer)
    print("Server started http://%s:%s" % (hostName, serverPort))
    webServer.serve_forever()

Running the script without Docker (python3 api/src/api.py), I tried

kill -15 $(ps -guaxf | grep python | grep -v grep | awk '{print $2}')

to send SIGTERM (15 is its signal number). The script prints TERMINATING, so my signal handler works. Now I run the Docker container using docker-compose and press CTRL + C. Docker says gracefully stopping... (press Ctrl+C again to force) but doesn't print the terminating message from my handler.

I also tried to run docker-compose in detached mode, then run docker-compose kill -s SIGTERM api and view the logs. Still no message from the event handler.

Try something like `import os` then `print('My process id is: {}'.format(os.getpid()))`. Docker only waits for process id 1 to cleanly shut down (not sure about compose, haven't used it extensively). If your log doesn't contain "My process id is: 1", it's likely that the process isn't being given time to respond to a SIGTERM and is not cleanly shutting down. — joshua.software.dev, Jul 11 '20 at 19:57
Also be aware of the different behavior of `ENTRYPOINT python main.py` and `ENTRYPOINT ["python", "main.py"]`. The first (shell form) runs a shell with PID 1, so **your Python script will never see any signals**. The latter (exec form) runs Python with PID 1 (so receiving all signals). — Tom Pohl, Jul 22 '22 at 05:43

score 16 · Answer 1 · answered Jul 13 '20 at 08:00

Since the script runs as PID 1 as desired and setting init: true in docker-compose.yml doesn't seem to change anything, I took a deeper dive into this topic. This led me to figure out several mistakes I had made:

Logging

Printing a message when SIGTERM is caught was meant as a simple test case to see whether this basically works before I care about stopping the server. But I noticed that no message appears, for two reasons:

Output buffering

When running a long-running process in Python such as the HTTP server (or any while True loop, for example), no output is displayed when starting the container attached with docker-compose up (no -d flag), because Python buffers stdout when it is not connected to a terminal. To receive live logs, we need to start Python with the -u flag or set the environment variable PYTHONUNBUFFERED=TRUE.
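
If changing the interpreter flags or the environment is inconvenient, flushing from inside the script should have a similar effect for the lines it prints. The following is a minimal sketch and not part of the original setup; the helper name log is made up for illustration:

```python
import sys


def log(message, stream=None):
    """Print one line and flush it immediately.

    Hypothetical helper: flushing on every call hands the line to the
    container's log output right away, even when stdout is not a terminal.
    """
    if stream is None:
        stream = sys.stdout
    print(message, file=stream, flush=True)
```

Calling log("Server started") instead of print("Server started") would be the only change needed in the script.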

No log piping after stop

But output buffering was not the main problem (I only mention it because I wondered why there was no log output from the container at all). When the container is cancelled, docker-compose stops piping logs to the console. This means that it cannot display anything that happens AFTER CTRL + C is pressed.

To fetch those logs, we need to wait until docker-compose has stopped the container and then run docker-compose logs. It prints everything, including the output generated after CTRL + C was pressed. Using docker-compose logs I found out that SIGTERM is passed to the container and my signal handler works.

Stopping the webserver

With that knowledge I tried to stop the webserver instance. At first this didn't work, because it's not enough to just call webServer.server_close(). It's required to exit explicitly after any cleanup work is done, like this:

def terminate(signum, frame):
  # Close the listening socket, then exit explicitly
  print("Start Terminating: %s" % datetime.now())
  webServer.server_close()
  sys.exit(0)

When sys.exit() is not called, the process keeps running which results in ~10s waiting time before Docker kills it.
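
A variation on this handler, given as a sketch only and not as the code from the original script: since sys.exit() simply raises SystemExit in the main thread, the cleanup can live in a finally block next to serve_forever() instead of inside the handler. The function names serve and main are made up for illustration, and BaseHTTPRequestHandler is only a placeholder for a real handler class:

```python
import signal
import sys
from http.server import BaseHTTPRequestHandler, HTTPServer


def terminate(signum, frame):
    # sys.exit() raises SystemExit in the main thread, which unwinds
    # out of serve_forever() and lets the finally block below run.
    sys.exit(0)


def serve(server):
    """Run the server until an exception such as SystemExit stops it."""
    try:
        server.serve_forever()
    finally:
        # Cleanup happens here, whatever caused the loop to stop.
        server.server_close()


def main():
    # Assumed wiring: install the handler, then serve on the same
    # address and port as the script in this answer.
    signal.signal(signal.SIGTERM, terminate)
    serve(HTTPServer(("0.0.0.0", 80), BaseHTTPRequestHandler))
```

With this layout the handler stays tiny, and the same cleanup also runs if the loop ends for another reason, for example an unhandled exception.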

Full working example

Here is a demo script that implements everything I've learned:

from http.server import BaseHTTPRequestHandler, HTTPServer
import signal
from datetime import datetime
import sys, os

hostName = "0.0.0.0"
serverPort = 80

class MyServer(BaseHTTPRequestHandler):
  def do_GET(self):
    self.send_response(200)
    self.send_header("Content-Type", "text/html")
    self.end_headers()
    self.wfile.write(bytes("Hello from Python Webserver", "utf-8"))

webServer = None

def terminate(signum, frame):
  # SIGTERM handler: close the listening socket, then exit explicitly
  print("Start Terminating: %s" % datetime.now())
  webServer.server_close()
  sys.exit(0)

if __name__ == "__main__":
    signal.signal(signal.SIGTERM, terminate)

    webServer = HTTPServer((hostName, serverPort), MyServer)
    print("Server started http://%s:%s with pid %i" % (hostName, serverPort, os.getpid()))
    webServer.serve_forever()

Running in a container, it can be stopped very quickly, without waiting for Docker to kill the process:

$ docker-compose up --build -d
$ time docker-compose down
Stopping python-test_app_1 ... done
Removing python-test_app_1 ... done
Removing network python-test_default

real    0m1,063s
user    0m0,424s
sys     0m0,077s

You haven't gotten much love for this answer, but thank you. This kind of thing needs to be documented somewhere. It's going to save me a lot of time in the long-run. — Vorticity, Sep 14 '21 at 03:52
Also, you should accept your own answer so that it is easier to find. — Vorticity, Sep 14 '21 at 04:21

score 10 · Answer 2 · answered Jul 11 '20 at 20:29

By default, Docker runs your application in the foreground, so it runs as PID 1. The process with PID 1 has a special meaning and specific protections in Linux.

This is highlighted in the docker run documentation:

Note

A process running as PID 1 inside a container is treated specially by Linux: it ignores any signal with the default action. As a result, the process will not terminate on SIGINT or SIGTERM unless it is coded to do so.

Source: https://docs.docker.com/engine/reference/run/#foreground
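
To check from inside a Python script whether the situation in this note applies, something like the following could be used. It is a minimal sketch; both function names are hypothetical, and it only looks at handlers installed through Python's signal module:

```python
import os
import signal


def sigterm_has_default_action():
    """Return True if SIGTERM still has its default action in this process."""
    return signal.getsignal(signal.SIGTERM) == signal.SIG_DFL


def check_pid1_signal_handling():
    """Return a warning string if the case from the note applies, else None."""
    if os.getpid() == 1 and sigterm_has_default_action():
        return "Running as PID 1 without a SIGTERM handler: SIGTERM will be ignored."
    return None
```

Calling check_pid1_signal_handling() at startup and printing the result would make a missing handler visible in the container logs.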

To fix this, when running a single container, you can pass the --init flag to docker run:

You can use the --init flag to indicate that an init process should be used as the PID 1 in the container. Specifying an init process ensures the usual responsibilities of an init system, such as reaping zombie processes, are performed inside the created container.

Source: https://docs.docker.com/engine/reference/run/#specify-an-init-process

The same configuration is possible in docker-compose, simply by specifying init: true on the container.

An example would be:

version: "3.8"
services:
  web:
    image: alpine:latest
    init: true

Source: https://docs.docker.com/compose/compose-file/#init