I have a Python Docker container running in Kubernetes. The code's general workflow is that it receives messages and then kicks off a series of long-running SQL Server statements via pyodbc.

My goal is to increase the Kubernetes termination timeout and intercept the shutdown signal so that the SQL statements can finish before the pod shuts down. First, I set terminationGracePeriodSeconds: 1800 in my k8s.yaml file to make sure the pod has time to finish running the queries.

I've set up the signal handler in Python:

    def graceful_shutdown(_signo, _stack_frame):
        _logger.info(
            f"Received {_signo}, attempting to let the queries finish gracefully before shutting down."
        )
        # shutdown_event is a threading.Event object
        shutdown_event.set()

And then in the main execution loop:

    signal.signal(signal.SIGTERM, graceful_shutdown)
    signal.signal(signal.SIGINT, graceful_shutdown)
    signal.signal(signal.SIGHUP, graceful_shutdown)
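
For readers who want to reproduce the setup, the two fragments above can be combined into a self-contained sketch along these lines. The `install_handlers` and `main_loop` helpers, the `handle_next_message` callback, and the `poll_seconds` parameter are hypothetical names introduced for illustration; the real loop receives messages and runs the SQL statements.

```python
import logging
import signal
import threading

_logger = logging.getLogger(__name__)

# Set by the signal handler; checked by the main loop between messages.
shutdown_event = threading.Event()


def graceful_shutdown(_signo, _stack_frame):
    _logger.info(
        f"Received {_signo}, attempting to let the queries finish gracefully before shutting down."
    )
    shutdown_event.set()


def install_handlers():
    # signal.signal() must be called from the main thread.
    for sig in (signal.SIGTERM, signal.SIGINT, signal.SIGHUP):
        signal.signal(sig, graceful_shutdown)


def main_loop(handle_next_message, poll_seconds=1.0):
    """Process messages until shutdown_event is set.

    handle_next_message is a hypothetical callback standing in for
    "receive a message and run its SQL statements".
    Returns the number of messages handled.
    """
    processed = 0
    while not shutdown_event.is_set():
        handle_next_message()
        processed += 1
        # Sleep between polls, but wake immediately if shutdown is requested.
        shutdown_event.wait(poll_seconds)
    return processed
```

With this shape, a signal that arrives between messages only sets the event, and the loop exits once the current call to `handle_next_message` returns.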

(Tangentially, a good discussion as to why I get SIGHUP and not SIGTERM here: 1)

This works as expected. I get the SIGHUP signal, the threading.Event shutdown_event object tells my main execution loop to terminate, everything works...

...Except when one of those SQL queries is actually running. In that case, my graceful_shutdown() handler doesn't get called. Instead, the pyodbc connection raises the following error:

pyodbc.OperationalError: ('08S01', '[08S01] [Microsoft][ODBC Driver 17 for SQL Server]TCP Provider: Error code 0x2714 (10004) (SQLExecDirectW)')

This error is a communication link failure caused by Kubernetes telling the pod to terminate.
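
One thing that may be worth testing: 0x2714 is 10004 in decimal, which corresponds to the Winsock-style code for an interrupted function call (WSAEINTR). That suggests the signal may be interrupting the driver's blocking network call in the thread that runs the query. The sketch below is an untested assumption, not a confirmed fix: it runs the blocking call in a worker thread that has the shutdown signals blocked with `signal.pthread_sigmask` (Unix only), so the signal would be delivered to the main thread instead. The `run_blocking` helper and `HANDLED_SIGNALS` set are hypothetical names.

```python
import signal
from concurrent.futures import ThreadPoolExecutor

# Signals the main thread handles; workers should never receive them.
HANDLED_SIGNALS = {signal.SIGTERM, signal.SIGINT, signal.SIGHUP}


def _block_signals_in_worker():
    # Runs once in each worker thread when it starts. Blocking the signals
    # here means the OS will not pick this thread to deliver them to, so a
    # blocking call made in this thread is not interrupted by them.
    signal.pthread_sigmask(signal.SIG_BLOCK, HANDLED_SIGNALS)


# A single worker keeps the statements sequential, as in the original flow.
executor = ThreadPoolExecutor(max_workers=1, initializer=_block_signals_in_worker)


def run_blocking(fn, *args, **kwargs):
    """Run fn in the signal-blocked worker and wait for its result.

    The main thread waits here; Python signal handlers still run in the
    main thread while it waits, so shutdown_event-style flags can be set.
    """
    future = executor.submit(fn, *args, **kwargs)
    return future.result()
```

A call such as `run_blocking(cursor.execute, sql)` would then keep the query off the main thread (`cursor` and `sql` being whatever the application already uses). Whether this avoids the 08S01 error depends on the cause actually being an interrupted system call rather than something else closing the connection.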

My question is, can I prevent this error from being raised so that pyodbc can finish running the query?

Max
  • Does this need to be a deployment? Potentially you can define this as a statefulset – Clifford Cheefoon Dec 16 '22 at 19:53
  • Clifford - that's an interesting idea but it would be an uphill fight against the infrastructure team, which currently only supports deployments. – Max Dec 16 '22 at 20:55
  • Sounds like other parts of the pod may be reacting to `SIGTERM` before your main process and closing network connections. Have you considered a `STOPSIGNAL` handler in the image, or `preStop` hooks? Ref: [Termination of pods](https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination) – AlwaysLearning Dec 17 '22 at 00:12
