No documentation mentions this behaviour, and I find it very peculiar that k8s won't restart a failed container in a pod before all containers have started. I'm running a sidecar alongside the main container. The main container needs to restart itself once at pod startup; after that, the sidecar sends a few requests to it (roughly sketched after the questions below) and then continues to serve traffic.
However, this all gets stuck because the first container is never restarted, i.e. the startup/liveness/readiness probes never kick in. Thus my questions are:
- Why does this happen?
- Where is it documented?
- Can I circumvent this behaviour (i.e. make k8s restart my main container without decoupling the two containers into two distinct pods)?
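To make the setup concrete, here's roughly what the sidecar does at startup (the port, path, and final command are illustrative placeholders, not my real config):

#!/bin/bash
# Wait until the main container answers (it has to restart itself first).
until curl -sf http://127.0.0.1:8080/health; do
  sleep 1
done
# Send a few bootstrap requests to the main container...
curl -sf -X POST http://127.0.0.1:8080/bootstrap
# ...then keep serving traffic.
exec serve-traffic   # placeholder for the sidecar's real entrypoint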
Here's a small deployment yaml to illustrate the issue:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-test
  labels:
    app: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      restartPolicy: Always
      containers:
        - name: nginx
          image: nginx:1.14.2
          livenessProbe:
            tcpSocket:
              port: 80
          startupProbe:
            tcpSocket:
              port: 80
          command:
            - bash
            - -c
            - echo exit 1; exit 1
        - name: nginx2
          image: nginx:1.14.2
          lifecycle:
            postStart:
              exec:
                command:
                  - bash
                  - -c
                  - while true; do sleep 1; echo .; done
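To reproduce, apply the manifest and inspect the pod (the filename is just whatever you save it as):

$ kubectl apply -f nginx-test.yaml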
I expect the restart counters to increase, but they stay at zero:
$ k describe pod -l app=nginx | grep Restart
Restart Count: 0
Restart Count: 0
What makes this annoying is that k8s won't publish container stdout logs until the whole pod has started:
$ k logs --all-containers -l app=nginx
Error from server (BadRequest): container "nginx" in pod "nginx-test-cd5c64644-b48hj" is waiting to start: ContainerCreating
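The container states are still visible via the pod status, which is the only way I've found to see where things are stuck while logs are unavailable (plain kubectl, output omitted here):

$ kubectl get pod -l app=nginx -o jsonpath='{.items[*].status.containerStatuses[*].state}'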
My real-life example is a Percona (cluster) node with a ProxySQL sidecar. FWIW, all containers have "proper" liveness/readiness/startup probe checks.
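Regarding question 3: the closest candidate I can think of is the native sidecar pattern (an entry in initContainers with restartPolicy: Always, available since Kubernetes 1.28 behind the SidecarContainers feature gate), which lets the sidecar start, and be restarted, independently of the main containers. I haven't verified that it actually changes the behaviour described above; a sketch against the nginx example:

spec:
  initContainers:
    - name: nginx2            # sidecar promoted to a native sidecar
      image: nginx:1.14.2
      restartPolicy: Always   # this makes it a sidecar, not a one-shot init container
  containers:
    - name: nginx             # main container, probes as above
      image: nginx:1.14.2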