I have set up a liveness probe for a long-running application in a pod. The probe failed a few times within a day, and each time the pod was restarted. There is no readiness probe.
livenessProbe:
  httpGet:
    path: /
    port: http
    scheme: HTTP
  initialDelaySeconds: 30
  timeoutSeconds: 20
  periodSeconds: 20
  successThreshold: 1
  failureThreshold: 3
Checking the application code and the Docker image revealed nothing unusual. So I disabled the liveness probe and probed the NodePort service manually every 10 seconds, using a Python script running on a PC connected to the same network. The manual probe, though more frequent and more stringent than the liveness probe, never failed. Each request took about 200 to 400 ms.
The manual probe is roughly equivalent to a liveness probe with these settings:
timeoutSeconds: 500ms
periodSeconds: 10
successThreshold: 1
failureThreshold: 1
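A minimal sketch of a manual probe of this kind is shown below. It is not the exact script that was used: the function names `probe_once` and `run_probe` are illustrative, and the URL in the usage comment is a placeholder. It treats any 2xx or 3xx response as a success, which is how Kubernetes HTTP probes judge status codes, and counts consecutive failures against a threshold.

```python
import http.client
import time
import urllib.error
import urllib.request


def probe_once(url, timeout):
    """Return True if a GET to `url` answers with a 2xx/3xx status.

    Note: `timeout` applies to each blocking socket operation, so it is
    only an approximation of a total per-request deadline.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeouts, malformed replies, HTTP 4xx/5xx.
        return False


def run_probe(url, timeout=0.5, period=10.0, failure_threshold=1, max_checks=None):
    """Probe `url` every `period` seconds.

    Returns True as soon as `failure_threshold` consecutive probes fail,
    or False if `max_checks` probes complete without reaching it.
    """
    failures = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        started = time.monotonic()
        ok = probe_once(url, timeout)
        elapsed = time.monotonic() - started
        failures = 0 if ok else failures + 1
        print(f"ok={ok} elapsed={elapsed * 1000:.0f}ms consecutive_failures={failures}")
        if failures >= failure_threshold:
            return True
        checks += 1
        # Keep the interval between probe starts close to `period`.
        time.sleep(max(0.0, period - elapsed))
    return False


# Example (placeholder address, replace with a real node IP and NodePort):
# run_probe("http://<node-ip>:<node-port>/", timeout=0.5, period=10.0)
```

One difference to keep in mind when comparing results: this script reaches the application through the NodePort service from outside the cluster, whereas the kubelet sends its HTTP liveness probe from the node directly to the pod IP and container port.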
Why did the manual probe succeed when the liveness probe failed? Does this indicate a Kubernetes networking issue?
Pod manifest:
kind: Pod
apiVersion: v1
metadata:
  name: pypi-pypiserver-74b689df7-rh9bm
  namespace: default
  labels:
    app.kubernetes.io/instance: pypi
    app.kubernetes.io/name: pypiserver
spec:
  volumes:
    - name: secrets
      secret:
        secretName: pypi-pypiserver
        defaultMode: 420
    - name: packages
      persistentVolumeClaim:
        claimName: pypi-pypiserver
    - name: default-token-cx7m7
      secret:
        secretName: default-token-cx7m7
        defaultMode: 420
  containers:
    - name: pypiserver
      image: 'registry.digitalocean.com/evergreen/pypiserver:latest'
      args:
        - run
        - '--passwords=.'
        - '--authenticate=.'
        - '--port=8080'
        - '--welcome=/dev/null'
        - '--server=wsgiref'
        - /data/packages
      ports:
        - name: http
          containerPort: 8080
          protocol: TCP
      resources:
        limits:
          cpu: 1600m
          memory: 1Gi
        requests:
          cpu: 400m
          memory: 256Mi
      volumeMounts:
        - name: packages
          mountPath: /data/packages
          mountPropagation: None
        - name: secrets
          readOnly: true
          mountPath: /config
        - name: default-token-cx7m7
          readOnly: true
          mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      livenessProbe:
        httpGet:
          path: /
          port: http
          scheme: HTTP
        initialDelaySeconds: 30
        timeoutSeconds: 10
        periodSeconds: 10
        successThreshold: 1
        failureThreshold: 3
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      imagePullPolicy: IfNotPresent
  restartPolicy: Always
  terminationGracePeriodSeconds: 30
  dnsPolicy: ClusterFirst
  nodeSelector:
    doks.digitalocean.com/node-pool: k8s-node-pool-hive-dev-2
  serviceAccountName: default
  serviceAccount: default
  nodeName: k8s-node-pool-hive-dev-2-8adyc
  securityContext:
    runAsUser: 9898
    runAsGroup: 9898
    fsGroup: 9898
  imagePullSecrets:
    - name: evergreen
  schedulerName: default-scheduler
  tolerations:
    - key: node.kubernetes.io/not-ready
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 300
    - key: node.kubernetes.io/unreachable
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 300
  priority: 0
  enableServiceLinks: true
  preemptionPolicy: PreemptLowerPriority