While using Anthos Config Management (ACM) on GCP, I repeatedly run into admission-webhook pods dying with OOMKilled status. I tried adjusting the memory request in the pod spec, and it seemed to work for a while. However, all fields of the objects in the config-management-system namespace are managed by a controller (managedFields), so I couldn't change the spec of the admission-webhook permanently: the Deployment spec keeps being reconciled back to the original.
Could anyone help me?
- Can I update some of the managed fields by force? Would that be okay in practice? I don't want to hack into Google-provided pods just to update the Pods' resources spec.
- Even when I update a managed field by force (kubectl apply -f .. --force-conflicts --server-side), the original spec is restored by the other manager. Is there any way to deal with this situation?
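In case it helps with diagnosing this, here is a minimal sketch of how the field managers on the Deployment could be listed, to see which manager keeps restoring the resources block. It assumes the Deployment is named admission-webhook (inferred from the ReplicaSet name in the description further down) and that kubectl is pointed at the affected cluster. The function names are only illustrative, and the --show-managed-fields flag needs kubectl 1.21 or newer (older clients print managedFields by default).

```shell
#!/bin/sh
# Sketch: inspect field ownership on the admission-webhook Deployment.
# Assumptions: the Deployment name below is inferred from the ReplicaSet name,
# and kubectl is already configured for the affected cluster.

NAMESPACE="config-management-system"
DEPLOYMENT="admission-webhook"

# Print one "manager<TAB>operation" line per managedFields entry.
list_field_managers() {
  kubectl get deployment "$DEPLOYMENT" -n "$NAMESPACE" \
    -o jsonpath='{range .metadata.managedFields[*]}{.manager}{"\t"}{.operation}{"\n"}{end}'
}

# Dump the full object, including the per-manager ownership map (fieldsV1),
# so the owner of spec.template.spec.containers[].resources can be found.
show_managed_fields() {
  kubectl get deployment "$DEPLOYMENT" -n "$NAMESPACE" \
    -o yaml --show-managed-fields
}
```

Running list_field_managers right after a forced apply, and again once the spec has been reverted, should show which manager took the fields back.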
Pod status:
$ kubectl get pods -n config-management-system -o wide
NAME                                 READY   STATUS             RESTARTS   AGE     IP           NODE                                     NOMINATED NODE   READINESS GATES
admission-webhook-76b67c9f8c-vt8n6   0/1     CrashLoopBackOff   6          10m     10.40.0.76   gke-seoul-a-default-pool-e2dcfaa0-8tlw   <none>           <none>
admission-webhook-76b67c9f8c-wdlzg   0/1     CrashLoopBackOff   6          10m     10.40.1.95   gke-seoul-a-default-pool-e2dcfaa0-z2l0   <none>           <none>
reconciler-manager-7f95dbf7-ss5z2    2/2     Running            0          6h57m   10.40.1.88   gke-seoul-a-default-pool-e2dcfaa0-z2l0   <none>           <none>
root-reconciler-5ddb78479c-lmkmk     3/3     Running            0          6h53m   10.40.1.89   gke-seoul-a-default-pool-e2dcfaa0-z2l0   <none>           <none>
And the logs of a dead pod:
$ kubectl logs -f admission-webhook-76b67c9f8c-vt8n6 -n config-management-system --previous
I0824 08:32:16.821366 1 setup.go:15] Build Version:
I0824 08:32:16.821436 1 deleg.go:130] setup "level"=0 "msg"="starting manager"
I0824 08:32:17.922013 1 request.go:655] Throttling request took 1.007744493s, request: GET:https://10.44.0.1:443/apis/coordination.k8s.io/v1beta1?timeout=32s
I0824 08:32:21.132165 1 deleg.go:130] controller-runtime/metrics "level"=0 "msg"="metrics server is starting to listen" "addr"=":8080"
I0824 08:32:21.132367 1 deleg.go:130] setup "level"=0 "msg"="creating certificate rotator for webhook"
I0824 08:32:21.132498 1 deleg.go:130] setup "level"=0 "msg"="starting manager"
I0824 08:32:21.132660 1 deleg.go:130] setup "level"=0 "msg"="waiting for certificate rotator"
I0824 08:32:21.132845 1 internal.go:385] controller-runtime/manager "level"=0 "msg"="starting metrics server" "path"="/metrics"
I0824 08:32:21.132937 1 controller.go:165] controller-runtime/manager/controller/cert-rotator "level"=0 "msg"="Starting EventSource" "source"={}
I0824 08:32:21.133137 1 controller.go:165] controller-runtime/manager/controller/cert-rotator "level"=0 "msg"="Starting EventSource" "source"={}
I0824 08:32:21.133295 1 deleg.go:130] cert-rotation "level"=0 "msg"="starting cert rotator controller"
I0824 08:32:21.233606 1 controller.go:173] controller-runtime/manager/controller/cert-rotator "level"=0 "msg"="Starting Controller"
I0824 08:32:21.233650 1 controller.go:211] controller-runtime/manager/controller/cert-rotator "level"=0 "msg"="Starting workers" "worker count"=1
I0824 08:32:21.234167 1 rotator.go:665] cert-rotation "level"=0 "msg"="Ensuring CA cert" "gvk"={"Group":"admissionregistration.k8s.io","Version":"v1","Kind":"ValidatingWebhookConfiguration"} "name"="admission-webhook.configsync.gke.io" "gvk"={"Group":"admissionregistration.k8s.io","Version":"v1","Kind":"ValidatingWebhookConfiguration"} "name"="admission-webhook.configsync.gke.io"
I0824 08:32:21.234564 1 deleg.go:130] cert-rotation "level"=0 "msg"="no cert refresh needed"
I0824 08:32:21.234611 1 deleg.go:130] cert-rotation "level"=0 "msg"="certs are ready in /certs"
I0824 08:32:21.239218 1 rotator.go:665] cert-rotation "level"=0 "msg"="Ensuring CA cert" "gvk"={"Group":"admissionregistration.k8s.io","Version":"v1","Kind":"ValidatingWebhookConfiguration"} "name"="admission-webhook.configsync.gke.io" "gvk"={"Group":"admissionregistration.k8s.io","Version":"v1","Kind":"ValidatingWebhookConfiguration"} "name"="admission-webhook.configsync.gke.io"
I0824 08:32:22.659789 1 deleg.go:130] cert-rotation "level"=0 "msg"="CA certs are injected to webhooks"
I0824 08:32:22.660065 1 deleg.go:130] setup "level"=0 "msg"="registering validating webhook"
Pod description:
Name:         admission-webhook-76b67c9f8c-vt8n6
Namespace:    config-management-system
Priority:     0
Node:         gke-seoul-a-default-pool-e2dcfaa0-8tlw/10.178.0.6
Start Time:   Tue, 24 Aug 2021 17:25:44 +0900
Labels:       app=admission-webhook
              pod-template-hash=76b67c9f8c
Annotations:  <none>
Status:       Running
IP:           10.40.0.76
IPs:
  IP:           10.40.0.76
Controlled By:  ReplicaSet/admission-webhook-76b67c9f8c
Containers:
  admission-webhook:
    Container ID:  containerd://95084e16bedb0f75ed18c24b2b16a815dda0fef87bbaea00df606b9093cca197
    Image:         gcr.io/config-management-release/admission-webhook:v1.8.1-rc.2
    Image ID:      gcr.io/config-management-release/admission-webhook@sha256:89783f083940d75cc4b7c51428966751a29e7edd606872b23be090a0a1655ecc
    Port:          10250/TCP
    Host Port:     0/TCP
    Command:
      /admission-webhook
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Fri, 27 Aug 2021 20:09:53 +0900
      Finished:     Fri, 27 Aug 2021 20:10:04 +0900
    Ready:          False
    Restart Count:  854
    Limits:
      cpu:     200m
      memory:  100Mi
    Requests:
      cpu:        100m
      memory:     20Mi
    Environment:  <none>
    Mounts:
      /certs from cert (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from admission-webhook-token-gt7qz (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  admission-webhook-cert
    Optional:    false
  admission-webhook-token-gt7qz:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  admission-webhook-token-gt7qz
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                       From     Message
  ----     ------   ----                      ----     -------
  Warning  BackOff  3m25s (x19992 over 3d2h)  kubelet  Back-off restarting failed container
Additional info:
- ACM version 1.8.1.
- Unstructured repository, synced from GitHub using a token.
- The GitHub repository is for testing purposes only and contains a tiny ConfigMap. It syncs fine, but the admission webhook does not stop me from deleting the ConfigMap with kubectl, although the ConfigMap is restored right after the deletion.
- I have reinstalled a few times from scratch.