I am experimenting and tweaking a bit on my sandbox AKS cluster with the intention of getting it into a production-ready state. For that, I am following a book in which the author redeploys the initial kube-proxy daemonset with some modifications (the only difference being that he does it on AWS EKS).
The problem is that the daemonset and its pods revert to their initial state after 2-3 minutes. AKS simply rolls my change back, which I can see when I check the rollout history:
> kubectl rollout history daemonset kube-proxy -n kube-system
daemonset.apps/kube-proxy
REVISION CHANGE-CAUSE
2 <none>
8 <none>
10 <none>
14 <none>
16 <none>
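For reference, the contents of a specific revision can also be inspected with the --revision flag (16 here is just one of the revision numbers from the list above):

> kubectl rollout history daemonset kube-proxy -n kube-system --revision=16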
I tried to redeploy the daemonset with my minor changes (cpu request raised from 100m to 120m and the -v flag changed from 3 to 2) declaratively, by applying the following manifest:
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    component: kube-proxy
    tier: node
    deployment: custom
  name: kube-proxy
  namespace: kube-system
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      component: kube-proxy
      tier: node
  template:
    metadata:
      creationTimestamp: null
      labels:
        component: kube-proxy
        tier: node
        deployedBy: Luka
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.azure.com/cluster
                operator: Exists
              - key: type
                operator: NotIn
                values:
                - virtual-kubelet
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
      containers:
      - command:
        - kube-proxy
        - --conntrack-max-per-core=0
        - --metrics-bind-address=0.0.0.0:10249
        - --kubeconfig=/var/lib/kubelet/kubeconfig
        - --cluster-cidr=10.244.0.0/16
        - --detect-local-mode=ClusterCIDR
        - --pod-interface-name-prefix=
        - --v=2
        image: mcr.microsoft.com/oss/kubernetes/kube-proxy:v1.23.12-hotfix.20220922.1
        imagePullPolicy: IfNotPresent
        name: kube-proxy
        resources:
          requests:
            cpu: 120m
        securityContext:
          privileged: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/lib/kubelet
          name: kubeconfig
          readOnly: true
        - mountPath: /etc/kubernetes/certs
          name: certificates
          readOnly: true
        - mountPath: /run/xtables.lock
          name: iptableslock
        - mountPath: /lib/modules
          name: modules
      dnsPolicy: ClusterFirst
      hostNetwork: true
      initContainers:
      - command:
        - /bin/sh
        - -c
        - |
          SYSCTL=/proc/sys/net/netfilter/nf_conntrack_max
          echo "Current net.netfilter.nf_conntrack_max: $(cat $SYSCTL)"
          DESIRED=$(awk -F= '/net.netfilter.nf_conntrack_max/ {print $2}' /etc/sysctl.d/999-sysctl-aks.conf)
          if [ -z "$DESIRED" ]; then
            DESIRED=$((32768*$(nproc)))
            if [ $DESIRED -lt 131072 ]; then
              DESIRED=131072
            fi
            echo "AKS custom config for net.netfilter.nf_conntrack_max not set."
            echo "Setting nf_conntrack_max to $DESIRED (32768 * $(nproc) cores, minimum 131072)."
            echo $DESIRED > $SYSCTL
          else
            echo "AKS custom config for net.netfilter.nf_conntrack_max set to $DESIRED."
            echo "Setting nf_conntrack_max to $DESIRED."
            echo $DESIRED > $SYSCTL
          fi
        image: mcr.microsoft.com/oss/kubernetes/kube-proxy:v1.23.12-hotfix.20220922.1
        imagePullPolicy: IfNotPresent
        name: kube-proxy-bootstrap
        resources:
          requests:
            cpu: 100m
        securityContext:
          privileged: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/sysctl.d
          name: sysctls
        - mountPath: /lib/modules
          name: modules
      priorityClassName: system-node-critical
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      tolerations:
      - key: CriticalAddonsOnly
        operator: Exists
      - effect: NoExecute
        operator: Exists
      - effect: NoSchedule
        operator: Exists
      volumes:
      - hostPath:
          path: /var/lib/kubelet
          type: ""
        name: kubeconfig
      - hostPath:
          path: /etc/kubernetes/certs
          type: ""
        name: certificates
      - hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
        name: iptableslock
      - hostPath:
          path: /etc/sysctl.d
          type: Directory
        name: sysctls
      - hostPath:
          path: /lib/modules
          type: Directory
        name: modules
  updateStrategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
    type: RollingUpdate
status:
  currentNumberScheduled: 4
  desiredNumberScheduled: 4
  numberAvailable: 4
  numberMisscheduled: 0
  numberReady: 4
  observedGeneration: 1
  updatedNumberScheduled: 4
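For completeness, this is roughly how I applied the manifest and checked the result afterwards (`kube-proxy-custom.yaml` is just a placeholder name for the file above):

```shell
kubectl apply -f kube-proxy-custom.yaml

# check the values right after applying
kubectl -n kube-system get ds kube-proxy -o yaml | grep -E 'cpu:|--v='

# after 2-3 minutes the same command shows cpu: 100m and --v=3 again
```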
I also tried it with the initContainer removed. Even the approach of editing the daemonset in place, explained in this Stack Overflow post, didn't work.
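In case it matters, by "editing in place" I mean changing the live object directly, either with `kubectl edit` or with a patch along these lines (same value I tried above):

```shell
# attempt the same cpu change directly against the live DaemonSet
kubectl -n kube-system patch daemonset kube-proxy --type=json \
  -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/cpu", "value": "120m"}]'
```

This change also gets reverted within a few minutes.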
Am I missing something? Why does the kube-proxy daemonset keep rolling back?