
I am using EKS for my application and have created two node groups, NG-1 and NG-2. I am also running Cluster Autoscaler (CA), which scales the nodes up based on load. My use case is that the CA should add nodes to NG-1 in one specific case and to NG-2 in another. For this I tried the "sharding across node groups" approach for Cluster Autoscaler, but I am not able to run multiple CA pods. The two CA pods start, each with a separate configuration for one node group (NG-1 or NG-2), but only one of them is able to talk to the API server; the other goes into CrashLoopBackOff with the following error:

"Failed to get nodes from apiserver: nodes is forbidden: User "system:serviceaccount:kube-system2:cluster-autoscaler" cannot list resource "nodes" in API group "" at the cluster scope"

I have created a separate IAM role (with OIDC web identity / IRSA) for each CA pod. I have also created two different namespaces, one per CA deployment, each carrying all of the RBAC objects.
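To illustrate the IRSA setup, this is roughly how each service account is annotated with its own role (the account ID and role name below are placeholders, not my real values):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: cluster-autoscaler
  namespace: kube-system1                       # the second copy lives in kube-system2
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/ca-ng-1-role   # placeholder ARN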

I can see that the failing pod still tries to use the kube-system namespace and is unable to do so. I don't understand what the possible reason is, and why only one CA pod at a time can communicate with the API server.

I would appreciate suggestions on how to run the CA as multiple pods this way, or on a simpler alternative if one exists.

The two namespaces are kube-system1 and kube-system2, and the roles were created based on these namespaces.
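To clarify what I mean by per-namespace RBAC, each CA copy is meant to get the same cluster-level permissions but bound to the service account in its own namespace. A simplified sketch (the binding name is illustrative; the second binding points at kube-system2):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-autoscaler-ng-1                 # one binding per CA copy
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system1                     # the second binding uses kube-system2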

Here is the link to the sharding approach I followed:

https://aws.github.io/aws-eks-best-practices/cluster-autoscaling/cluster-autoscaling/#sharding-across-node-groups
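As I understand the sharding approach from that page, each CA deployment is restricted to the Auto Scaling group(s) behind its own node group, for example via the --nodes flag (the ASG names below are placeholders):

# CA copy 1 (for NG-1)
- --nodes=1:3:ng-1-asg-name
# CA copy 2 (for NG-2)
- --nodes=1:3:ng-2-asg-name

The full manifest I am currently using for one of the copies is below.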

---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
  name: cluster-autoscaler
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["events", "endpoints"]
    verbs: ["create", "patch"]
  - apiGroups: [""]
    resources: ["pods/eviction"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["pods/status"]
    verbs: ["update"]
  - apiGroups: [""]
    resources: ["endpoints"]
    resourceNames: ["cluster-autoscaler"]
    verbs: ["get", "update"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["watch", "list", "get", "update"]
  - apiGroups: [""]
    resources:
      - "pods"
      - "services"
      - "replicationcontrollers"
      - "persistentvolumeclaims"
      - "persistentvolumes"
    verbs: ["watch", "list", "get"]
  - apiGroups: ["extensions"]
    resources: ["replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["policy"]
    resources: ["poddisruptionbudgets"]
    verbs: ["watch", "list"]
  - apiGroups: ["apps"]
    resources: ["statefulsets", "replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses", "csinodes"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["batch", "extensions"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch", "patch"]
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["create"]
  - apiGroups: ["coordination.k8s.io"]
    resourceNames: ["cluster-autoscaler"]
    resources: ["leases"]
    verbs: ["get", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["create","list","watch"]
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
    verbs: ["delete", "get", "update", "watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system-1
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '8085'
    spec:
      serviceAccountName: cluster-autoscaler
      containers:
        - image: k8s.gcr.io/autoscaling/cluster-autoscaler:v1.17.3
          name: cluster-autoscaler
          resources:
            limits:
              cpu: 100m
              memory: 300Mi
            requests:
              cpu: 100m
              memory: 300Mi
          command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --expander=least-waste
            - --nodes=1:3:Asg-Name # <min>:<max>:<ASG name>; one --nodes entry per ASG this copy manages
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs/ca-certificates.crt #/etc/ssl/certs/ca-bundle.crt for Amazon Linux Worker Nodes
              readOnly: true
          imagePullPolicy: "Always"
      volumes:
        - name: ssl-certs
          hostPath:
            path: "/etc/ssl/certs/ca-bundle.crt"
Kaustubh
  • Did you try to explicitly set `--namespace=` flag in CA deployment? https://github.com/kubernetes/autoscaler/blob/43ab0309697271e6b2ad82dd4fc3a28132456399/cluster-autoscaler/FAQ.md#what-are-the-parameters-to-ca It may be easier if you actually provided the yaml files you used – Matt Mar 24 '21 at 12:42
  • @Matt I have shared the yaml file which I used. – Kaustubh Mar 25 '21 at 08:14
  • The deployment is running in the `kube-system-1` namespace, but all the others have `namespace: kube-system`. Change all occurrences of `kube-system` to `kube-system-1` and let me know whether it works – Matt Mar 25 '21 at 09:07
  • Yeah, I changed all the occurrences to kube-system-1 and kube-system-2 and then deployed the CA again. That is when the error occurred. I also created two IAM roles, one for each CA deployment. Only one pod comes up; the other goes into CrashLoopBackOff with the error described above. This is the issue. Both CA pods try to talk to configmaps in the kube-system namespace even though they run in a different one, are unable to do so, and throw the error. – Kaustubh Mar 25 '21 at 10:44
  • set `--namespace=` flag in CA deployment. – Matt Mar 25 '21 at 10:45
  • Okay, so after making the changes I will also need to set the namespace explicitly, right? – Kaustubh Mar 25 '21 at 11:27
  • @Matt Failed to get nodes from apiserver: nodes is forbidden: User "system:serviceaccount:kube-system:cluster-autoscaler" cannot list resource "nodes" in API group "" at the cluster scope . One starts but the other one crashes as usual with this error in the logs. kube-system is one ns and other one is kube-system-1. – Kaustubh Mar 25 '21 at 16:55
  • Failed to retrieve status configmap for update: configmaps "cluster-autoscaler-status" is forbidden: User "system:serviceaccount:kube-system-1:cluster-autoscaler" cannot get resource "configmaps" in API group "" in the namespace "kube-system" – even though the CA pod runs in kube-system-1, it still shows this error in the logs. The same thing happened earlier as well. – Kaustubh Mar 25 '21 at 17:04
  • @Kaustubh - did you end up figuring this out ? would really appreciate it if you can post the solution to this as I am facing the same issue – marwan Jun 09 '21 at 14:47
  • @marwan No, I did not find any solution to this. It looked pretty complex and hence we chose to go ahead with a different approach. – Kaustubh Jun 09 '21 at 18:14
  • I see - thank you for responding @Kaustubh – marwan Jun 09 '21 at 23:39
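For reference, the change suggested in the comments would look roughly like this in each Deployment's args (a sketch only, not something I verified end to end; the flag value must match the namespace the pod actually runs in, and the namespaced Role/RoleBinding for the configmaps would also need to exist in that same namespace):

command:
  - ./cluster-autoscaler
  - --v=4
  - --cloud-provider=aws
  - --namespace=kube-system-1          # CA then reads/writes its status configmap here instead of kube-system
  - --nodes=1:3:ng-1-asg-name          # placeholder ASG name for NG-1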

0 Answers