I have a problem with one of the pods. It says that it is in a pending state.

kubectl get pods -n amazon-cloudwatch                           
NAME    READY   STATUS    RESTARTS   AGE
pod-1   1/1     Running   0          17h
pod-2   1/1     Running   0          17h
pod-3   1/1     Running   0          17h
pod-4   1/1     Running   0          17h
pod-5   1/1     Running   0          17h
pod-6   0/1     Pending   0          17h

If I describe the pod, this is what I can see:

Events:
  Type     Reason            Age                   From               Message
  ----     ------            ----                  ----               -------
  Warning  FailedScheduling  96s (x1011 over 17h)  default-scheduler  0/6 nodes are available: 1 Too many pods, 5 node(s) didn't match Pod's node affinity/selector.

In my pod YAML file, the node selector is defined as follows:

Node-Selectors:              kubernetes.io/os=linux

I am trying to set up Container Insights by following the steps at https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Container-Insights-setup-metrics.html

The node selector is specified in the manifest applied in that guide:

kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/cwagent/cwagent-daemonset.yaml

All my nodes are labeled with kubernetes.io/os=linux:

 kubectl get nodes --show-labels                                               
NAME    STATUS   ROLES    AGE     VERSION               LABELS
Node1   Ready    <none>   11d     v1.23.9-eks-ba74326   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=t3.small,beta.kubernetes.io/os=linux,eks.amazonaws.com/capacityType=ON_DEMAND,failure-domain.beta.kubernetes.io/region=eu-central-1,failure-domain.beta.kubernetes.io/zone=eu-central-1a,kubernetes.io/arch=amd64,kubernetes.io/os=linux,node.kubernetes.io/instance-type=t3.small,topology.kubernetes.io/region=eu-central-1,topology.kubernetes.io/zone=eu-central-1a
Node2   Ready    <none>   21d     v1.23.9-eks-ba74326   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=t3.small,beta.kubernetes.io/os=linux,eks.amazonaws.com/capacityType=ON_DEMAND,failure-domain.beta.kubernetes.io/region=eu-central-1,failure-domain.beta.kubernetes.io/zone=eu-central-1a,kubernetes.io/arch=amd64,kubernetes.io/os=linux,node.kubernetes.io/instance-type=t3.small,topology.kubernetes.io/region=eu-central-1,topology.kubernetes.io/zone=eu-central-1a
Node3   Ready    <none>   21d     v1.23.9-eks-ba74326   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=t3.small,beta.kubernetes.io/os=linux,eks.amazonaws.com/capacityType=ON_DEMAND,failure-domain.beta.kubernetes.io/region=eu-central-1,failure-domain.beta.kubernetes.io/zone=eu-central-1a,kubernetes.io/arch=amd64,kubernetes.io/os=linux,node.kubernetes.io/instance-type=t3.small,topology.kubernetes.io/region=eu-central-1,topology.kubernetes.io/zone=eu-central-1a
Node4   Ready    <none>   5d12h   v1.23.9-eks-ba74326   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=t3.small,beta.kubernetes.io/os=linux,eks.amazonaws.com/capacityType=ON_DEMAND,failure-domain.beta.kubernetes.io/region=eu-central-1,failure-domain.beta.kubernetes.io/zone=eu-central-1b,kubernetes.io/arch=amd64,kubernetes.io/os=linux,node.kubernetes.io/instance-type=t3.small,topology.kubernetes.io/region=eu-central-1,topology.kubernetes.io/zone=eu-central-1b
Node5   Ready    <none>   5d13h   v1.23.9-eks-ba74326   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=t3.small,beta.kubernetes.io/os=linux,eks.amazonaws.com/capacityType=ON_DEMAND,failure-domain.beta.kubernetes.io/region=eu-central-1,failure-domain.beta.kubernetes.io/zone=eu-central-1b,kubernetes.io/arch=amd64,kubernetes.io/os=linux,node.kubernetes.io/instance-type=t3.small,topology.kubernetes.io/region=eu-central-1,topology.kubernetes.io/zone=eu-central-1b
Node6   Ready    <none>   21d     v1.23.9-eks-ba74326   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=t3.small,beta.kubernetes.io/os=linux,eks.amazonaws.com/capacityType=ON_DEMAND,failure-domain.beta.kubernetes.io/region=eu-central-1,failure-domain.beta.kubernetes.io/zone=eu-central-1b,kubernetes.io/arch=amd64,kubernetes.io/os=linux,node.kubernetes.io/instance-type=t3.small,topology.kubernetes.io/region=eu-central-1,topology.kubernetes.io/zone=eu-central-1b
Ruchita Sheth
  • 1
    Please consider providing a Minimal Reproducible Example (https://stackoverflow.com/help/minimal-reproducible-example) so others can replicate your issue and provide better responses. – Blender Fox Apr 12 '23 at 08:28

1 Answer

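Check the maximum pod capacity for your nodes' instance type. Each node in the cluster can run only a limited number of pods, and on EKS with the default VPC CNI that limit follows from how many network interfaces, and how many IP addresses per interface, the instance type supports. The "1 Too many pods" part of the scheduler message means that one node has reached this limit. The "5 node(s) didn't match Pod's node affinity/selector" part is expected for a DaemonSet pod, because each such pod is pinned to a single node, so it is not a sign that the kubernetes.io/os=linux label is missing. The per-instance-type limits are listed at https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-eni.html#AvailableIpPerENI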
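
As a rough illustration of how that limit is derived, here is a minimal sketch. It assumes the default VPC CNI settings (no prefix delegation, no custom networking); `max_pods` is a hypothetical helper name chosen for this example, not an AWS or kubectl command. The interface and address counts for t3.small are taken from the table linked above.

```shell
#!/usr/bin/env bash
# Sketch: estimate the per-node pod limit on EKS with the default VPC CNI.
# Assumption: no prefix delegation and no custom networking are configured.
# max_pods is a hypothetical helper, not part of any AWS tooling.
max_pods() {
  local enis=$1         # maximum network interfaces for the instance type
  local ips_per_eni=$2  # private IPv4 addresses per interface
  # One address on each interface is the interface's own primary IP, so it
  # cannot be handed to a pod; the +2 covers pods that use host networking.
  echo $(( enis * (ips_per_eni - 1) + 2 ))
}

# t3.small: 3 interfaces with 4 IPv4 addresses each
max_pods 3 4
```

To compare this estimate with what the cluster itself reports, `kubectl get nodes -o custom-columns=NAME:.metadata.name,PODS:.status.allocatable.pods` should list the allocatable pod count for every node.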

jacorl
  • I am using t3.small. What do you suggest: should I upgrade to t3.medium, or should I add one more t3.small node? – Ruchita Sheth Apr 12 '23 at 09:37
  • 1
    It really depends on your use case and your configuration. For fault tolerance, the best practice is to run more than one node, spread across different AZs. If the project has no such requirement, you can save time and just upgrade the instance size. Keep in mind that the price is the same: 2 × t3.small = 1 × t3.medium. – jacorl Apr 12 '23 at 13:24
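
To put numbers on that trade-off, the following sketch compares the pod capacity of the two options. It assumes the default VPC CNI limits (no prefix delegation) and the interface counts from the AWS table linked in the answer; `max_pods` is a hypothetical helper defined only for this example.

```shell
#!/usr/bin/env bash
# Sketch: pod capacity of two t3.small nodes versus one t3.medium node.
# Assumption: default VPC CNI limits; max_pods is a hypothetical helper.
max_pods() { echo $(( $1 * ($2 - 1) + 2 )); }

small=$(max_pods 3 4)    # t3.small: 3 interfaces, 4 addresses each
medium=$(max_pods 3 6)   # t3.medium: 3 interfaces, 6 addresses each

echo "two t3.small nodes: $(( 2 * small )) pods in total, ${small} per node"
echo "one t3.medium node: ${medium} pods"
```

The per-node figure matters here because the pending pod belongs to a DaemonSet: every node runs its own copy, so the copy for a full node can only start once that particular node has a free slot. A larger instance size raises the ceiling on each node, while an extra node adds capacity only for pods that are free to move.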