
When I create an EKS cluster with a single node group using Terraform, I run into a kubelet certificate problem: the CSRs are stuck in the Pending state, like this:

NAME        AGE     SIGNERNAME                      REQUESTOR          REQUESTEDDURATION   CONDITION
csr-8qmz5   4m57s   kubernetes.io/kubelet-serving   kubernetes-admin   <none>              Pending
csr-mq9rx   5m      kubernetes.io/kubelet-serving   kubernetes-admin   <none>              Pending

As you can see, the REQUESTOR here is kubernetes-admin, and I'm really not sure why. My Terraform code for the cluster itself:

resource "aws_eks_cluster" "eks" {
  name     = var.eks_cluster_name
  role_arn = var.eks_role_arn
  version = var.k8s_version
  vpc_config {
    endpoint_private_access = "true"
    endpoint_public_access  = "true"
    subnet_ids  = var.eks_public_network_ids
    security_group_ids  = var.eks_security_group_ids
  }
  kubernetes_network_config {
    ip_family = "ipv4"
    service_ipv4_cidr = "10.100.0.0/16"
  }
}

Terraform code for the node group:

resource "aws_eks_node_group" "aks-NG" {
  depends_on = [aws_ec2_tag.eks-subnet-cluster-tag, aws_key_pair.eks-deployer]
  cluster_name  = aws_eks_cluster.eks.name
  node_group_name = "aks-dev-NG"
  ami_type  = "AL2_x86_64"
  node_role_arn = var.eks_role_arn
  subnet_ids    = var.eks_public_network_ids
  capacity_type = "ON_DEMAND"
  instance_types = var.eks_nodepool_instance_types
  disk_size = "50"
  scaling_config {
    desired_size = 2
    max_size     = 2
    min_size     = 2
  }
  tags = {
    Name  = "${var.eks_cluster_name}-node"
    "kubernetes.io/cluster/${var.eks_cluster_name}" = "owned"
  }
  remote_access {
    ec2_ssh_key = "eks-deployer-key"
  }
}

As far as I understand, this is a very basic configuration.

Now, when I create the cluster and node group via the AWS Management Console with exactly the SAME parameters, i.e. the cluster IAM role and node group IAM role are the same as in Terraform, everything is fine:

NAME        AGE     SIGNERNAME                      REQUESTOR                                    REQUESTEDDURATION   CONDITION
csr-86qtg   6m20s   kubernetes.io/kubelet-serving   system:node:ip-172-31-201-140.ec2.internal   <none>              Approved,Issued
csr-np42b   6m43s   kubernetes.io/kubelet-serving   system:node:ip-172-31-200-199.ec2.internal   <none>              Approved,Issued

But here the certificate requestor is the node itself (as I understand it). So I would like to know: what is the problem here? Why is the requestor different in this case, what is the difference between creating these resources from the AWS Management Console and creating them with Terraform, and how do I resolve this issue? Please help.

UPD.

I found that this problem appears when I create the cluster using Terraform via an assumed role created for Terraform. When I create the cluster using Terraform with regular IAM user credentials and the same permission set, everything is fine. This doesn't answer the root cause, but it's still something to consider. Right now it looks like a weird EKS bug.
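For reference, this is roughly what my provider setup with the assumed role looks like (a minimal sketch; the region, role ARN, and session name below are placeholders, not my real values):

provider "aws" {
  region = "us-east-1" # placeholder region

  # Terraform assumes this separately created role, so the cluster creator
  # identity recorded by EKS is the assumed role rather than my IAM user.
  assume_role {
    role_arn     = "arn:aws:iam::123456789012:role/terraform-eks-role" # placeholder ARN
    session_name = "terraform"
  }
}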


1 Answer


I had a similar issue to this: I created a cluster using an assumed role that I had created separately, and I was getting TLS errors in the pods.

My resolution: the role I had created, which was being assumed to create the cluster, did not have EKS as a trusted entity. Once I modified the role's trust relationships to add the EKS service, it was able to issue the certs.
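Roughly, the trust policy on the role ended up looking like the sketch below (the role name and account ID are placeholders; adjust the AWS principal to whatever identity runs Terraform in your setup):

resource "aws_iam_role" "terraform_eks" {
  name = "terraform-eks-role" # placeholder name

  # Trust policy with two trusted entities: the account principal, so Terraform
  # can assume the role, and the EKS service, which was the missing piece.
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect    = "Allow"
        Principal = { AWS = "arn:aws:iam::123456789012:root" } # placeholder account
        Action    = "sts:AssumeRole"
      },
      {
        Effect    = "Allow"
        Principal = { Service = "eks.amazonaws.com" }
        Action    = "sts:AssumeRole"
      }
    ]
  })
}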